Hi,

Can anyone advise on how to clean up thousands of ZFS-level permanent errors, and the corresponding errors at the Lustre level as well?

A similar question was asked on the list previously, but I did not see an answer:
https://www.mail-archive.com/[email protected]/msg12454.html
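
For context, the approach I was considering is roughly the sketch below. This is only a guess on my part; the pool/target/fsname values are from my test setup or made up, and I am not sure it is the right sequence.

  # ZFS level (after replacing the bad HBA): scrub again, then clear the
  # error counters and re-check once no further errors are reported.
  ~]# zpool scrub test-ost4
  ~]# zpool status -v test-ost4
  ~]# zpool clear test-ost4

  # Lustre level: my understanding is that LFSCK can check/repair the
  # layout (MDT<->OST object) consistency.  Run on the MDS; the fsname
  # "testfs" here is illustrative.
  ~]# lctl lfsck_start -M testfs-MDT0000 -t layout
  ~]# lctl get_param -n mdd.testfs-MDT0000.lfsck_layout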

While testing new hardware, I discovered that an LSI HBA was bad. On a single combined MDS/OSS there are 8 OSTs split across 2 JBODs and 2 LSI HBAs. The MDT is on a 3rd JBOD, downlinked from the JBOD connected to the bad controller. After unmounting and stopping Lustre, the zpools connected to the good HBA scrubbed clean. The zpools on the bad controller continued to accumulate errors while connected to it. One of these OSTs reported a disk failure during the scrub and began resilvering to the spare, even though autoreplace was off. This was a very bad event, considering the card itself was causing all of the errors, and neither the scrub nor the resilver would ever complete.

I stopped the scrubs on the 3 other OSTs and detached the spare from the OST that was resilvering. After narrowing the fault down to the bad HBA (initially it was not clear whether the cables or the JBOD backplanes were bad), I used the good HBA to scrub JBOD 1 again, then shut down and disconnected JBOD 1. I then connected JBOD 2 to the good controller to scrub the JBOD 2 zpools, which had previously been attached to the bad LSI controller. The 3 zpools whose scrubs I had stopped earlier completed successfully this time. The one that had begun resilvering resumed resilvering after I initiated a replace of the failed disk with the spare. The resilver completed, but many permanent errors were reported on the zpool.

Since this is a test pool, I was interested to see whether ZFS would recover. In a real scenario with hardware problems, I will shut down and disconnect the data drives prior to any hardware testing.
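
Roughly the sequence of pool operations I ran, from memory. The device names below are illustrative, not the actual ones:

  # Stop the in-progress scrubs on the other 3 OSTs (repeated per pool)
  ~]# zpool scrub -s test-ost1

  # Detach the spare from the OST that had started resilvering
  ~]# zpool detach test-ost4 ata-ST4000NM0033-9ZM170_SPARE

  # Later, once the pool was on the good HBA: replace the failed disk
  # with the spare, let the resilver finish, then scrub again
  ~]# zpool replace test-ost4 ata-ST4000NM0033-9ZM170_FAILED ata-ST4000NM0033-9ZM170_SPARE
  ~]# zpool scrub test-ost4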

The status listed below shows a new scrub in progress after the resilver completed. The cache drive is missing because the 3rd JBOD is temporarily disconnected.


===================================

ZFS:    v0.6.5.7-1
Lustre: 2.8.55
kernel: 2.6.32-642.1.1.el6.x86_64
CentOS: 6.8


===================================
  ~]# zpool status -v test-ost4
  pool: test-ost4
 state: ONLINE
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
    entire pool from backup.
   see: http://zfsonlinux.org/msg/ZFS-8000-8A
  scan: scrub in progress since Mon Jul 11 22:29:09 2016
    689G scanned out of 12.4T at 711M/s, 4h49m to go
    40K repaired, 5.41% done
config:

    NAME                                       STATE READ WRITE CKSUM
    test-ost4                                  ONLINE 0     0   180
      raidz2-0                                 ONLINE 0     0   360
        ata-ST4000NM0033-9ZM170_Z1Z7GYXY       ONLINE 0     0     2  (repairing)
        ata-ST4000NM0033-9ZM170_Z1Z7KKPQ       ONLINE 0     0     3  (repairing)
        ata-ST4000NM0033-9ZM170_Z1Z7L5E7       ONLINE 0     0     3  (repairing)
        ata-ST4000NM0033-9ZM170_Z1Z7KGQT       ONLINE 0     0     0  (repairing)
        ata-ST4000NM0033-9ZM170_Z1Z7LA8K       ONLINE 0     0     4  (repairing)
        ata-ST4000NM0033-9ZM170_Z1Z7KB0X       ONLINE 0     0     3  (repairing)
        ata-ST4000NM0033-9ZM170_Z1Z7JSMN       ONLINE 0     0     2  (repairing)
        ata-ST4000NM0033-9ZM170_Z1Z7KXRA       ONLINE 0     0     2  (repairing)
        ata-ST4000NM0033-9ZM170_Z1Z7MLSN       ONLINE 0     0     2  (repairing)
        ata-ST4000NM0033-9ZM170_Z1Z7L4DT       ONLINE 0     0     7  (repairing)
    cache
      ata-D2CSTK251M20-0240_A19CV011227000092  UNAVAIL 0     0     0

errors: Permanent errors have been detected in the following files:

        test-ost4/test-ost4:<0xe00>
        test-ost4/test-ost4:<0xe01>
        test-ost4/test-ost4:<0xe02>
        test-ost4/test-ost4:<0xe03>
        test-ost4/test-ost4:<0xe04>
        test-ost4/test-ost4:<0xe05>
        test-ost4/test-ost4:<0xe06>
        ....... (list continues) .......
        test-ost4/test-ost4:<0xdfe>
        test-ost4/test-ost4:<0xdff>
===================================
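
In case it matters, my (possibly wrong) understanding is that the <0x...> entries above are object numbers inside the test-ost4/test-ost4 dataset, i.e. OST objects rather than pathnames, which is why zpool status cannot show file names for them. If that is right, I assume one could inspect an object with zdb and, from a client, map a recovered Lustre FID back to a path with lfs fid2path. The object number is converted to decimal below and the FID/mountpoint are made up:

  ~]# zdb -dddd test-ost4/test-ost4 3584        # object 0xe00
  ~]# lfs fid2path /mnt/testfs [0x200000400:0x1:0x0]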

Follow-up questions:

Is it better not to have a spare attached to the pool, to prevent resilvering in this scenario? (Bad HBA; a disk failed during the scrub and a resilver began even though autoreplace was off. The spare was assigned to the zpool.)
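
For reference, what I was planning to check or change on the pools (spare device name illustrative):

  # Confirm the autoreplace setting and whether a hot spare is attached
  ~]# zpool get autoreplace test-ost4
  ~]# zpool status test-ost4

  # If running without a hot spare is safer during hardware debugging,
  # remove it from the pool
  ~]# zpool remove test-ost4 ata-ST4000NM0033-9ZM170_SPARE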

With dual paths to the JBOD, would the bad HBA be disabled automatically so that I/O errors never reach the disks? The current setup is single-path only.
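
If I do go dual-path, I assume it would be via dm-multipath with the zpools built on the /dev/mapper names. A minimal, untested sketch of what I have in mind for CentOS 6:

  # /etc/multipath.conf (minimal)
  defaults {
      user_friendly_names yes
      find_multipaths yes
  }

  ~]# modprobe dm-multipath
  ~]# service multipathd start
  ~]# chkconfig multipathd on
  ~]# multipath -ll
  # ...then build the zpools on the resulting /dev/mapper/mpath* devices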


Thank you in advance for any notes,
Kevin

--
Kevin Abbey
Systems Administrator
Center for Computational and Integrative Biology (CCIB)
http://ccib.camden.rutgers.edu/

Rutgers University - Science Building
315 Penn St.
Camden, NJ 08102
Telephone: (856) 225-6770
Fax:(856) 225-6312
Email: [email protected]
