Hi Jesse,

In regard to your seeing 370 objects with errors from 'zpool status' but having over 400 files with "access issues", I would suggest running 'zpool scrub' to identify all the ZFS objects in the pool that are reporting permanent errors.
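If it helps, that scrub-and-inventory step could look something like the sketch below. This is a dry run only: the pool name "pool-01" is hypothetical, and each command is echoed rather than executed, so it is safe to paste. Drop the echoes to run it for real.

```shell
# Dry-run sketch only: pool name "pool-01" is hypothetical, and each
# command is echoed rather than executed.
POOL="pool-01"
SCRUB_CMD="zpool scrub $POOL"          # starts a scrub in the background
STATUS_CMD="zpool status -v $POOL"     # -v lists damaged files/object IDs

echo "$SCRUB_CMD"
echo "$STATUS_CMD"
# Saving successive 'zpool status -v' snapshots lets you diff the error
# list before and after the scrub completes:
echo "$STATUS_CMD > /root/zpool_status_${POOL}_\$(date +%Y%m%d).out"
```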
It would be very important to have a complete list of files with issues before replicating the VDEV(s) in question.

You may also want to dump the zdb information for the source VDEV(s) with the following:

zdb -dddddd source_pool/source_vdev > /some/where/with/room

For example, if the zpool was named pool-01, the VDEV was named lustre-0001, and you had free space in a filesystem named /home:

zdb -dddddd pool-01/lustre-0001 > /home/zdb_pool-01_0001_20161212.out

There is a great wealth of data zdb can share about your files. Having the output may prove helpful down the road.

Thanks,
Tom

> On Dec 12, 2016, at 4:39 PM, Jesse Stroik <[email protected]> wrote:
>
> Thanks for taking the time to respond, Tom.
>
>> For clarification, it sounds like you are using hardware-based RAID-6 and
>> not ZFS RAID? Is this correct? Or was the faulty card simply an HBA?
>
> You are correct. This particular file system is still using hardware RAID6.
>
>> At the bottom of the 'zpool status -v pool_name' output, you may see paths
>> and/or ZFS object IDs of the damaged/impacted files. This would be good to
>> take note of.
>
> Yes, I output this to files at a few different times, and we've had no change
> since replacing the RAID controller, which makes me feel reasonably
> comfortable leaving the file system in production.
>
> There are 370 objects listed by 'zpool status -v', but I am unable to access
> at least 400 files. Almost all of our files are single stripe.
>
>> Running a 'zpool scrub' is a good idea. If the zpool is protected with ZFS
>> RAID, the scrub may be able to repair some of the damage. If the zpool is
>> not protected with ZFS RAID, the scrub will identify any other errors but
>> likely NOT repair any of the damage.
>
> We're not protected with ZFS RAID, just hardware RAID6. I could run a patrol
> on the hardware controller and then a ZFS scrub if that makes the most sense
> at this point.
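One way to reconcile the 370 objects reported by 'zpool status -v' with the 400+ inaccessible files would be to read-test every path the status output names. The sketch below is hypothetical and self-contained: it builds a demo path list (one readable file, one missing path) in place of the real list extracted from 'zpool status -v', and all file names are made up.

```shell
# Hypothetical sketch: read-test every path in a list (normally the
# file paths extracted from 'zpool status -v') and collect failures.
ERRLIST=$(mktemp)   # stand-in for the real path list
BADLIST=$(mktemp)   # paths that fail an actual read

# Demo input: one readable file and one path that cannot exist.
GOODFILE=$(mktemp)
echo "data" > "$GOODFILE"
printf '%s\n' "$GOODFILE" "/no/such/path" > "$ERRLIST"

while IFS= read -r f; do
    # dd forces a full read; a stat alone can miss damaged data blocks
    dd if="$f" of=/dev/null bs=1M 2>/dev/null || echo "$f" >> "$BADLIST"
done < "$ERRLIST"

BADCOUNT=$(wc -l < "$BADLIST")
echo "unreadable paths: $BADCOUNT"
```

Diffing that failure list against the 'zpool status -v' object list would show which inaccessible files the pool is not reporting as damaged.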
> This file system is scheduled to run a scrub the third week of every month,
> so it would run one this weekend otherwise.
>
>> If you have enough disk space on hardware that is behaving properly (and
>> free space in the source zpool), you may want to replicate the VDEV(s) (OST)
>> that are reporting errors. Having a replicated VDEV can afford you the
>> ability to examine the data without fear of further damage. You may also
>> want to extract certain files from the replicated VDEV(s) which are
>> producing IO errors on the source VDEV.
>>
>> Something like this for replication should work:
>>
>> zfs snap source_pool/source_ost@timestamp_label
>> zfs send -Rv source_pool/source_ost@timestamp_label | zfs receive
>> destination_pool/source_ost_replicated
>>
>> You will need to set zfs_send_corrupt_data to 1 in
>> /sys/module/zfs/parameters, or the 'zfs send' will error and fail when
>> sending a VDEV with read and/or checksum errors.
>> Enabling zfs_send_corrupt_data allows the zfs send operation to complete.
>> Any blocks that are damaged on the source side will be filled with the
>> pattern "0x2f5baddb10c" on the destination side. This can be helpful in
>> determining whether an entire file is corrupt or only parts of it.
>>
>> After the replication, you should set the replicated VDEV to read-only with
>> 'zfs set readonly=on destination_pool/source_ost_replicated'.
>
> Thank you for this suggestion. We'll most likely do that.
>
> Best,
> Jesse Stroik
>
> _______________________________________________
> lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
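For reference, the replication recipe quoted above could be strung together as below. This is a dry-run sketch only: the pool/dataset names and the timestamp label are hypothetical, and each step is echoed rather than executed so nothing touches a live pool.

```shell
# Dry-run sketch of the quoted replication steps; all names hypothetical.
SRC="pool-01/lustre-0001"                # source VDEV (OST)
DST="pool-02/lustre-0001_replicated"     # destination dataset
SNAP="${SRC}@20161212"                   # timestamp-labeled snapshot

# 1. Allow 'zfs send' to proceed past read/checksum errors.
echo "echo 1 > /sys/module/zfs/parameters/zfs_send_corrupt_data"
# 2. Snapshot, then send/receive the stream (-R preserves properties
#    and descendants, -v reports progress).
echo "zfs snap $SNAP"
echo "zfs send -Rv $SNAP | zfs receive $DST"
# 3. Protect the copy once it lands.
echo "zfs set readonly=on $DST"
```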
