Re: [zfs-discuss] Scrub found error in metadata:0x0, is that always fatal? No checksum errors now...
Another update to my post: it took about a week to run some zdb walks on my pool, and they core-dumped partway through. However, I've also noticed some clues in my FMADM outputs dating from the 'zpool scrub' attempts. There are several sets of similar error reports (one set per scrub), differing only in timestamps and the "ena" and "__tod" fields, as far as I could see. I hope someone can glance over them and point me in the right direction - i.e. what on-disk block I might want to extract for analysis and/or forge-and-replace, to fix my pool into a condition where no errors are reported...

As a reminder, this is a 6-disk raidz2 pool with ashift=12. During a recent scrub there were 0 on-disk errors, with one pool-level error and two vdev-level errors. The pool and vdevs are considered online, and there are no errors noticed during pool usage (however, I didn't intentionally write anything to it afterwards; it was only RW-mounted for a few bootups), but a metadata error is being reported:

        NAME        STATE     READ WRITE CKSUM
        pool        ONLINE       0     0     1
          raidz2-0  ONLINE       0     0     2
            c7t0d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
            c7t4d0  ONLINE       0     0     0
            c7t5d0  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>

Below goes one set of such FMADM reports. There are two different zio_offset values - 4 lines with 0x6ecb164000 and 8 lines with 0x6ecb163000, both sized 0x8000. I believe this might be the address of the mismatching block(s); but now - how do I locate them on disk to try and match/analyze/forge/etc.? Is the offset relative to the pool, or to the individual disks? Any ideas? ;)

Dec 01 2011 08:53:29.43177 ereport.fs.zfs.data
nvlist version: 0
        class = ereport.fs.zfs.data
        ena = 0x7a43652cd2c00401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = zfs
                pool = 0x1638b976389d447c
        (end detector)
        pool = pool
        pool_guid = 0x1638b976389d447c
        pool_context = 0
        pool_failmode = continue
        zio_err = 50
        zio_objset = 0x0
        zio_object = 0x0
        zio_level = 0
        zio_blkid = 0x0
        __ttl = 0x1
        __tod = 0x4ed70849 0x1a17f319

Dec 01 2011 08:53:29.437774722 ereport.fs.zfs.checksum
nvlist version: 0
        class = ereport.fs.zfs.checksum
        ena = 0x7a43652cd2c00401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = zfs
                pool = 0x1638b976389d447c
                vdev = 0x6abe377b60fc48e5
        (end detector)
        pool = pool
        pool_guid = 0x1638b976389d447c
        pool_context = 0
        pool_failmode = continue
        vdev_guid = 0x6abe377b60fc48e5
        vdev_type = disk
        vdev_path = /dev/dsk/c7t1d0s0
        vdev_devid = id1,sd@SATA_ST2000DL003-9VT15YD1XWWB/a
        parent_guid = 0x53d15735fa4c6d21
        parent_type = raidz
        zio_err = 50
        zio_offset = 0x6ecb164000
        zio_size = 0x8000
        zio_objset = 0x0
        zio_object = 0x0
        zio_level = 0
        zio_blkid = 0x0
        __ttl = 0x1
        __tod = 0x4ed70849 0x1a17e982

Dec 01 2011 08:53:29.437774091 ereport.fs.zfs.checksum
nvlist version: 0
        class = ereport.fs.zfs.checksum
        ena = 0x7a43652cd2c00401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = zfs
                pool = 0x1638b976389d447c
                vdev = 0xa5d72d5c3c698a85
        (end detector)
        pool = pool
        pool_guid = 0x1638b976389d447c
        pool_context = 0
        pool_failmode = continue
        vdev_guid = 0xa5d72d5c3c698a85
        vdev_type = disk
        vdev_path = /dev/dsk/c7t0d0s0
        vdev_devid = id1,sd@SATA_ST2000DL003-9VT15YD217ZL/a
        parent_guid = 0x53d15735fa4c6d21
        parent_type = raidz
        zio_err = 50
        zio_offset = 0x6ecb164000
        zio_size = 0x8000
        zio_objset = 0x0
        zio_object = 0x0
        zio_level = 0
        zio_blkid = 0x0
        __ttl = 0x1
        __tod = 0x4ed70849 0x1a17e70b

Dec 01 2011 08:53:29.437772910 ereport.fs.zfs.checksum
nvlist version: 0
        class = ereport.fs.zfs.checksum
        ena = 0x7a43652cd2c00401
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = zfs
                pool = 0x1638b976389d447c
                vdev = 0x395cf609609d8846
        (end detector)
        pool = pool
        pool_guid = 0x1638b976389d447c
        pool_context = 0
        pool_failmode = continue
        vdev_guid = 0x395cf609609d8846
        vdev_type = disk
        vdev_path = /dev/dsk/c7t5d0s0
        vdev_devid = id1,sd@SATA_ST2000DL003-9VT15YD24GDG/a
        parent_guid = 0x53d15735fa4c6d21
        parent_type = raidz
        zio_err
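My first guess at extracting one of those regions for inspection would be something like the sketch below. It assumes (and this is only my assumption, please correct me) that zio_offset in a leaf-vdev ereport is counted from the start of the vdev's allocatable space, which begins 0x400000 bytes into the slice (after the two front labels and the boot block):

    # Hypothetical sketch: copy the suspect 0x8000 bytes from one of the
    # leaf disks named in the ereports. The 0x400000 label offset is an
    # assumption about the on-disk vdev layout, not a verified fact.
    off=$(( (0x6ecb164000 + 0x400000) / 512 ))   # in 512-byte sectors
    cnt=$(( 0x8000 / 512 ))
    dd if=/dev/rdsk/c7t1d0s0 of=/tmp/c7t1d0.blk bs=512 iseek=$off count=$cnt

On a raidz2 this would only grab one column of the stripe per disk, of course, but repeating it for each disk named in the set of ereports should yield the raw sectors to compare.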
Re: [zfs-discuss] Scrub found error in metadata:0x0, is that always fatal? No checksum errors now...
On Mon, Dec 5, 2011 at 17:46, Jim Klimov wrote:
> So, in contrast with Nigel's optimistic theory that
> metadata is anyway extra-redundant and should be
> easily fixable, it seems that I do still have the
> problem. It does not show itself in practice as of
> yet, but is found by scrub ;)

Hmm. Interesting. I have re-scrubbed the pool that I referenced, and I haven't gotten another error in the metadata.

> After a few days to complete the current scrub,
> I plan to run zdb as asked by Steve. If anyone else
> has some theories, suggestions or requests to dig
> up more clues - bring them on! ;)

Perhaps the cause of the corruption is still active. The circumstances that led up to the discovery of the error are different for you and for me. The server that I encountered it on had been running fine for months; it was only after the crash/hang caused by attempting to add a bad drive to the pool that I encountered the issue, which in my case was found on boot.
Re: [zfs-discuss] Scrub found error in metadata:0x0, is that always fatal? No checksum errors now...
Well, I have an intermediate data point. One scrub run completed without finding any newer errors (besides the one at the pool level and the two at the raidz2 level). "zpool clear" alone did not fix it, meaning that the pool's <metadata>:<0x0> was still reported as problematic, but a second attempt at "zpool clear" did clear the errors from "zpool status".

Before running "zdb" as asked by other commenters, I decided to rescrub. At some point between 200GB and 1.7TB scanned, the errors returned to the stats.

So, in contrast with Nigel's optimistic theory that metadata is anyway extra-redundant and should be easily fixable, it seems that I do still have the problem. It does not show itself in practice as of yet, but is found by scrub ;)

After a few days to complete the current scrub, I plan to run zdb as asked by Steve. If anyone else has some theories, suggestions or requests to dig up more clues - bring them on! ;)

2011-12-02 20:08, Nigel W wrote:
> On Fri, Dec 2, 2011 at 02:58, Jim Klimov wrote:
>> My question still stands: is it possible to recover from this error
>> or somehow safely ignore it? ;) I mean, without backing up data and
>> recreating the pool?
>>
>> If the problem is in metadata but presumably the pool still works,
>> then this particular metadata is either not critical or redundant,
>> and somehow can be forged and replaced by valid metadata. Is this a
>> rightful path of thought? Are there any tools to remake such a
>> metadata block?
>>
>> Again, I did not try to export/reimport the pool yet, except for that
>> time 3 days ago when the machine hung, was reset and imported the pool
>> and continued the scrub automatically... I think it is now too late to
>> do an export and a rollback import, too...
>
> Unfortunately I cannot provide you with a direct answer, as I have only
> been a user of ZFS for about a year and in that time have only
> encountered this once.
>
> Anecdotally, at work I had something similar happen to a Nexenta Core
> 3.0 (b134) box three days ago (seemingly caused by a hang, then an
> eventual panic, as a result of attempting to add a drive with read
> failures to the pool). When the box came back up, zfs reported an error
> in metadata:0x0. We scrubbed the tank (~400GB used) and, like in your
> case, the checksum error didn't clear. We ran a scrub again, and it
> seems that the second scrub did clear the metadata error.
>
> I don't know if that means it will work that way for everyone, every
> time, or not. But considering that the pool and the data on it appear
> to be fine (just not having any replicas until we get the bad disk
> replaced), and that all metadata is supposed to have +1 copies (with an
> apparent max of 3 copies [1]) on the pool at all times, I can't see why
> this error shouldn't be cleared by a scrub.
>
> [1] http://blogs.oracle.com/relling/entry/zfs_copies_and_data_protection

//Jim
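P.S. For the archives, the sequence I ran boils down to this (nothing exotic; the puzzle is why the second "clear" behaved differently from the first, and why the rescrub brings the errors back):

    zpool clear pool        # first attempt: <metadata>:<0x0> still reported
    zpool clear pool        # second attempt: "zpool status" came up clean
    zpool scrub pool        # rescrub: the errors returned mid-scan
    zpool status -v pool    # where the pool-/raidz2-level counters show up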
Re: [zfs-discuss] Scrub found error in metadata:0x0, is that always fatal? No checksum errors now...
On Fri, Dec 2, 2011 at 02:58, Jim Klimov wrote:
> My question still stands: is it possible to recover
> from this error or somehow safely ignore it? ;)
> I mean, without backing up data and recreating the
> pool?
>
> If the problem is in metadata but presumably the
> pool still works, then this particular metadata
> is either not critical or redundant, and somehow
> can be forged and replaced by valid metadata.
> Is this a rightful path of thought?
>
> Are there any tools to remake such a metadata
> block?
>
> Again, I did not try to export/reimport the pool
> yet, except for that time 3 days ago when the
> machine hung, was reset and imported the pool
> and continued the scrub automatically...
>
> I think it is now too late to do an export and
> a rollback import, too...

Unfortunately I cannot provide you with a direct answer, as I have only been a user of ZFS for about a year and in that time have only encountered this once.

Anecdotally, at work I had something similar happen to a Nexenta Core 3.0 (b134) box three days ago (seemingly caused by a hang, then an eventual panic, as a result of attempting to add a drive with read failures to the pool). When the box came back up, zfs reported an error in metadata:0x0. We scrubbed the tank (~400GB used) and, like in your case, the checksum error didn't clear. We ran a scrub again, and it seems that the second scrub did clear the metadata error.

I don't know if that means it will work that way for everyone, every time, or not. But considering that the pool and the data on it appear to be fine (just not having any replicas until we get the bad disk replaced), and that all metadata is supposed to have +1 copies (with an apparent max of 3 copies [1]) on the pool at all times, I can't see why this error shouldn't be cleared by a scrub.

[1] http://blogs.oracle.com/relling/entry/zfs_copies_and_data_protection
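As an aside, my reading of the rule in that blog post boils down to the following (my paraphrase - an assumption on my part, not checked against the code):

    # copies=N (1..3) is the dataset property governing data blocks.
    # Dataset metadata is stored MIN(N + 1, 3) times, and pool-wide
    # metadata 3 times. To see the N that the rule starts from:
    zfs get copies pool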
Re: [zfs-discuss] Scrub found error in metadata:0x0, is that always fatal? No checksum errors now...
2011-12-02 18:25, Steve Gonczi wrote:
> Hi Jim,
>
> Try to run a "zdb -b poolname" .. This should report any leaked or
> double allocated blocks. (It may or may not run, it tends to run out
> of memory and crash on large datasets)
>
> I would be curious what zdb reports, and whether you are able to run
> it w/o crashing with "out of memory".

Ok, when/if it completes scrubbing the pool, I'll try that. But it is likely to fail, unless there are some new failsafe workarounds for such failures in oi_151a.

In the meanwhile, here are copies of zdb walks which I did a couple of weeks ago while repairing (finally replacing) the rpool on this box. At that time it was booted with the oi_148a LiveUSB.

Some of the walks (those WITH leak-checks not disabled) never completed:

root@openindiana:~# time zdb -bb -e 1601233584937321596
Traversing all blocks to verify nothing leaked ...
(box hung: LAN Disconnected; RAM/SWAP used up according to "vmstat 1")

root@openindiana:~# time zdb -bsvc -e 1601233584937321596
Traversing all blocks to verify checksums and verify nothing leaked ...
Assertion failed: zio_wait(zio_claim(0L, zcb->zcb_spa, refcnt ? 0 :
spa_first_txg(zcb->zcb_spa), bp, 0L, 0L, ZIO_FLAG_CANFAIL)) == 0
(0x2 == 0x0), file ../zdb.c, line 1950
Abort

real    7197m41.288s
user    291m39.256s
sys     25m48.133s

This took most of the week just to fail. And a walk without leak checks took half a day to find some discrepancies and "unreachable" blocks:

root@openindiana:~# time zdb -bsvL -e 1601233584937321596
Traversing all blocks ...

block traversal size 9044729487360 != alloc 9044729499648 (unreachable 12288)

        bp count:         85245222
        bp logical:    8891466103808    avg: 104304
        bp physical:   7985508591104    avg:  93676    compression: 1.11
        bp allocated: 12429007810560    avg: 145802    compression: 0.72
        bp deduped:    3384278323200    ref>1: 13909855    deduplication: 1.27
        SPA allocated: 9044729499648    used: 75.64%

Blocks   LSIZE   PSIZE   ASIZE     avg    comp   %Total  Type
     -       -       -       -       -       -       -  unallocated
     2     32K      4K   72.0K   36.0K    8.00    0.00  object directory
     3   1.50K   1.50K    108K   36.0K    1.00    0.00  object array
     2     32K   2.50K   72.0K   36.0K   12.80    0.00  packed nvlist
     -       -       -       -       -       -       -  packed nvlist size
 7.80K    988M    208M   1.12G    147K    4.75    0.01  bpobj
     -       -       -       -       -       -       -  bpobj header
     -       -       -       -       -       -       -  SPA space map header
  183K    753M    517M   6.49G   36.3K    1.46    0.06  SPA space map
    22   1020K   1020K   1.58M   73.6K    1.00    0.00  ZIL intent log
  933K   14.6G   3.11G   25.2G   27.6K    4.69    0.22  DMU dnode
 1.75K   3.50M    896K   42.0M   24.0K    4.00    0.00  DMU objset
     -       -       -       -       -       -       -  DSL directory
   390    243K    200K   13.7M   36.0K    1.21    0.00  DSL directory child map
   388    298K    208K   13.6M   36.0K    1.43    0.00  DSL dataset snap map
   715   10.2M   1.14M   25.1M   36.0K    8.92    0.00  DSL props
     -       -       -       -       -       -       -  DSL dataset
     -       -       -       -       -       -       -  ZFS znode
     -       -       -       -       -       -       -  ZFS V0 ACL
 76.1M   8.06T   7.25T   11.2T    150K    1.11   98.67  ZFS plain file
 2.17M   2.76G   1.33G   52.7G   24.3K    2.08    0.46  ZFS directory
   341    314K    171K   7.99M   24.0K    1.84    0.00  ZFS master node
   857   25.5M   1.16M   20.1M   24.1K   21.94    0.00  ZFS delete queue
     -       -       -       -       -       -       -  zvol object
     -       -       -       -       -       -       -  zvol prop
     -       -       -       -       -       -       -  other uint8[]
     -       -       -       -       -       -       -  other uint64[]
     -       -       -       -       -       -       -  other ZAP
     -       -       -       -       -       -       -  persistent error log
    33   4.02M    763K   4.46M    139K    5.39    0.00  SPA history
     -       -       -       -       -       -       -  SPA history offsets
     1     512     512   36.0K   36.0K    1.00    0.00  Pool properties
     -       -       -       -       -       -       -  DSL permissions
 17.1K   12.7M   8.63M    411M   24.0K    1.48    0.00  ZFS ACL
     -       -       -       -       -       -       -  ZFS SYSACL
     5   80.0K   5.00K    120K   24.0K   16.00    0.00  FUID table
     -       -       -       -       -       -       -  FUID table size
 1.37K    723K    705K   49.3M   36.0K    1.03    0.00  DSL dataset next clones
     -       -       -       -       -       -       -  scan work queue
 2.69K   2.57M   1.36M   64.6M   24.0K    1.89    0.00  ZFS user/group used
     -       -       -       -       -       -       -  ZFS user/group quota
     -       -
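If the leak-checking walks keep dying like the -bb and -bsvc runs above, one variant that might be worth a try - assuming the zdb build in oi_151a supports these flags - is to keep the -L from the run that did complete and add -AAA, so that assertion failures like the zio_claim() one are ignored rather than aborting the walk:

    # A sketch, not a verified recipe: -bb gathers block statistics,
    # -L skips the memory-hungry leak tracking, -AAA tells zdb to ignore
    # failed assertions and enable panic recovery, -e opens the pool by
    # GUID without importing it.
    zdb -AAA -bb -L -e 1601233584937321596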
Re: [zfs-discuss] Scrub found error in metadata:0x0, is that always fatal? No checksum errors now...
An intermediate update to my recent post:

2011-11-30 21:01, Jim Klimov wrote:
> Hello experts, I've finally upgraded my troublesome oi-148a home
> storage box to oi-151a about a week ago (using the pkg update method
> from the wiki page - I'm not certain if that repository is fixed at
> the release version or is a sliding "current" one).
>
> After the OS upgrade I scrubbed my main pool - a 6-disk raidz2 - and
> some checksum errors were discovered on individual disks, with one
> non-correctable error at the raid level. It named a file which was
> indeed not readable (IO errors), so I deleted it. The dataset
> pool/media has no snapshots, and dedup was disabled on it, so I hoped
> the error was gone.
>
> I cleared the errors (this only zeroed the counters, but still
> complained that there were some metadata errors in pool/media:0x4)
> and reran the scrub. While the scrub was running, zpool status
> reported this error and metadata:0x0. The computer got hung and reset
> during the scrub, but apparently resumed from the same spot. When the
> operation completed, however, it had zero checksum errors at both the
> disk and raid levels, and the pool/media error was gone, but the
> metadata:0x0 error is still in place.
>
> Searching the list archive I found a similar post relevant to snv_134
> and 135, and at that time Victor Latushkin suggested that the pool
> must be recreated. I have some unique data on the pool, so I'm
> reluctant to recreate it (besides, it's problematic to back up 10TB
> of data at home, and it could take weeks to try and upload it to my
> work - even if there were that much free space there, which there is
> not).
>
> So far I cleared the errors and started a new scrub. I kinda hope
> that if the box won't hang, it might discover that there are no
> actual errors indeed. I'll see that in about 100 hours. The pool is
> now imported and automounted, and I didn't yet try to export and
> reimport it.

The scrub is running slower this time, for a couple of days now and only nearing 25% completion (last timings were 89 and 101 hours). However, it seems to have confirmed some raidz-/pool-level checksum errors (without known individual-disk errors); what puzzles me more - there are 2 raidz-level errors for the one pool-level error:

# zpool status -v
  pool: pool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
  scan: scrub in progress since Wed Nov 30 19:38:47 2011
        1.97T scanned out of 8.34T at 13.6M/s, 135h54m to go
        0 repaired, 23.68% done
config:

        NAME        STATE     READ WRITE CKSUM
        pool        ONLINE       0     0     1
          raidz2-0  ONLINE       0     0     2
            c7t0d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
            c7t4d0  ONLINE       0     0     0
            c7t5d0  ONLINE       0     0     0
        cache
          c4t1d0p7  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>

My question still stands: is it possible to recover from this error or somehow safely ignore it? ;) I mean, without backing up data and recreating the pool?

If the problem is in metadata but presumably the pool still works, then this particular metadata is either not critical or redundant, and somehow can be forged and replaced by valid metadata. Is this a rightful path of thought? Are there any tools to remake such a metadata block?

Again, I did not try to export/reimport the pool yet, except for that time 3 days ago when the machine hung, was reset, and imported the pool and continued the scrub automatically... I think it is now too late to do an export and a rollback import, too...

Still, I'd like to estimate now: what are my chances of living on without recreating the pool or losing data? Perhaps there are some ways to actually check, fix or forge the needed metadata?

Also, a zdb walk previously found some inconsistencies (allocated != referred); can that be better diagnosed or repaired? Can this discrepancy, of a few sectors' worth of size, be a cause of - or be caused by - that reported metadata error?

Thanks,
// Jim Klimov

sent from a mobile, pardon any typos ,)
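P.S. For completeness, the rollback import I keep mentioning would look roughly like this - a sketch only, and probably futile by now, since many txgs have passed since the corruption appeared:

    # Recovery-mode import: ask ZFS to discard the last few transaction
    # groups and come back at an older, hopefully intact state. Whether
    # -F can still rewind far enough here is exactly what I doubt.
    zpool export pool
    zpool import -F pool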