Re: [zfs-discuss] Scrub found error in metadata:0x0, is that always fatal? No checksum errors now...

2011-12-17 Thread Jim Klimov

Another update to my post: it took about a week to try running
some ZDB walks on my pool, but they core-dumped after a while.

However, I've also noticed some clues in my FMADM outputs
dating from 'zpool scrub' attempts. There are several sets
(one set per scrub) of similar error reports, differing only
in timestamps, ena and __tod fields, as far as I could see.
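
(For reference, this is roughly how I pulled them out of the FMA
error log; the fmdump class filter and time options are quoted from
memory, so check fmdump(1M) if they don't fly on your build:)

  # verbose nvlist dump of the FMA error log, limited to the ZFS ereport classes
  fmdump -eV -c ereport.fs.zfs.checksum
  fmdump -eV -c ereport.fs.zfs.data
  # or narrow it down to the time window of one particular scrub, e.g.:
  fmdump -eV -t "12/01/11 08:00:00" -T "12/01/11 10:00:00"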

I hope someone can glance over them and point me in the right
direction - i.e. which on-disk block I might want to extract
for analysis and/or forge-and-replace, to get my pool back to
a state where no errors are reported...

As a reminder, this is a 6-disk raidz2 pool with ashift=12.
During a recent scrub there were zero errors counted on the
individual disks, but one checksum error at the pool level
and two at the raidz2 vdev level. The pool and vdevs are
reported as ONLINE, and no errors have been noticed during
normal use (though I haven't intentionally written anything
to the pool since; it was only mounted read-write for a few
bootups), but a metadata error is reported:

  NAME         STATE     READ WRITE CKSUM
  pool         ONLINE       0     0     1
    raidz2-0   ONLINE       0     0     2
      c7t0d0   ONLINE       0     0     0
      c7t1d0   ONLINE       0     0     0
      c7t2d0   ONLINE       0     0     0
      c7t3d0   ONLINE       0     0     0
      c7t4d0   ONLINE       0     0     0
      c7t5d0   ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:
        metadata:0x0

Below is one set of such FMADM reports. There are two
different zio_offset values: 4 lines with 0x6ecb164000 and
8 lines with 0x6ecb163000, both sized 0x8000. I believe
these might be the addresses of the mismatching block(s);
but now - how do I locate them on-disk to try and
match/analyze/forge/etc.?

Is the offset relative to the pool, or to the individual
disks? Any ideas? ;)
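
For what it's worth, my working assumption is that a leaf-vdev
ereport's zio_offset is relative to that disk's allocatable area,
which starts 4MB (0x400000) into the slice, after the two front
labels and the boot block. If that holds, something like the sketch
below should pull the suspect 0x8000 bytes off one of the named
disks for inspection; the zdb -R syntax in particular is quoted
from memory, so double-check the usage message before trusting it:

  # example values taken from the ereports below; repeat per disk as needed
  OFF=0x6ecb164000             # zio_offset from the ereport (vdev-relative, I assume)
  SZ=0x8000                    # zio_size from the ereport
  PHYS=$(( OFF + 0x400000 ))   # skip the 4MB of front labels + boot area
  dd if=/dev/rdsk/c7t1d0s0 of=/tmp/c7t1d0s0.blk bs=512 \
     iseek=$(( PHYS / 512 )) count=$(( SZ / 512 ))

  # alternatively, zdb can read a block through the pool, given vdev:offset:size
  # (offset vdev-relative, no 0x400000 added); the vdev spec may need to name the
  # leaf under the raidz (possibly 0.1 for child 1) -- check zdb's usage message
  zdb -R pool 0.1:0x6ecb164000:0x8000:r > /tmp/c7t1d0s0.zdb.blk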

Dec 01 2011 08:53:29.437777177 ereport.fs.zfs.data
nvlist version: 0
class = ereport.fs.zfs.data
ena = 0x7a43652cd2c00401
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = zfs
pool = 0x1638b976389d447c
(end detector)

pool = pool
pool_guid = 0x1638b976389d447c
pool_context = 0
pool_failmode = continue
zio_err = 50
zio_objset = 0x0
zio_object = 0x0
zio_level = 0
zio_blkid = 0x0
__ttl = 0x1
__tod = 0x4ed70849 0x1a17f319

Dec 01 2011 08:53:29.437774722 ereport.fs.zfs.checksum
nvlist version: 0
class = ereport.fs.zfs.checksum
ena = 0x7a43652cd2c00401
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = zfs
pool = 0x1638b976389d447c
vdev = 0x6abe377b60fc48e5
(end detector)

pool = pool
pool_guid = 0x1638b976389d447c
pool_context = 0
pool_failmode = continue
vdev_guid = 0x6abe377b60fc48e5
vdev_type = disk
vdev_path = /dev/dsk/c7t1d0s0
vdev_devid = id1,sd@SATA_ST2000DL003-9VT15YD1XWWB/a
parent_guid = 0x53d15735fa4c6d21
parent_type = raidz
zio_err = 50
zio_offset = 0x6ecb164000
zio_size = 0x8000
zio_objset = 0x0
zio_object = 0x0
zio_level = 0
zio_blkid = 0x0
__ttl = 0x1
__tod = 0x4ed70849 0x1a17e982

Dec 01 2011 08:53:29.437774091 ereport.fs.zfs.checksum
nvlist version: 0
class = ereport.fs.zfs.checksum
ena = 0x7a43652cd2c00401
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = zfs
pool = 0x1638b976389d447c
vdev = 0xa5d72d5c3c698a85
(end detector)

pool = pool
pool_guid = 0x1638b976389d447c
pool_context = 0
pool_failmode = continue
vdev_guid = 0xa5d72d5c3c698a85
vdev_type = disk
vdev_path = /dev/dsk/c7t0d0s0
vdev_devid = id1,sd@SATA_ST2000DL003-9VT15YD217ZL/a
parent_guid = 0x53d15735fa4c6d21
parent_type = raidz
zio_err = 50
zio_offset = 0x6ecb164000
zio_size = 0x8000
zio_objset = 0x0
zio_object = 0x0
zio_level = 0
zio_blkid = 0x0
__ttl = 0x1
__tod = 0x4ed70849 0x1a17e70b

Dec 01 2011 08:53:29.437772910 ereport.fs.zfs.checksum
nvlist version: 0
class = ereport.fs.zfs.checksum
ena = 0x7a43652cd2c00401
detector = (embedded nvlist)
nvlist version: 0
version = 0x0
scheme = zfs
pool = 0x1638b976389d447c
vdev = 0x395cf609609d8846
(end detector)

pool = pool
pool_guid = 0x1638b976389d447c
pool_context = 0
pool_failmode = continue
vdev_guid = 0x395cf609609d8846
vdev_type = disk
vdev_path = /dev/dsk/c7t5d0s0
vdev_devid = id1,sd@SATA_ST2000DL003-9VT15YD24GDG/a
parent_guid = 0x53d15735fa4c6d21
parent_type = raidz

Re: [zfs-discuss] Scrub found error in metadata:0x0, is that always fatal? No checksum errors now...

2011-12-08 Thread Nigel W
On Mon, Dec 5, 2011 at 17:46, Jim Klimov jimkli...@cos.ru wrote:
 So, in contrast with Nigel's optimistic theory that
 metadata is anyway extra-redundant and should be
 easily fixable, it seems that I do still have the
 problem. It does not show itself in practice as of
 yet, but is found by scrub ;)

Hmm. Interesting.

I have re-scrubbed the pool that I referenced and haven't gotten
another error in the metadata.

 After a few days to complete the current scrub,
 I plan to run zdb as asked by Steve. If anyone else
 has some theories, suggestions or requests to dig
 up more clues - bring them on! ;)

Perhaps the cause of the corruption is still active.

The circumstances that led up to the discovery of the error are
different for you and me. The server that I encountered it on had
been running fine for months; it was only after the crash/hang
caused by attempting to add a bad drive to the pool that I ran
into the issue, which in my case was found on boot.


Re: [zfs-discuss] Scrub found error in metadata:0x0, is that always fatal? No checksum errors now...

2011-12-05 Thread Jim Klimov

Well, I have an intermediate data point. One scrub run
completed without finding any newer errors (besides the one
at the pool level and the two at the raidz2 level).

A single 'zpool clear' did not fix it, meaning that
metadata:0x0 was still reported as problematic for the pool,
but a second 'zpool clear' did remove the errors from
'zpool status'.
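
For the record, the sequence was roughly this, nothing fancier:

  zpool scrub pool
  zpool status -v pool    # metadata:0x0 still listed once the scrub completed
  zpool clear pool
  zpool status -v pool    # still listed after the first clear
  zpool clear pool
  zpool status -v pool    # clean after the second clear... until the rescrub below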

Before running zdb as asked by other commenters, I decided
to rescrub. At some point between 200GB and 1.7TB scanned,
the errors returned to the stats.

So, in contrast with Nigel's optimistic theory that
metadata is anyway extra-redundant and should be
easily fixable, it seems that I do still have the
problem. It does not show itself in practice as of
yet, but is found by scrub ;)

After a few days to complete the current scrub,
I plan to run zdb as asked by Steve. If anyone else
has some theories, suggestions or requests to dig
up more clues - bring them on! ;)


2011-12-02 20:08, Nigel W wrote:

On Fri, Dec 2, 2011 at 02:58, Jim Klimov jimkli...@cos.ru wrote:

My question still stands: is it possible to recover
from this error or somehow safely ignore it? ;)
I mean, without backing up data and recreating the
pool?

If the problem is in metadata but presumably the
pool still works, then this particular metadata
is either not critical or redundant, and somehow
can be forged and replaced by valid metadata.
Is this a reasonable line of thought?

Are there any tools to remake such a metadata
block?

Again, I did not try to export/reimport the pool
yet, except for that time 3 days ago when the
machine hung, was reset and imported the pool
and continued the scrub automatically...

I think it is now too late to do an export and
a rollback import, too...



Unfortunately I cannot provide you with a direct answer as I have only
been a user of ZFS for about a year and in that time only encountered
this once.

Anecdotally, at work I had something similar happen to a Nexenta Core
3.0 (b134) box three days ago (seemingly caused by a hang and eventual
panic that resulted from attempting to add a drive with read failures
to the pool).  When the box came back up, ZFS reported an error in
metadata:0x0.  We scrubbed the tank (~400GB used) and, like in your
case, the checksum error didn't clear.  We ran a scrub again, and it
seems that the second scrub did clear the metadata error.

I don't know if that means it will work that way for everyone, every
time, or not.  But considering that the pool and the data on it
appear to be fine (just without any replicas until we get the bad
disk replaced), and that all metadata is supposed to have copies+1
copies (with an apparent maximum of 3 copies [1]) on the pool at all
times, I can't see why this error shouldn't be cleared by a scrub.

[1] http://blogs.oracle.com/relling/entry/zfs_copies_and_data_protection


//Jim


Re: [zfs-discuss] Scrub found error in metadata:0x0, is that always fatal? No checksum errors now...

2011-12-02 Thread Jim Klimov

2011-12-02 18:25, Steve Gonczi wrote:

Hi Jim,

Try to run a 'zdb -b poolname'.
This should report any leaked or double-allocated blocks.
(It may or may not run; it tends to run out of memory and crash on large
datasets.)
I would be curious what zdb reports, and whether you are able to run it
without it running out of memory and crashing.



OK, when/if the pool finishes scrubbing, I'll try that.
But it is likely to fail, unless oi_151a has some new failsafe
workarounds for such failures.

In the meantime, here are copies of zdb walks which I did
a couple of weeks ago while repairing (and finally replacing)
the rpool on this box. At that time it was booted from an
oi_148a LiveUSB. Some of the walks (those with leak checking
enabled) never completed:

root@openindiana:~# time zdb -bb -e 1601233584937321596

Traversing all blocks to verify nothing leaked ...

(box hung: LAN Disconnected; RAM/SWAP used up according to vmstat 1)



root@openindiana:~# time zdb -bsvc -e 1601233584937321596
Traversing all blocks to verify checksums and verify nothing leaked ...

Assertion failed: zio_wait(zio_claim(0L, zcb->zcb_spa, refcnt ? 0 :
spa_first_txg(zcb->zcb_spa), bp, 0L, 0L, ZIO_FLAG_CANFAIL)) == 0 (0x2 ==
0x0), file ../zdb.c, line 1950

Abort

real    7197m41.288s
user     291m39.256s
sys       25m48.133s

This took most of the week just to fail.

And a walk without leak checks took half a day to find
some discrepancies and unreachable blocks:

root@openindiana:~# time zdb -bsvL -e 1601233584937321596
Traversing all blocks ...
block traversal size 9044729487360 != alloc 9044729499648 (unreachable 12288)

bp count:            85245222
bp logical:     8891466103808   avg: 104304
bp physical:    7985508591104   avg:  93676   compression:   1.11
bp allocated:  12429007810560   avg: 145802   compression:   0.72
bp deduped:     3384278323200   ref>1: 13909855   deduplication:   1.27

SPA allocated:  9044729499648   used: 75.64%

Blocks   LSIZE   PSIZE   ASIZE     avg    comp  %Total  Type
     -       -       -       -       -       -       -  unallocated
     2     32K      4K   72.0K   36.0K    8.00    0.00  object directory
     3   1.50K   1.50K    108K   36.0K    1.00    0.00  object array
     2     32K   2.50K   72.0K   36.0K   12.80    0.00  packed nvlist
     -       -       -       -       -       -       -  packed nvlist size
 7.80K    988M    208M   1.12G    147K    4.75    0.01  bpobj
     -       -       -       -       -       -       -  bpobj header
     -       -       -       -       -       -       -  SPA space map header
  183K    753M    517M   6.49G   36.3K    1.46    0.06  SPA space map
    22   1020K   1020K   1.58M   73.6K    1.00    0.00  ZIL intent log
  933K   14.6G   3.11G   25.2G   27.6K    4.69    0.22  DMU dnode
 1.75K   3.50M    896K   42.0M   24.0K    4.00    0.00  DMU objset
     -       -       -       -       -       -       -  DSL directory
   390    243K    200K   13.7M   36.0K    1.21    0.00  DSL directory child map
   388    298K    208K   13.6M   36.0K    1.43    0.00  DSL dataset snap map
   715   10.2M   1.14M   25.1M   36.0K    8.92    0.00  DSL props
     -       -       -       -       -       -       -  DSL dataset
     -       -       -       -       -       -       -  ZFS znode
     -       -       -       -       -       -       -  ZFS V0 ACL
 76.1M   8.06T   7.25T   11.2T    150K    1.11   98.67  ZFS plain file
 2.17M   2.76G   1.33G   52.7G   24.3K    2.08    0.46  ZFS directory
   341    314K    171K   7.99M   24.0K    1.84    0.00  ZFS master node
   857   25.5M   1.16M   20.1M   24.1K   21.94    0.00  ZFS delete queue
     -       -       -       -       -       -       -  zvol object
     -       -       -       -       -       -       -  zvol prop
     -       -       -       -       -       -       -  other uint8[]
     -       -       -       -       -       -       -  other uint64[]
     -       -       -       -       -       -       -  other ZAP
     -       -       -       -       -       -       -  persistent error log
    33   4.02M    763K   4.46M    139K    5.39    0.00  SPA history
     -       -       -       -       -       -       -  SPA history offsets
     1     512     512   36.0K   36.0K    1.00    0.00  Pool properties
     -       -       -       -       -       -       -  DSL permissions
 17.1K   12.7M   8.63M    411M   24.0K    1.48    0.00  ZFS ACL
     -       -       -       -       -       -       -  ZFS SYSACL
     5   80.0K   5.00K    120K   24.0K   16.00    0.00  FUID table
     -       -       -       -       -       -       -  FUID table size
 1.37K    723K    705K   49.3M   36.0K    1.03    0.00  DSL dataset next clones
     -       -       -       -       -       -       -  scan work queue
 2.69K   2.57M   1.36M   64.6M   24.0K    1.89    0.00  ZFS user/group used
     -       -       -       -       -       -       -  ZFS user/group quota
     -       -       -       -

Re: [zfs-discuss] Scrub found error in metadata:0x0, is that always fatal? No checksum errors now...

2011-12-02 Thread Nigel W
On Fri, Dec 2, 2011 at 02:58, Jim Klimov jimkli...@cos.ru wrote:
 My question still stands: is it possible to recover
 from this error or somehow safely ignore it? ;)
 I mean, without backing up data and recreating the
 pool?

 If the problem is in metadata but presumably the
 pool still works, then this particular metadata
 is either not critical or redundant, and somehow
 can be forged and replaced by valid metadata.
 Is this a reasonable line of thought?

 Are there any tools to remake such a metadata
 block?

 Again, I did not try to export/reimport the pool
 yet, except for that time 3 days ago when the
 machine hung, was reset and imported the pool
 and continued the scrub automatically...

 I think it is now too late to do an export and
 a rollback import, too...


Unfortunately I cannot provide you with a direct answer as I have only
been a user of ZFS for about a year and in that time only encountered
this once.

Anecdotally, at work I had something similar happen to a Nexenta Core
3.0 (b134) box three days ago (seemingly caused by a hang and eventual
panic that resulted from attempting to add a drive with read failures
to the pool).  When the box came back up, ZFS reported an error in
metadata:0x0.  We scrubbed the tank (~400GB used) and, like in your
case, the checksum error didn't clear.  We ran a scrub again, and it
seems that the second scrub did clear the metadata error.

I don't know if that means it will work that way for everyone, every
time, or not.  But considering that the pool and the data on it
appear to be fine (just without any replicas until we get the bad
disk replaced), and that all metadata is supposed to have copies+1
copies (with an apparent maximum of 3 copies [1]) on the pool at all
times, I can't see why this error shouldn't be cleared by a scrub.

[1] http://blogs.oracle.com/relling/entry/zfs_copies_and_data_protection
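
If you want to see those extra copies for yourself, something like the
following should show them; I'm quoting the zdb flags from memory (and
the dataset/object below are made-up examples), so double-check against
the usage message:

  # 'copies' only governs data blocks; metadata always gets at least one extra
  # ditto copy (up to 3 in total), so metadata block pointers should list 2-3 DVAs
  zfs get copies tank
  zdb -dddddd tank/somefs 1234   # hypothetical dataset/object; at this verbosity
                                 # the "Indirect blocks" section prints every DVA
                                 # of each block pointer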