I've had some success.

I started with the ZFS on-disk format PDF.

http://opensolaris.org/os/community/zfs/docs/ondiskformat0822.pdf

The uberblocks all have the magic value 0x00bab10c. I used od -x to find that 
value in the vdev. od prints little-endian 16-bit words here, so the magic 
shows up as "b10c 00ba".

r...@opensolaris:~# od -A x -x /mnt/zpool.zones | grep "b10c 00ba"
020000 b10c 00ba 0000 0000 0004 0000 0000 0000
020400 b10c 00ba 0000 0000 0004 0000 0000 0000
020800 b10c 00ba 0000 0000 0004 0000 0000 0000
020c00 b10c 00ba 0000 0000 0004 0000 0000 0000
021000 b10c 00ba 0000 0000 0004 0000 0000 0000
021400 b10c 00ba 0000 0000 0004 0000 0000 0000
021800 b10c 00ba 0000 0000 0004 0000 0000 0000
021c00 b10c 00ba 0000 0000 0004 0000 0000 0000
022000 b10c 00ba 0000 0000 0004 0000 0000 0000
022400 b10c 00ba 0000 0000 0004 0000 0000 0000
...

So the uberblock array begins 128kB (0x20000) into the vdev and there's an 
uberblock every 1kB (0x400).
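
For anyone who would rather script the scan than eyeball od output, here is a rough Python sketch of the same search. It assumes a little-endian pool, a label that starts at offset 0 of the buffer, and the 128kB/1kB layout seen above. The name find_uberblocks and the constants are made up for illustration and are not part of any ZFS tool.

```python
import struct

UBERBLOCK_MAGIC = 0x00bab10c   # magic value from the on-disk format document
ARRAY_START = 128 * 1024       # uberblock array begins 128kB into the label
SLOT_SIZE = 1024               # one uberblock slot every 1kB, as observed above
ARRAY_SIZE = 128 * 1024        # assumption: the array fills the rest of the 256kB label

def find_uberblocks(label):
    """Return offsets (within one label) of slots that start with the magic."""
    hits = []
    end = min(len(label), ARRAY_START + ARRAY_SIZE)
    for off in range(ARRAY_START, end - 7, SLOT_SIZE):
        (magic,) = struct.unpack_from("<Q", label, off)  # little-endian 64-bit
        if magic == UBERBLOCK_MAGIC:
            hits.append(off)
    return hits
```

To try it, read the first 256kB of the vdev file in binary mode and pass the bytes to find_uberblocks. The offsets it returns should line up with the addresses od printed.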

To identify the active uberblock I used zdb.

r...@kestrel:/opt$ zdb -U -uuuv zones
Uberblock
        magic = 0000000000bab10c
        version = 4
        txg = 1504158 (= 0x16F39E) 
        guid_sum = 10365405068077835008 = (0x8FD950FDBBD02300)
        timestamp = 1229142108 UTC = Sat Dec 13 15:21:48 2008 = (0x4943385C)
        rootbp = [L0 DMU objset] 400L/200P DVA[0]=<0:52e3edc00:200> 
DVA[1]=<0:6f9c1d600:200> DVA[2]=<0:16e280400:200> fletcher4 lzjb LE contiguous 
birth=1504158 fill=172 cksum=b0a5275f3:474e0ed6469:e993ed9bee4d:205661fa1d4016

I spy those hex values in the uberblock starting at 027800.

027800 b10c 00ba 0000 0000 0004 0000 0000 0000
027810 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
027820 385c 4943 0000 0000 0001 0000 0000 0000
027830 1f6e 0297 0000 0000 0001 0000 0000 0000
027840 e0eb 037c 0000 0000 0001 0000 0000 0000
027850 1402 00b7 0000 0000 0001 0000 0703 800b
027860 0000 0000 0000 0000 0000 0000 0000 0000
027870 0000 0000 0000 0000 f39e 0016 0000 0000
027880 00ac 0000 0000 0000 75f3 0a52 000b 0000
027890 6469 e0ed 0474 0000 ee4d ed9b e993 0000
0278a0 4016 fa1d 5661 0020 0000 0000 0000 0000
0278b0 0000 0000 0000 0000 0000 0000 0000 0000

Breaking it down

* the first 8 bytes are the magic uberblock number (b10c 00ba 0000 0000)
* the second 8 bytes are the version number (0004 0000 0000 0000)
* the third 8 bytes are the transaction group a.k.a txg (f39e 0016 0000 0000)
* the fourth 8 bytes are the guid sum (2300 bbd0 50fd 8fd9)
* the fifth 8 bytes are the timestamp (385c 4943 0000 0000)

The remainder of the bytes are the "blkptr" structure and I'll ignore them.
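
The breakdown above can be double-checked with a few lines of Python. This is only a sketch, assuming od ran on a little-endian host (so each 16-bit word is byte-swapped) and that the five leading fields are 64-bit little-endian integers. The helper names words_to_bytes and parse_uberblock_header are invented for this example.

```python
import struct
from datetime import datetime, timezone

def words_to_bytes(od_words):
    """Turn `od -x` words (address column removed) back into raw bytes.

    Assumes a little-endian host, so each printed word is byte-swapped.
    """
    return b"".join(int(w, 16).to_bytes(2, "little") for w in od_words.split())

def parse_uberblock_header(raw):
    """Decode the five leading 64-bit little-endian fields of an uberblock."""
    magic, version, txg, guid_sum, timestamp = struct.unpack_from("<5Q", raw, 0)
    return {"magic": magic, "version": version, "txg": txg,
            "guid_sum": guid_sum, "timestamp": timestamp}

# First 40 bytes of the dump at 027800.
raw = words_to_bytes(
    "b10c 00ba 0000 0000 0004 0000 0000 0000 "
    "f39e 0016 0000 0000 2300 bbd0 50fd 8fd9 "
    "385c 4943 0000 0000"
)
hdr = parse_uberblock_header(raw)
print(hex(hdr["magic"]), hdr["version"], hdr["txg"], hex(hdr["guid_sum"]))
print(datetime.fromtimestamp(hdr["timestamp"], tz=timezone.utc))
```

The printed values should agree with the magic, version, txg, guid_sum and timestamp lines in the zdb output.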

Those values match the active uberblock exactly, so I know this is the on-disk 
location of the first copy of the active uberblock.
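
If zdb is not an option, the active uberblock can usually be spotted as the slot with the highest txg, since ZFS imports from the valid uberblock with the highest transaction group. The sketch below makes that guess for one label. It assumes the same little-endian layout as above and deliberately ignores the checksum, so treat its answer as a hint rather than proof. active_uberblock is a hypothetical helper name.

```python
import struct

UBERBLOCK_MAGIC = 0x00bab10c
ARRAY_START = 128 * 1024   # uberblock array offset within a label
SLOT_SIZE = 1024           # slot spacing observed in this pool

def active_uberblock(label):
    """Return (offset, txg, timestamp) of the highest-txg slot, or None.

    NOTE: the checksum is not verified, so a torn write could fool this.
    """
    best = None
    for off in range(ARRAY_START, len(label) - 39, SLOT_SIZE):
        magic, _version, txg, _guid_sum, ts = struct.unpack_from("<5Q", label, off)
        if magic != UBERBLOCK_MAGIC:
            continue  # empty or damaged slot
        if best is None or (txg, ts) > (best[1], best[2]):
            best = (off, txg, ts)
    return best
```

Fed the first 256kB of this vdev, it would be expected to point at the slot at 0x27800 if that slot really holds the highest txg.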

Scanning further I find an exact duplicate 256kB later in the device.

067800 b10c 00ba 0000 0000 0004 0000 0000 0000
067810 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
067820 385c 4943 0000 0000 0001 0000 0000 0000
067830 1f6e 0297 0000 0000 0001 0000 0000 0000
067840 e0eb 037c 0000 0000 0001 0000 0000 0000
067850 1402 00b7 0000 0000 0001 0000 0703 800b
067860 0000 0000 0000 0000 0000 0000 0000 0000
067870 0000 0000 0000 0000 f39e 0016 0000 0000
067880 00ac 0000 0000 0000 75f3 0a52 000b 0000
067890 6469 e0ed 0474 0000 ee4d ed9b e993 0000
0678a0 4016 fa1d 5661 0020 0000 0000 0000 0000
0678b0 0000 0000 0000 0000 0000 0000 0000 0000

I know ZFS keeps four copies of the vdev label: two at the front and two at the 
back, each 256kB in size.

r...@opensolaris:~# ls -l /mnt/zpool.zones 
-rw-r--r-- 1 root root 42949672960 Dec 15 04:49 /mnt/zpool.zones

That's 0xA00000000 = 42949672960 = 41943040kB. If I subtract 512kB I should see 
the third and fourth labels.
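
The arithmetic is easy to get wrong by hand, so here is a small sketch that derives the four label offsets from the file size. It assumes the size is an exact multiple of 256kB, which is true for this file. label_offsets is just an illustrative name.

```python
LABEL_SIZE = 256 * 1024   # each vdev label is 256kB

def label_offsets(device_size):
    """Byte offsets of the four labels: two at the front, two at the back.

    Assumes device_size is a multiple of 256kB.
    """
    return [0, LABEL_SIZE,
            device_size - 2 * LABEL_SIZE, device_size - LABEL_SIZE]

size = 42949672960   # from ls -l
for n, off in enumerate(label_offsets(size)):
    print(f"label {n}: byte {off:#x} = {off // 1024} kB")
```

The third label should come out at 41942528 kB, which is the skip value to hand to dd with bs=1k.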

r...@opensolaris:~# dd if=/mnt/zpool.zones bs=1k skip=41942528 | od -A x -x | 
grep "385c 4943 0000 0000"
027820 385c 4943 0000 0000 0001 0000 0000 0000
512+0 records in
512+0 records out
524288 bytes (524 kB) copied, 0.0577013 s, 9.1 MB/s
r...@opensolaris:~# 

Oddly enough I see the third copy of the uberblock at 0x27800 (relative to the 
start of the last 512kB), but the fourth copy, which I expected at 0x67800, is 
missing. Perhaps corrupted?

No matter. I now work out the exact offsets to the three valid uberblocks and 
confirm I'm looking at the right uberblocks.
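
For the record, the skip values come from converting the byte offsets found earlier into 1kB blocks. A quick sketch of that conversion follows. bytes_to_dd_kb and TAIL_START_KB are names invented here, and the numbers are the ones from the scans above.

```python
def bytes_to_dd_kb(byte_offset):
    """Convert a byte offset into a block count for dd with bs=1k."""
    if byte_offset % 1024:
        raise ValueError("offset is not 1kB-aligned")
    return byte_offset // 1024

TAIL_START_KB = 41942528   # skip= used earlier to read the last 512kB

first = bytes_to_dd_kb(0x27800)                   # copy in the first label
second = bytes_to_dd_kb(0x67800)                  # copy in the second label
third = TAIL_START_KB + bytes_to_dd_kb(0x27800)   # copy found in the tail scan
print(first, second, third)   # 158 414 41942686
```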

r...@opensolaris:~# dd if=/mnt/zpool.zones bs=1k skip=158 | od -A x -x | head -3
000000 b10c 00ba 0000 0000 0004 0000 0000 0000
000010 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
000020 385c 4943 0000 0000 0001 0000 0000 0000
r...@opensolaris:~# dd if=/mnt/zpool.zones bs=1k skip=414 | od -A x -x | head -3
000000 b10c 00ba 0000 0000 0004 0000 0000 0000
000010 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
000020 385c 4943 0000 0000 0001 0000 0000 0000
r...@opensolaris:~# dd if=/mnt/zpool.zones bs=1k skip=41942686 | od -A x -x | 
head -3
000000 b10c 00ba 0000 0000 0004 0000 0000 0000
000010 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
000020 385c 4943 0000 0000 0001 0000 0000 0000

They all have the same timestamp. I'm looking at the correct uberblocks. Now I 
intentionally destroy them, so that the import has to fall back to an earlier 
uberblock.

r...@opensolaris:/mnt# dd if=/dev/zero of=/mnt/zpool.zones bs=1k seek=158 count=1 conv=notrunc
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.000315229 s, 3.2 MB/s
r...@opensolaris:/mnt# dd if=/dev/zero of=/mnt/zpool.zones bs=1k seek=414 count=1 conv=notrunc
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 3.5e-08 s, 29.3 GB/s
r...@opensolaris:/mnt# dd if=/dev/zero of=/mnt/zpool.zones bs=1k seek=41942686 count=1 conv=notrunc
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.00192728 s, 531 kB/s
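
The same overwrite can be sketched in Python, which makes it a little harder to truncate the file by accident. This is a destructive operation, so it should only ever be pointed at a copy of the vdev. zero_slot is a hypothetical helper, and the bounds check is an extra safety measure that dd does not perform.

```python
import os

SLOT_SIZE = 1024   # one uberblock slot, as observed in this pool

def zero_slot(path, skip_kb):
    """Overwrite one 1kB slot with zeros, leaving the file length unchanged."""
    offset = skip_kb * 1024
    if offset + SLOT_SIZE > os.path.getsize(path):
        raise ValueError("slot lies beyond the end of the file")
    with open(path, "r+b") as f:   # r+b does not truncate, like conv=notrunc
        f.seek(offset)
        f.write(b"\0" * SLOT_SIZE)
        f.flush()
        os.fsync(f.fileno())       # push the write to disk before importing
```

Calling zero_slot on a copy of the file with 158, 414 and 41942686 in turn would mirror the three dd commands.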

And... fingers crossed...

r...@opensolaris:/mnt# zpool import -d /mnt -f zones
r...@opensolaris:/mnt# 

Huzzah, the import worked.

r...@opensolaris:/mnt# zpool status
  pool: zones
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on older software versions.
 scrub: none requested
config:

        NAME                STATE     READ WRITE CKSUM
        zones               ONLINE       0     0     0
          /mnt/zpool.zones  ONLINE       0     0     0

errors: No known data errors

And my filesystems are back.

r...@opensolaris:/mnt# zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
zones            23.7G  15.5G    27K  /zones
zones/appserver  1.69G  15.5G  5.55G  /zones/appserver
zones/base        847M  15.5G  4.20G  /zones/base
zones/centos     1.35G  15.5G  1.34G  /zones/centos
zones/cgiserver  2.43G  15.5G  6.24G  /zones/cgiserver
zones/ds1        5.47G  15.5G  3.91G  /zones/ds1
zones/ds2         616M  15.5G  3.88G  /zones/ds2
zones/webserver  11.3G  15.5G  15.1G  /zones/webserver

Initial inspection of the filesystems is promising. I can read from files, 
there are no panics, everything seems to be intact.

I hope this helps other people recover corrupted zpools, until such time as 
there are tools to automate this process.