I've had some success. I started with the ZFS on-disk format PDF.
http://opensolaris.org/os/community/zfs/docs/ondiskformat0822.pdf

The uberblocks all have magic value 0x00bab10c. Used od -x to find that value in the vdev.

r...@opensolaris:~# od -A x -x /mnt/zpool.zones | grep "b10c 00ba"
020000 b10c 00ba 0000 0000 0004 0000 0000 0000
020400 b10c 00ba 0000 0000 0004 0000 0000 0000
020800 b10c 00ba 0000 0000 0004 0000 0000 0000
020c00 b10c 00ba 0000 0000 0004 0000 0000 0000
021000 b10c 00ba 0000 0000 0004 0000 0000 0000
021400 b10c 00ba 0000 0000 0004 0000 0000 0000
021800 b10c 00ba 0000 0000 0004 0000 0000 0000
021c00 b10c 00ba 0000 0000 0004 0000 0000 0000
022000 b10c 00ba 0000 0000 0004 0000 0000 0000
022400 b10c 00ba 0000 0000 0004 0000 0000 0000
...

So the uberblock array begins 128kB into the vdev and there's an uberblock every 1kB.

To identify the active uberblock I used zdb.

r...@kestrel:/opt$ zdb -U -uuuv zones
Uberblock
        magic = 0000000000bab10c
        version = 4
        txg = 1504158 (= 0x16F39E)
        guid_sum = 10365405068077835008 = (0x8FD950FDBBD02300)
        timestamp = 1229142108 UTC = Sat Dec 13 15:21:48 2008 = (0x4943385C)
        rootbp = [L0 DMU objset] 400L/200P DVA[0]=<0:52e3edc00:200> DVA[1]=<0:6f9c1d600:200> DVA[2]=<0:16e280400:200> fletcher4 lzjb LE contiguous birth=1504158 fill=172 cksum=b0a5275f3:474e0ed6469:e993ed9bee4d:205661fa1d4016

I spy those hex values at the uberblock starting 027800.

027800 b10c 00ba 0000 0000 0004 0000 0000 0000
027810 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
027820 385c 4943 0000 0000 0001 0000 0000 0000
027830 1f6e 0297 0000 0000 0001 0000 0000 0000
027840 e0eb 037c 0000 0000 0001 0000 0000 0000
027850 1402 00b7 0000 0000 0001 0000 0703 800b
027860 0000 0000 0000 0000 0000 0000 0000 0000
027870 0000 0000 0000 0000 f39e 0016 0000 0000
027880 00ac 0000 0000 0000 75f3 0a52 000b 0000
027890 6469 e0ed 0474 0000 ee4d ed9b e993 0000
0278a0 4016 fa1d 5661 0020 0000 0000 0000 0000
0278b0 0000 0000 0000 0000 0000 0000 0000 0000

Breaking it down (od -x prints 16-bit little-endian words, so each field reads right to left):

* the first 8 bytes are the uberblock magic number (b10c 00ba 0000 0000)
* the second 8 bytes are the version number (0004 0000 0000 0000)
* the third 8 bytes are the transaction group, a.k.a. txg (f39e 0016 0000 0000)
* the fourth 8 bytes are the guid sum (2300 bbd0 50fd 8fd9)
* the fifth 8 bytes are the timestamp (385c 4943 0000 0000)

The remaining bytes are the "blkptr" structure and I'll ignore them.

Those values match the zdb output for the active uberblock exactly, so I know this is the on-disk location of the first copy of the active uberblock. Scanning further I find an exact duplicate 256kB later in the device.

067800 b10c 00ba 0000 0000 0004 0000 0000 0000
067810 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
067820 385c 4943 0000 0000 0001 0000 0000 0000
067830 1f6e 0297 0000 0000 0001 0000 0000 0000
067840 e0eb 037c 0000 0000 0001 0000 0000 0000
067850 1402 00b7 0000 0000 0001 0000 0703 800b
067860 0000 0000 0000 0000 0000 0000 0000 0000
067870 0000 0000 0000 0000 f39e 0016 0000 0000
067880 00ac 0000 0000 0000 75f3 0a52 000b 0000
067890 6469 e0ed 0474 0000 ee4d ed9b e993 0000
0678a0 4016 fa1d 5661 0020 0000 0000 0000 0000
0678b0 0000 0000 0000 0000 0000 0000 0000 0000

I know ZFS keeps four copies of the label: two at the front and two at the back, each 256kB in size.

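The field breakdown of the uberblock at 027800 can be double-checked mechanically. What follows is only a minimal sketch, assuming the pool was written on a little-endian host (so od -x shows byte-swapped 16-bit words); the helper names od_words_to_bytes and decode_uberblock_header are made up for illustration and are not part of any ZFS tool.

```python
import struct

# First 0x30 bytes of the uberblock at 0x027800, copied from the od -x dump.
OD_WORDS = (
    "b10c 00ba 0000 0000 0004 0000 0000 0000 "
    "f39e 0016 0000 0000 2300 bbd0 50fd 8fd9 "
    "385c 4943 0000 0000 0001 0000 0000 0000"
)


def od_words_to_bytes(words):
    """Undo `od -x` on a little-endian host: each 4-digit group is a 16-bit word."""
    return b"".join(struct.pack("<H", int(w, 16)) for w in words.split())


def decode_uberblock_header(raw):
    """Decode the five leading 64-bit little-endian fields of an uberblock."""
    magic, version, txg, guid_sum, timestamp = struct.unpack_from("<5Q", raw, 0)
    return {
        "magic": magic,
        "version": version,
        "txg": txg,
        "guid_sum": guid_sum,
        "timestamp": timestamp,
    }


header = decode_uberblock_header(od_words_to_bytes(OD_WORDS))
for name, value in header.items():
    print(f"{name:10s} = {value} (0x{value:x})")
```

The printed txg, guid_sum and timestamp should be the same numbers zdb reported (1504158, 10365405068077835008 and 1229142108).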
r...@opensolaris:~# ls -l /mnt/zpool.zones
-rw-r--r-- 1 root root 42949672960 Dec 15 04:49 /mnt/zpool.zones

That's 0xA00000000 = 42949672960 bytes = 41943040kB. If I subtract 512kB I should see the third and fourth labels.

r...@opensolaris:~# dd if=/mnt/zpool.zones bs=1k skip=41942528 | od -A x -x | grep "385c 4943 0000 0000"
027820 385c 4943 0000 0000 0001 0000 0000 0000
512+0 records in
512+0 records out
524288 bytes (524 kB) copied, 0.0577013 s, 9.1 MB/s
r...@opensolaris:~#

Oddly enough I see the third uberblock at 0x27800 but the fourth uberblock at 0x67800 is missing. Perhaps corrupted? No matter. I now work out the exact offsets to the three valid uberblocks and confirm I'm looking at the right uberblocks.

r...@opensolaris:~# dd if=/mnt/zpool.zones bs=1k skip=158 | od -A x -x | head -3
000000 b10c 00ba 0000 0000 0004 0000 0000 0000
000010 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
000020 385c 4943 0000 0000 0001 0000 0000 0000

r...@opensolaris:~# dd if=/mnt/zpool.zones bs=1k skip=414 | od -A x -x | head -3
000000 b10c 00ba 0000 0000 0004 0000 0000 0000
000010 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
000020 385c 4943 0000 0000 0001 0000 0000 0000

r...@opensolaris:~# dd if=/mnt/zpool.zones bs=1k skip=41942686 | od -A x -x | head -3
000000 b10c 00ba 0000 0000 0004 0000 0000 0000
000010 f39e 0016 0000 0000 2300 bbd0 50fd 8fd9
000020 385c 4943 0000 0000 0001 0000 0000 0000

They all have the same timestamp. I'm looking at the correct uberblocks. Now I intentionally harm them.

r...@opensolaris:/mnt# dd if=/dev/zero of=/mnt/zpool.zones bs=1k seek=158 count=1 conv=notrunc
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.000315229 s, 3.2 MB/s

r...@opensolaris:/mnt# dd if=/dev/zero of=/mnt/zpool.zones bs=1k seek=414 count=1 conv=notrunc
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 3.5e-08 s, 29.3 GB/s

r...@opensolaris:/mnt# dd if=/dev/zero of=/mnt/zpool.zones bs=1k seek=41942686 count=1 conv=notrunc
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.00192728 s, 531 kB/s

And... fingers crossed...

r...@opensolaris:/mnt# zpool import -d /mnt -f zones
r...@opensolaris:/mnt#

Huzzah, the import worked.

r...@opensolaris:/mnt# zpool status
  pool: zones
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
        pool will no longer be accessible on older software versions.
 scrub: none requested
config:

        NAME                STATE     READ WRITE CKSUM
        zones               ONLINE       0     0     0
          /mnt/zpool.zones  ONLINE       0     0     0

errors: No known data errors

And my filesystems are back.

r...@opensolaris:/mnt# zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
zones            23.7G  15.5G    27K  /zones
zones/appserver  1.69G  15.5G  5.55G  /zones/appserver
zones/base        847M  15.5G  4.20G  /zones/base
zones/centos     1.35G  15.5G  1.34G  /zones/centos
zones/cgiserver  2.43G  15.5G  6.24G  /zones/cgiserver
zones/ds1        5.47G  15.5G  3.91G  /zones/ds1
zones/ds2         616M  15.5G  3.88G  /zones/ds2
zones/webserver  11.3G  15.5G  15.1G  /zones/webserver

Initial inspection of the filesystems is promising. I can read from files, there are no panics, and everything seems to be intact.

I hope this helps other people recover corrupted zpools, until such time as there are tools to automate this process.

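As a starting point for such a tool, here is a rough, read-only sketch of the locating step only. It assumes the layout described in this post (four 256kB labels, uberblock array 128kB into each label, one slot per 1kB), a little-endian pool, and a device size that is a multiple of 256kB. It does not verify uberblock checksums and it never writes to the device; the function names label_offsets, scan_uberblocks and newest_uberblocks are hypothetical.

```python
import struct

LABEL_SIZE = 256 * 1024       # each label is 256kB (from the post)
UB_ARRAY_OFFSET = 128 * 1024  # uberblock array begins 128kB into a label
UB_SLOT_SIZE = 1024           # one uberblock slot per 1kB
UB_MAGIC = 0x00BAB10C
# magic, version, txg, guid_sum, timestamp as little-endian 64-bit values
HEADER = struct.Struct("<5Q")


def label_offsets(dev_size):
    """Byte offsets of the four labels: two at the front, two at the back.

    Assumes dev_size is a multiple of 256kB, as it is for the file in the post.
    """
    return [0, LABEL_SIZE, dev_size - 2 * LABEL_SIZE, dev_size - LABEL_SIZE]


def scan_uberblocks(path):
    """Yield (label_index, byte_offset, txg, timestamp) for each slot with a valid magic."""
    slots = (LABEL_SIZE - UB_ARRAY_OFFSET) // UB_SLOT_SIZE
    with open(path, "rb") as dev:  # opened read-only: nothing is modified
        dev.seek(0, 2)
        dev_size = dev.tell()
        for label, base in enumerate(label_offsets(dev_size)):
            for slot in range(slots):
                offset = base + UB_ARRAY_OFFSET + slot * UB_SLOT_SIZE
                dev.seek(offset)
                raw = dev.read(HEADER.size)
                if len(raw) < HEADER.size:
                    continue
                magic, _version, txg, _guid_sum, timestamp = HEADER.unpack(raw)
                if magic == UB_MAGIC:
                    yield label, offset, txg, timestamp


def newest_uberblocks(path):
    """Return (label_index, byte_offset, dd_skip_in_kB) for the highest-txg copies."""
    found = list(scan_uberblocks(path))
    if not found:
        return []
    top_txg = max(txg for _, _, txg, _ in found)
    return [(label, off, off // 1024) for label, off, txg, _ in found if txg == top_txg]


# Offset arithmetic only, no file needed: where the slot at 0x27800 lands in
# each label of the 42949672960-byte file from the post.
for base in label_offsets(42949672960):
    print("dd skip =", (base + 0x27800) // 1024)
```

The loop at the end prints 158, 414, 41942686 and 41942942; the first three are the skip/seek values used with dd above, and the last is where the missing fourth copy would have been. Calling newest_uberblocks("/mnt/zpool.zones") on a copy of the image would list the candidates to inspect by hand before deciding whether to zero anything.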
--
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss