> On Jan 25, 2016, at 12:50 PM, Youzhong Yang <[email protected]> wrote:
> 
> Hi all,
> 
> Just wondering if anyone has done similar recovery using txg stuff.

Yes. I've seen it successfully done only once in my entire life -- a special
case where one node had no significant writes during the dual import period.

> 
> We have a zpool attached to two hosts physically, ideally at any time only 
> one host imports this zpool. Due to some operational mistake this zpool was 
> corrupted when the two hosts tried to have access to it. Here is the crash 
> stack:
> 
> Jan 25 10:07:17 batfs0346 genunix: [ID 403854 kern.notice] assertion failed: 
> 0 == dmu_bonus_hold(spa->spa_meta_objset, obj, FTAG, &db), file: 
> ../../common/fs/zfs/spa.c, line: 1549
> Jan 25 10:07:17 batfs0346 unix: [ID 100000 kern.notice]
> Jan 25 10:07:17 batfs0346 genunix: [ID 802836 kern.notice] ffffff017495c920 
> fffffffffba6b1f8 ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495c9a0 
> zfs:load_nvlist+e8 ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495ca90 
> zfs:spa_load_impl+10bb ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cb30 
> zfs:spa_load+14e ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cb80 
> zfs:spa_tryimport+aa ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cbd0 
> zfs:zfs_ioc_pool_tryimport+51 ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cc80 
> zfs:zfsdev_ioctl+4a7 ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495ccc0 
> genunix:cdev_ioctl+39 ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cd10 
> specfs:spec_ioctl+60 ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cda0 
> genunix:fop_ioctl+55 ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cec0 
> genunix:ioctl+9b ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cf10 
> unix:brand_sys_sysenter+1c9 ()
> 
> Is it possible to roll back the zpool to its last known good txg? We know 
> when the zpool should be in good state.
> 
> Any suggestion would be very much appreciated. We can build a kernel if 
> needed.

Some tips:
+ make snapshots or dd-like copies of the raw drives, if feasible
+ prevent future damage or unexpected repairs by importing readonly
+ zdb does analysis of pools without changing the on-disk data
+ zdb -F attempts a conservative rewind to previous uberblocks
+ zdb -X attempts an extreme rewind to previous uberblocks automatically
+ zdb -lu shows uberblocks and their txgs
+ zdb -t allows you to check on-disk data structures for specific txgs
  (from zdb -lu)
+ once you find a txg that seems to work with zdb, you can try a readonly
  zpool import using -F, -X, or -T (don't forget readonly)
+ if the readonly import works, then you can try to recover the data and
  later try a readwrite import
Good luck
 -- richard


