> On Jan 25, 2016, at 12:50 PM, Youzhong Yang <[email protected]> wrote:
>
> Hi all,
>
> Just wondering if anyone has done similar recovery using txg stuff.
Yes. I've seen it done successfully only once in my entire life, in a
special case where one node had no significant writes during the dual
import period.
>
> We have a zpool attached to two hosts physically, ideally at any time only
> one host imports this zpool. Due to some operational mistake this zpool was
> corrupted when the two hosts tried to have access to it. Here is the crash
> stack:
>
> Jan 25 10:07:17 batfs0346 genunix: [ID 403854 kern.notice] assertion failed:
> 0 == dmu_bonus_hold(spa->spa_meta_objset, obj, FTAG, &db), file:
> ../../common/fs/zfs/spa.c, line: 1549
> Jan 25 10:07:17 batfs0346 unix: [ID 100000 kern.notice]
> Jan 25 10:07:17 batfs0346 genunix: [ID 802836 kern.notice] ffffff017495c920
> fffffffffba6b1f8 ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495c9a0
> zfs:load_nvlist+e8 ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495ca90
> zfs:spa_load_impl+10bb ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cb30
> zfs:spa_load+14e ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cb80
> zfs:spa_tryimport+aa ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cbd0
> zfs:zfs_ioc_pool_tryimport+51 ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cc80
> zfs:zfsdev_ioctl+4a7 ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495ccc0
> genunix:cdev_ioctl+39 ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cd10
> specfs:spec_ioctl+60 ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cda0
> genunix:fop_ioctl+55 ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cec0
> genunix:ioctl+9b ()
> Jan 25 10:07:17 batfs0346 genunix: [ID 655072 kern.notice] ffffff017495cf10
> unix:brand_sys_sysenter+1c9 ()
>
> Is it possible to roll back the zpool to its last known good txg? We know
> when the zpool should be in good state.
>
> Any suggestion would be very much appreciated. We can build a kernel if
> needed.
Some tips:
+ make snapshots or dd-like copies of the raw drives, if feasible
+ import readonly to prevent further damage or unexpected repairs
+ zdb analyzes pools without changing the on-disk data
+ zdb -F attempts a conservative rewind to previous uberblocks
+ zdb -X attempts an extreme rewind to previous uberblocks automatically
+ zdb -lu shows uberblocks and their txgs
+ zdb -t lets you check on-disk data structures at a specific txg (taken
  from zdb -lu)
+ once you find a txg that seems to work with zdb, try a readonly zpool
  import using -F, -X, or -T (don't forget readonly)
+ if the readonly import works, recover the data first, and only later
  try a readwrite import
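
The tips above could be strung together roughly as follows. This is a
minimal sketch, not a tested recovery procedure: the pool name, device
path, image path and txg values are hypothetical placeholders, the -e
and -d options to zdb are my assumption for an exported pool, and -T on
zpool import is not documented in every release. It defaults to a dry
run that only prints the commands.

```shell
#!/bin/sh
# Sketch only. POOL, DISK, IMG and any txg value are hypothetical placeholders.
POOL="${POOL:-tank}"
DISK="${DISK:-/dev/rdsk/c0t0d0s0}"
IMG="${IMG:-/backup/c0t0d0s0.img}"
DRY_RUN="${DRY_RUN:-1}"     # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Raw copy of one device before touching anything else.
backup_disk()      { run dd if="$DISK" of="$IMG" bs=1024k conv=noerror,sync; }

# Show the labels and uberblocks (with their txgs) on one device.
list_uberblocks()  { run zdb -lu "$DISK"; }

# Walk the datasets of the exported pool as of txg $1 (assumed zdb options).
check_txg()        { run zdb -e -d -t "$1" "$POOL"; }

# Readonly import using the conservative rewind.
import_ro_rewind() { run zpool import -o readonly=on -F "$POOL"; }

# Readonly import at an explicit txg $1 (-T may be undocumented on your build).
import_ro_at_txg() { run zpool import -o readonly=on -T "$1" "$POOL"; }
```

With DRY_RUN=1, calling list_uberblocks and then check_txg with a txg
taken from the zdb -lu output prints the commands for review. Set
DRY_RUN=0 only after the raw copies exist.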
Good luck
-- richard