Hello,

I've managed to corrupt one of the drbd9 resources on one of my production servers, and now I'm trying to figure out exactly what happened so I can try to recover the data. I wonder if anyone might understand what went wrong here (apart from the fact that PEBCAK :-) )?

This is a simple two node resource with an ext4 fs directly on top and an LVM logical volume as the backing device, nothing special.
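(For completeness, the resource config is along these lines; the hostnames, addresses, ports and node-ids shown here are placeholders rather than my real values:

resource res0 {
    device    /dev/drbd12;
    disk      /dev/vg02/res0;
    meta-disk internal;
    on node1 {
        node-id 0;
        address 10.0.0.1:7789;
    }
    on node2 {
        node-id 1;
        address 10.0.0.2:7789;
    }
}
)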

Gentoo, drbd kernel 9.0.19-1, drbd utilities 9.10.0, vanilla kernel.org 4.19.60

Here's the sequence of events:

1. The resource is online in its normal state, connected, dstate UpToDate on both nodes. It is Primary/Secondary and I'm working on the Primary node. I unmount the fs on top of the resource.

2. I run: e2fsck -f /dev/drbd12
which completes without any problems.

3. I run: drbdadm secondary res0
on the Primary.
The roles are now Secondary/Secondary.

4. I run: drbdadm down res0
on both nodes. Now the resource is fully down on both nodes.

5. I destroy the backing device on the node which is normally Secondary (wish I hadn't now, but it was a calculated risk that was part of my original plan - heh). Now only one node still has a backing device.

6. I run: drbdadm up res0
on the remaining node. The resource comes up fine: it is now Secondary, UpToDate but disconnected. The fs on top is NOT mounted.

(Oops, later I realise I forgot to make the resource Primary, so it stays Secondary. I'm not sure whether this is a factor at all in what follows.)

7. I extend the resource's backing device:

lvextend -l +11149 /dev/vg02/res0
  Size of logical volume vg02/res0 changed from 120.00 GiB (30720 extents) to 163.55 GiB (41869 extents).
  Logical volume vg02/res0 successfully resized.

Oops, I made a mistake in my lvextend: I wanted to extend using extents from a specific PV, but I forgot to add the PV and extent range on the end of the lvextend command. No problem, I thought: I know the PV segments and their extent ranges from before the extend, because I ran lvdisplay -m before I did anything. So I can simply delete and re-create the LV using exactly the same segments and extent ranges to restore it to how it was (this is much easier than trying to remove specific PV segments from an LV, which lvreduce/lvresize cannot do AFAICT, and an approach I've used successfully many times before). However, I can't run lvremove on the backing device while drbd still holds it open, so ...
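(For the record, the lvextend form I had intended was something like the line below, with the PV and extent range appended; the range here is just a placeholder, not my real values:

lvextend -l +11149 /dev/vg02/res0 /dev/sdf1:<first_extent>-<last_extent>

Because I left the /dev/sdf1:... part off, LVM picked the extents itself.)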

8. I run: drbdadm down res0
which completes successfully.

Later, looking back at the kernel log, I see that when I downed the resource in this step everything logged is normal and as you would expect, apart from this one line:

drbd res0/0 drbd12: ASSERTION drbd_md_ss(device->ldev) == device->ldev->md.md_offset FAILED in drbd_md_write

A very obvious sign that something is very wrong at this point, but of course I don't see it until later, while doing my post mortem :-)

Apart from that one line, everything else logged when downing the resource is normal. I've pasted the full log at the bottom of this mail.

9. With the resource down, I am now able to delete and re-create the backing device. These are my commands:

lvremove /dev/vg02/res0
Do you really want to remove active logical volume vg02/res0? [y/n]: y
  Logical volume "res0" successfully removed

lvcreate -l 30720 -n res0 vg02 /dev/sdf1:36296-59597 /dev/sdf1:7797-12998 /dev/sdf1:27207-29422
WARNING: ext4 signature detected on /dev/vg02/res0 at offset 1080. Wipe it? [y/n]: n
  Aborted wiping of ext4.
  1 existing signature left on the device.
  Logical volume "res0" created.

I run lvdisplay -m to confirm that the list of segments and extent ranges is exactly as it was before I extended the LV in step 7, and it is.

Wonderful, I continue under my delusion that this is all going fine so far :-)

10. I run: drbdadm up res0
which completes successfully.

11. I run: drbdadm primary res0
which also completes successfully. All looks fine to me at this point. I've pasted the full kernel log from these two commands at the end.

12. I decide to run a forced e2fsck on the resource to double-check all is OK. This is the first time I realise something has gone badly wrong:

e2fsck -f /dev/drbd12
e2fsck 1.45.3 (14-Jul-2019)
ext2fs_open2: Bad magic number in super-block
e2fsck: Superblock invalid, trying backup blocks...
Pass 1: Checking inodes, blocks, and sizes
Inode 262145 seems to contain garbage.  Clear<y>?
/dev/drbd12: e2fsck canceled.

/dev/drbd12: ***** FILE SYSTEM WAS MODIFIED *****

I hit Ctrl-C as soon as it asked me to clear that inode, but too late: the fs has already been modified in some way.

I run dumpe2fs, which dumps a full set of info on the fs, all of which looks very normal apart from this line:

Filesystem state:         not clean

13. I run e2fsck again, this time with -n (which I should have done the first time round, silly me). It starts out:

e2fsck -f -n /dev/drbd12
e2fsck 1.45.3 (14-Jul-2019)
Pass 1: Checking inodes, blocks, and sizes
Inode 262145 seems to contain garbage.  Clear? no

Inode 262146 passes checks, but checksum does not match inode.  Fix? no
<...the same line repeats for hundreds of inodes...>


So, obviously, I made a big mistake: I extended the backing device (while it was Secondary, though I'm not sure if that is a factor) and then immediately brought the resource down without following through and running "drbdadm resize". The drbd ASSERTION shows that.
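In other words, if I understand the resize procedure correctly, the sequence I should have followed for an online grow is roughly this (possibly with --assume-peer-has-space, given the peer's backing device is gone; this is just my sketch of it, not something I've tested on a single-node resource):

lvextend -l +11149 /dev/vg02/res0
drbdadm resize res0

so that drbd gets the chance to move its internal metadata to the new end of the device and update its own idea of the size, instead of being left with metadata that no longer sits where it expects.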

However, at the time I wasn't expecting any problems, because the resource was not mounted. In my mind, as I had not made any changes to any filesystem data, I thought it would be fine.

Obviously, what happened is that the backing device *size* changed while the resource was live. This should be fine in itself, as it is what normally happens in a drbd online resize.

However, when I then brought the resource down, it looks like drbd probably tried to write its metadata at the end of the backing device, and this failed because of the size change?
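If it helps with the diagnosis, I assume I could take the resource down again and look at what drbd currently thinks its metadata looks like with something like:

drbdadm down res0
drbdadm dump-md res0

though I haven't done that yet, and I'm not sure how dump-md behaves if the computed and stored metadata offsets disagree.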

Ultimately, all my fault of course for not following through with the online resize.

However, now I want to try to recover my fs, and I just want to understand exactly what happened to the metadata when the resource was brought down and the ASSERTION fired.

I want to understand that before I run e2fsck without -n, which I've seen is going to make drastic changes to the fs; that may recover it, but it may also make things worse at this point.

Perhaps I should just wipe-md and try to access the fs directly on the backing device.
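For example (this is just my current thinking, not something I've tried yet), with the resource down so nothing holds the LV open, I could presumably do a read-only check directly against the backing device first:

drbdadm down res0
dumpe2fs -h /dev/vg02/res0
e2fsck -f -n /dev/vg02/res0

since with internal metadata the filesystem starts at offset 0 of the backing device and only the area at the very end belongs to drbd. If that looks sane, wipe-md / create-md (or simply mounting the LV directly) might be the way forward.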

By the way, I do take a daily backup of the data on almost all my other drbd resources. Unfortunately not this one, the one I happened to mess up ... :-)

Thanks in advance,
Eddie


==== Full kernel log from Step 8 =====

Jul 28 18:18:52 node1 kernel: [384749.421172] drbd res0 tv: conn( Connecting -> Disconnecting )
Jul 28 18:18:52 node1 kernel: [384749.421217] drbd res0 tv: Restarting sender thread
Jul 28 18:18:52 node1 kernel: [384749.421230] drbd res0/0 drbd12: ASSERTION drbd_md_ss(device->ldev) == device->ldev->md.md_offset FAILED in drbd_md_write
Jul 28 18:18:52 node1 kernel: [384749.421309] drbd res0 tv: Connection closed
Jul 28 18:18:52 node1 kernel: [384749.421318] drbd res0 tv: conn( Disconnecting -> StandAlone )
Jul 28 18:18:52 node1 kernel: [384749.421326] drbd res0 tv: Terminating receiver thread
Jul 28 18:18:52 node1 kernel: [384749.421341] drbd res0 tv: Terminating sender thread
Jul 28 18:18:52 node1 kernel: [384749.421435] drbd res0/0 drbd12: disk( UpToDate -> Detaching )
Jul 28 18:18:52 node1 kernel: [384749.421497] drbd res0/0 drbd12: disk( Detaching -> Diskless )
Jul 28 18:18:52 node1 kernel: [384749.421908] drbd res0/0 drbd12: drbd_bm_resize called with capacity == 0
Jul 28 18:18:52 node1 kernel: [384749.454400] drbd res0: Terminating worker thread

==== Full kernel log from Steps 10 and 11 =====

Jul 28 18:24:51 node1 kernel: [385108.607668] drbd res0: Starting worker thread (from drbdsetup [26084])
Jul 28 18:24:51 node1 kernel: [385108.610730] drbd res0 tv: Starting sender thread (from drbdsetup [26088])
Jul 28 18:24:51 node1 kernel: [385108.790058] drbd res0/0 drbd12: meta-data IO uses: blk-bio
Jul 28 18:24:51 node1 kernel: [385108.790213] drbd res0/0 drbd12: disk( Diskless -> Attaching )
Jul 28 18:24:51 node1 kernel: [385108.790220] drbd res0/0 drbd12: Maximum number of peer devices = 1
Jul 28 18:24:51 node1 kernel: [385108.790623] drbd res0: Method to ensure write ordering: drain
Jul 28 18:24:51 node1 kernel: [385108.790633] drbd res0/0 drbd12: drbd_bm_resize called with capacity == 251650488
Jul 28 18:24:51 node1 kernel: [385108.793611] drbd res0/0 drbd12: resync bitmap: bits=31456311 words=491505 pages=960
Jul 28 18:24:51 node1 kernel: [385108.793616] drbd res0/0 drbd12: size = 120 GB (125825244 KB)
Jul 28 18:24:51 node1 kernel: [385108.793617] drbd res0/0 drbd12: size = 120 GB (125825244 KB)
Jul 28 18:24:51 node1 kernel: [385108.830061] drbd res0/0 drbd12: recounting of set bits took additional 0ms
Jul 28 18:24:51 node1 kernel: [385108.830072] drbd res0/0 drbd12: disk( Attaching -> UpToDate )
Jul 28 18:24:51 node1 kernel: [385108.830073] drbd res0/0 drbd12: attached to current UUID: 51A43103091A9E70
Jul 28 18:24:51 node1 kernel: [385108.831297] drbd res0 tv: conn( StandAlone -> Unconnected )
Jul 28 18:24:51 node1 kernel: [385108.831394] drbd res0 tv: Starting receiver thread (from drbd_w_res0 [26085])
Jul 28 18:24:51 node1 kernel: [385108.831455] drbd res0 tv: conn( Unconnected -> Connecting )
Jul 28 18:25:54 node1 kernel: [385172.119565] drbd res0: Preparing cluster-wide state change 840522528 (0->-1 3/1)
Jul 28 18:25:54 node1 kernel: [385172.119569] drbd res0: Committing cluster-wide state change 840522528 (0ms)
Jul 28 18:25:54 node1 kernel: [385172.119572] drbd res0: role( Secondary -> Primary )
Jul 28 18:25:59 node1 kernel: [385176.861534] drbd res0/0 drbd12: new current UUID: 8BE37DA3CE798447 weak: FFFFFFFFFFFFFFFE


