Hello,

I've managed to corrupt one of the drbd9 resources on one of my production servers, and now I'm trying to figure out exactly what happened so I can try to recover the data. I wonder if anyone might understand what went wrong here (apart from the fact that PEBCAK :-) )?

This is a simple two node resource with an ext4 fs directly on top and an LVM logical volume as the backing device, nothing special.
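(For completeness, the resource config is along these lines; the hostnames, addresses, ports and node-ids shown here are placeholders rather than my real values:

resource res0 {
    device    /dev/drbd12;
    disk      /dev/vg02/res0;
    meta-disk internal;
    on node1 {
        node-id 0;
        address 10.0.0.1:7789;
    }
    on node2 {
        node-id 1;
        address 10.0.0.2:7789;
    }
}
)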

Gentoo, drbd kernel 9.0.19-1, drbd utilities 9.10.0, vanilla kernel.org 4.19.60

Here's the sequence of events:

1. The resource is online in its normal state, connected, dstate UpToDate on both nodes. It is Primary/Secondary and I'm working on the Primary node. I unmount the fs on top of the resource.

2. I run: e2fsck -f /dev/drbd12
which completes without any problems.

3. I run: drbdadm secondary res0
on the Primary.
The roles are now Secondary/Secondary.

4. I run: drbdadm down res0
on both nodes. Now the resource is fully down on both nodes.

5. I destroy the backing device on the node which is normally Secondary (wish I hadn't now, but it was a calculated risk that was part of my original plan - heh). Now only one node still has a backing device.

6. I run: drbdadm up res0
on the remaining node. The resource comes up fine: it is now Secondary, UpToDate but disconnected. The fs on top is NOT mounted.

(Oops, later I realise I forgot to make the resource Primary, so it stays Secondary. I'm not sure whether this is a factor at all in what follows.)

7. I extend the resource's backing device:

lvextend -l +11149 /dev/vg02/res0
  Size of logical volume vg02/res0 changed from 120.00 GiB (30720 extents) to 163.55 GiB (41869 extents).
  Logical volume vg02/res0 successfully resized.

Oops, I made a mistake in my lvextend: I wanted to extend using extents from a specific PV, but I forgot to add the PV and extent range on the end of the lvextend command. No problem, I thought: I know the PV segments and their extent ranges from before the extend, because I ran lvdisplay -m before I did anything. So I can simply delete and re-create the LV using exactly the same segments and extent ranges to restore it to how it was (this is much easier than trying to remove specific PV segments from an LV, which lvreduce/lvresize cannot do AFAICT, and an approach I've used successfully many times before). However, I can't run lvremove on the backing device while drbd still holds it open, so ...
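(For the record, the lvextend form I had intended was something like the line below, with the PV and extent range appended; the range here is just a placeholder, not my real values:

lvextend -l +11149 /dev/vg02/res0 /dev/sdf1:<first_extent>-<last_extent>

Because I left the /dev/sdf1:... part off, LVM picked the extents itself.)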

8. I run: drbdadm down res0
which completes successfully.

Later, looking back at the kernel log, I see that when I downed the resource in this step everything logged is normal and as you would expect, apart from this one line:

drbd res0/0 drbd12: ASSERTION drbd_md_ss(device->ldev) == device->ldev->md.md_offset FAILED in drbd_md_write

A very obvious sign that something is very wrong at this point, but of course I don't see it until later, while doing my post mortem :-)

Apart from that one line, everything else logged when downing the resource is normal. I've pasted the full log at the bottom of this mail.

9. With the resource down, I am now able to delete and re-create the backing device. These are my commands:

lvremove /dev/vg02/res0
Do you really want to remove active logical volume vg02/res0? [y/n]: y
  Logical volume "res0" successfully removed

lvcreate -l 30720 -n res0 vg02 /dev/sdf1:36296-59597 /dev/sdf1:7797-12998 /dev/sdf1:27207-29422
WARNING: ext4 signature detected on /dev/vg02/res0 at offset 1080. Wipe it? [y/n]: n
  Aborted wiping of ext4.
  1 existing signature left on the device.
  Logical volume "res0" created.

I run lvdisplay -m to confirm that the list of segments and extent ranges is exactly as it was before I extended the LV in step 7, and it is.

Wonderful, I continue under my delusion that this is all going fine so far :-)

10. I run: drbdadm up res0
which completes successfully.

11. I run: drbdadm primary res0
which also completes successfully. All looks fine to me at this point. I've pasted the full kernel log from these two commands at the end.

12. I decide to run a forced e2fsck on the resource to double-check all is OK. This is the first time I realise something has gone badly wrong:

e2fsck -f /dev/drbd12
e2fsck 1.45.3 (14-Jul-2019)
ext2fs_open2: Bad magic number in super-block
e2fsck: Superblock invalid, trying backup blocks...
Pass 1: Checking inodes, blocks, and sizes
Inode 262145 seems to contain garbage.  Clear<y>?
/dev/drbd12: e2fsck canceled.

/dev/drbd12: ***** FILE SYSTEM WAS MODIFIED *****

I hit Ctrl-C as soon as it asked me to clear that inode, but too late: the fs has already been modified in some way.

I run dumpe2fs, which dumps a full set of info on the fs, all of which looks very normal apart from this line:

Filesystem state:         not clean

13. I run e2fsck again, this time with -n (which I should have done the first time round, silly me). It starts out:

e2fsck -f -n /dev/drbd12
e2fsck 1.45.3 (14-Jul-2019)
Pass 1: Checking inodes, blocks, and sizes
Inode 262145 seems to contain garbage.  Clear? no

Inode 262146 passes checks, but checksum does not match inode.  Fix? no
<...the same line repeats for hundreds of inodes...>


So, obviously, I made a big mistake: I extended the backing device (while it was Secondary, though I'm not sure if that is a factor) and then immediately brought the resource down without following through and running "drbdadm resize". The drbd ASSERTION shows that.
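In other words, if I understand the resize procedure correctly, the sequence I should have followed for an online grow is roughly this (possibly with --assume-peer-has-space, given the peer's backing device is gone; this is just my sketch of it, not something I've tested on a single-node resource):

lvextend -l +11149 /dev/vg02/res0
drbdadm resize res0

so that drbd gets the chance to move its internal metadata to the new end of the device and update its own idea of the size, instead of being left with metadata that no longer sits where it expects.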

However, at the time I wasn't expecting any problems, because the resource was not mounted. In my mind, as I had not made any changes to any filesystem data, I thought it would be fine.

Obviously, what happened is that the backing device *size* changed while the resource was live. This should be fine in itself, as it is what normally happens in a drbd online resize.

However, when I then brought the resource down, it looks like drbd probably tried to write its metadata at the end of the backing device, and this failed because of the size change?
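If it helps with the diagnosis, I assume I could take the resource down again and look at what drbd currently thinks its metadata looks like with something like:

drbdadm down res0
drbdadm dump-md res0

though I haven't done that yet, and I'm not sure how dump-md behaves if the computed and stored metadata offsets disagree.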

Ultimately, all my fault of course for not following through with the online resize.

However, now I want to try to recover my fs, and I just want to understand exactly what happened to the metadata when the resource was brought down and the ASSERTION fired.

I want to understand that before I run e2fsck without -n, which I've seen is going to make drastic changes to the fs; that may recover it, but it may also make things worse at this point.

Perhaps I should just wipe-md and try to access the fs directly on the backing device.
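For example (this is just my current thinking, not something I've tried yet), with the resource down so nothing holds the LV open, I could presumably do a read-only check directly against the backing device first:

drbdadm down res0
dumpe2fs -h /dev/vg02/res0
e2fsck -f -n /dev/vg02/res0

since with internal metadata the filesystem starts at offset 0 of the backing device and only the area at the very end belongs to drbd. If that looks sane, wipe-md / create-md (or simply mounting the LV directly) might be the way forward.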

By the way, I do take a daily backup of the data on almost all my other drbd resources. Unfortunately not this one, the one I happened to mess up ... :-)

Thanks in advance,
Eddie


==== Full kernel log from Step 8 =====

Jul 28 18:18:52 node1 kernel: [384749.421172] drbd res0 tv: conn( Connecting -> Disconnecting )
Jul 28 18:18:52 node1 kernel: [384749.421217] drbd res0 tv: Restarting sender thread
Jul 28 18:18:52 node1 kernel: [384749.421230] drbd res0/0 drbd12: ASSERTION drbd_md_ss(device->ldev) == device->ldev->md.md_offset FAILED in drbd_md_write
Jul 28 18:18:52 node1 kernel: [384749.421309] drbd res0 tv: Connection closed
Jul 28 18:18:52 node1 kernel: [384749.421318] drbd res0 tv: conn( Disconnecting -> StandAlone )
Jul 28 18:18:52 node1 kernel: [384749.421326] drbd res0 tv: Terminating receiver thread
Jul 28 18:18:52 node1 kernel: [384749.421341] drbd res0 tv: Terminating sender thread
Jul 28 18:18:52 node1 kernel: [384749.421435] drbd res0/0 drbd12: disk( UpToDate -> Detaching )
Jul 28 18:18:52 node1 kernel: [384749.421497] drbd res0/0 drbd12: disk( Detaching -> Diskless )
Jul 28 18:18:52 node1 kernel: [384749.421908] drbd res0/0 drbd12: drbd_bm_resize called with capacity == 0
Jul 28 18:18:52 node1 kernel: [384749.454400] drbd res0: Terminating worker thread

==== Full kernel log from Steps 10 and 11 =====

Jul 28 18:24:51 node1 kernel: [385108.607668] drbd res0: Starting worker thread (from drbdsetup [26084])
Jul 28 18:24:51 node1 kernel: [385108.610730] drbd res0 tv: Starting sender thread (from drbdsetup [26088])
Jul 28 18:24:51 node1 kernel: [385108.790058] drbd res0/0 drbd12: meta-data IO uses: blk-bio
Jul 28 18:24:51 node1 kernel: [385108.790213] drbd res0/0 drbd12: disk( Diskless -> Attaching )
Jul 28 18:24:51 node1 kernel: [385108.790220] drbd res0/0 drbd12: Maximum number of peer devices = 1
Jul 28 18:24:51 node1 kernel: [385108.790623] drbd res0: Method to ensure write ordering: drain
Jul 28 18:24:51 node1 kernel: [385108.790633] drbd res0/0 drbd12: drbd_bm_resize called with capacity == 251650488
Jul 28 18:24:51 node1 kernel: [385108.793611] drbd res0/0 drbd12: resync bitmap: bits=31456311 words=491505 pages=960
Jul 28 18:24:51 node1 kernel: [385108.793616] drbd res0/0 drbd12: size = 120 GB (125825244 KB)
Jul 28 18:24:51 node1 kernel: [385108.793617] drbd res0/0 drbd12: size = 120 GB (125825244 KB)
Jul 28 18:24:51 node1 kernel: [385108.830061] drbd res0/0 drbd12: recounting of set bits took additional 0ms
Jul 28 18:24:51 node1 kernel: [385108.830072] drbd res0/0 drbd12: disk( Attaching -> UpToDate )
Jul 28 18:24:51 node1 kernel: [385108.830073] drbd res0/0 drbd12: attached to current UUID: 51A43103091A9E70
Jul 28 18:24:51 node1 kernel: [385108.831297] drbd res0 tv: conn( StandAlone -> Unconnected )
Jul 28 18:24:51 node1 kernel: [385108.831394] drbd res0 tv: Starting receiver thread (from drbd_w_res0 [26085])
Jul 28 18:24:51 node1 kernel: [385108.831455] drbd res0 tv: conn( Unconnected -> Connecting )
Jul 28 18:25:54 node1 kernel: [385172.119565] drbd res0: Preparing cluster-wide state change 840522528 (0->-1 3/1)
Jul 28 18:25:54 node1 kernel: [385172.119569] drbd res0: Committing cluster-wide state change 840522528 (0ms)
Jul 28 18:25:54 node1 kernel: [385172.119572] drbd res0: role( Secondary -> Primary )
Jul 28 18:25:59 node1 kernel: [385176.861534] drbd res0/0 drbd12: new current UUID: 8BE37DA3CE798447 weak: FFFFFFFFFFFFFFFE


