Hello Matthew,
Are you sure LVM only writes to the DRBD device and never to the
backing disk? We have had this issue in the past. It was caused by
LVM scanning all devices for PVs, VGs and LVs and sometimes picking
the wrong device. You can fix this by changing the filter in
lvm.conf. If you change it, don't forget to remove the LVM cache
file first and then rescan everything.
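As a rough sketch of what I mean (not your exact setup): the fragment
below whitelists the DRBD devices plus the physical disk that holds the
host VG, and rejects everything else, so the backing LVs under
/dev/guests are never scanned for PV signatures. Note that /dev/sda is
purely an assumption for illustration; substitute whatever device
actually holds your "guests" VG.

```
# /etc/lvm/lvm.conf (fragment)
devices {
    # Accept DRBD devices and the host's physical PV, reject the rest.
    # ASSUMPTION: the host VG lives on /dev/sda*; adjust to your hardware.
    filter = [ "a|^/dev/drbd.*|", "a|^/dev/sda.*|", "r|.*|" ]
}
```

The cache file is usually /etc/lvm/cache/.cache, but that depends on
the cache_dir setting in lvm.conf, so check there first. After removing
it, vgscan rescans the devices, and on the Primary node pvs should then
report /dev/drbd7 rather than the backing LV.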
It might be another issue, of course, but hopefully this helps you
work towards a solution :-)
Kind regards,
Dick Tump | Shock Media B.V.
Tel: +31 (0)546 - 714360
Fax: +31 (0)546 - 714361
Web: http://www.shockmedia.nl/
Connect to me @ LinkedIn:
http://nl.linkedin.com/in/dicktump
On 19-08-15 17:57, Matthew Vernon wrote:
Hi,
I'm encountering a problem when using LVM & DRBD, the effect of which
is that sometimes one of my nodes cannot see LVM metadata inside a
DRBD device when it becomes Primary[1]. It seems that sometimes the
early part of my DRBD device is not being correctly replicated (the
LVM header is in the first MB). It seems likely that this is a bug in
DRBD, but obviously it's difficult to be entirely sure. I'd be happy
to try running different diagnostics if you think it'd help track down
the problem.
My machines are Intel boxes running Debian Jessie (kernel 3.16.0),
with the DRBD 8.4.6 kernel module, and drbd-utils version 8.9.2~rc1-2.
I have a script that can reproduce this problem; essentially[0] it does:
#on both hosts
drbdadm -- --force create-md mcv21
drbdadm up mcv21
#on host A only
drbdadm wait-connect mcv21
drbdadm new-current-uuid --clear-bitmap minor-7
drbdadm primary mcv21
pvcreate /dev/drbd7
drbdadm secondary mcv21
#on host B only
drbdadm primary mcv21
file -s /dev/drbd7 | grep 'LVM2 PV'
#on both hosts
drbdadm down mcv21
dd if=/dev/zero of=/dev/guests/mwsig-mcv21 bs=1M count=1
The "file" invocation is my diagnostic (it can detect the LVM PV
metadata), and the dd is to make sure the LVM metadata is blanked
before the next iteration.
The failure rate is about 1 in 1000 (an overnight run gave me 12
failures to 12,988 passes).
The resource config (minus IP addresses):
resource mcv21 {
    device /dev/drbd7;
    disk /dev/guests/mwsig-mcv21;
    meta-disk internal;
    on ophon {
        address [IPv4 here]:7795;
    }
    on opus {
        address [IPv4 here]:7795;
    }
    net {
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
}
The only interesting bits of global_common.conf are "protocol C;" and
"allow-two-primaries;".
Regards,
Matthew
[0] The actual script is a bit fiddlier, as it has to deal with
systemd-udevd sometimes holding /dev/drbd7 open a bit longer than it
should, and has a pile of other related tests commented out in it. I
can send it if you think it would help.
[1] My use case is building Xen guests: each guest has a host LV
allocated, which is the "disk" for a DRBD device; I make that an LVM
PV, which is what becomes the guest's storage.
_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user