Hi,

Is there some new magic I'm unaware of that needs to be added to a pacemaker cluster using a nested DRBD setup? This is pacemaker 2.0.x and DRBD 8.4.10 on Debian/Buster, on a 2-node cluster with stonith. Eventually this will host a bunch of Xen VMs.
I had this sort of thing running for years with pacemaker 1.x and DRBD 8.4.x without a hitch, and now with pacemaker 2.0 and DRBD 8.4.10 it gives me errors when trying to start the volume group vg0 in this chain:

 (VG)          (LV)          (PV)        (VG)
vmspace ----> xen_lv0 ----> drbd0 ----> vg0

Only drbd0 and everything after it are managed by pacemaker.

Here's what I have configured so far (stonith is configured but is not shown below):

---
primitive p_lvm_vg0 ocf:heartbeat:LVM \
        params volgrpname=vg0 \
        op monitor timeout=30s interval=10s \
        op_params interval=10s

primitive resDRBDr0 ocf:linbit:drbd \
        params drbd_resource=r0 \
        op start interval=0 timeout=240s \
        op stop interval=0 timeout=100s \
        op monitor interval=29s role=Master timeout=240s \
        op monitor interval=31s role=Slave timeout=240s \
        meta migration-threshold=3 failure-timeout=120s

ms ms_drbd_r0 resDRBDr0 \
        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true

colocation c_lvm_vg0_on_drbd_r0 inf: p_lvm_vg0 ms_drbd_r0:Master

order o_drbd_r0_before_lvm_vg0 Mandatory: ms_drbd_r0:promote p_lvm_vg0:start
---

/etc/lvm/lvm.conf has global_filter set to:

global_filter = [ "a|/dev/drbd.*|", "a|/dev/md.*|", "a|/dev/md/.*|", "r|.*|" ]

But I'm not sure if it's sufficient. I seem to be missing some crucial ingredient.
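FWIW, this is roughly how I've been sanity-checking the stack by hand on the current DRBD Primary, outside of pacemaker (from memory, so treat it as a sketch rather than exact output):

---
# as root on the node where r0 is (or should be) Primary
drbdadm role r0              # expect Primary/Secondary here
pvs                          # /dev/drbd0 should show up as the PV backing vg0
vgchange -ay vg0             # manual activation, roughly what the LVM RA does
lvs -o lv_name,lv_attr vg0   # check whether any LVs actually came up active
---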
syslog on the DC shows the following when trying to start vg0:

---
Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: INFO: Activating volume group vg0
Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: INFO: Reading all physical volumes. This may take a while...
  Found volume group "vmspace" using metadata type lvm2
  Found volume group "freespace" using metadata type lvm2
  Found volume group "vg0" using metadata type lvm2
Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: INFO: 0 logical volume(s) in volume group "vg0" now active
Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: ERROR: LVM Volume vg0 is not available (stopped)
Oct 28 14:42:56 node2 LVM(p_lvm_vg0)[8775]: ERROR: LVM: vg0 did not activate correctly
Oct 28 14:42:56 node2 pacemaker-execd[27054]: notice: p_lvm_vg0_start_0:8775:stderr [ Configuration node global/use_lvmetad not found ]
Oct 28 14:42:56 node2 pacemaker-execd[27054]: notice: p_lvm_vg0_start_0:8775:stderr [ ocf-exit-reason:LVM: vg0 did not activate correctly ]
Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: Result of start operation for p_lvm_vg0 on node2: 7 (not running)
Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: node2-p_lvm_vg0_start_0:77 [ Configuration node global/use_lvmetad not found\nocf-exit-reason:LVM: vg0 did not activate correctly\n ]
Oct 28 14:42:56 node2 pacemaker-controld[27057]: warning: Action 42 (p_lvm_vg0_start_0) on node2 failed (target: 0 vs. rc: 7): Error
Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: Transition 602 aborted by operation p_lvm_vg0_start_0 'modify' on node2: Event failed
Oct 28 14:42:56 node2 pacemaker-controld[27057]: notice: Transition 602 (Complete=28, Pending=0, Fired=0, Skipped=0, Incomplete=1, Source=/var/lib/pacemaker/pengine/pe-input-39.bz2): Complete
Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: notice: On loss of quorum: Ignore
Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node1: not running
Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: warning: Forcing p_lvm_vg0 away from node1 after 1000000 failures (max=1000000)
Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: notice:  * Recover    p_lvm_vg0    ( node2 )
Oct 28 14:42:56 node2 pacemaker-schedulerd[27056]: notice: Calculated transition 603, saving inputs in /var/lib/pacemaker/pengine/pe-input-40.bz2
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: notice: On loss of quorum: Ignore
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node2: not running
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Processing failed start of p_lvm_vg0 on node1: not running
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Forcing p_lvm_vg0 away from node2 after 1000000 failures (max=1000000)
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: warning: Forcing p_lvm_vg0 away from node1 after 1000000 failures (max=1000000)
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: notice:  * Stop       p_lvm_vg0    ( node2 )   due to node availability
Oct 28 14:42:57 node2 pacemaker-schedulerd[27056]: notice: Calculated transition 604, saving inputs in /var/lib/pacemaker/pengine/pe-input-41.bz2
Oct 28 14:42:57 node2 pacemaker-controld[27057]: notice: Initiating stop operation p_lvm_vg0_stop_0 locally on node2
Oct 28 14:42:57 node2 LVM(p_lvm_vg0)[8881]: INFO: Deactivating volume group vg0
Oct 28 14:42:57 node2 LVM(p_lvm_vg0)[8881]: INFO: 0 logical volume(s) in volume group "vg0" now active
Oct 28 14:42:57 node2 LVM(p_lvm_vg0)[8881]: INFO: LVM Volume vg0 is not available (stopped)
---

Any help gratefully accepted!

jf
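P.S. In case the layering above isn't clear, the nested part of the stack was originally put together along these lines (reconstructed from memory; the size is made up):

---
lvcreate -L 500G -n xen_lv0 vmspace   # LV in vmspace that backs the DRBD device
drbdadm create-md r0                  # r0's metadata on top of xen_lv0
drbdadm up r0                         # brings up /dev/drbd0
pvcreate /dev/drbd0                   # drbd0 becomes the sole PV ...
vgcreate vg0 /dev/drbd0               # ... of the nested VG vg0
---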