Thanks a lot for your suggestions (Robert and Lars), it took a while before I was able to try them on virtual machines. I hope you don't mind that I reply to both of you in one mail -- I messed up my mail delivery options (now corrected).

I've added my latest drbd config below, for reference.

I can't find any sequence of commands that convinces drbd (or pacemaker) that I *want* to use outdated data. I expected this to work:

drbdadm del-peer tapas:fims1
drbdadm primary --force tapas

This seems to work briefly (the resource reaches UpToDate), until the next time the pacemaker drbd monitor runs and 'demotes' the resource back to its original state.

Failed Resource Actions:
* drbd_monitor_20000 on vmnbiaas2 'master' (8): call=84, status=complete, exitreason='',
    last-rc-change='Wed Oct  9 14:49:06 2019', queued=0ms, exec=0ms

The corosync logs are difficult to follow, so I'm not sure how to get pacemaker to accept the trickery done behind its back.
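One thing I haven't tried yet (a sketch; I'm assuming the pcs CLI is available, and I'm guessing a master/slave resource id from the failed action above -- substitute the real one) is to take the resource out of pacemaker's control before doing the manual drbdadm surgery, so the monitor can't demote it again:

```shell
# Hypothetical resource id; use the actual master/slave (clone) resource
# that wraps the drbd primitive.
pcs resource unmanage drbd-master

# ... manual drbdadm del-peer / primary --force steps here ...

# Hand control back to pacemaker once the state looks right.
pcs resource manage drbd-master
```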

Lars wrote:

Alternatively, you could *add* a suitable fencing constraint to your sole 
survivor node, which should make the fencing succeed.

You could tell the crm-fence-peer.9.sh fencing handler that an 
--unreachable-peer-is-outdated.
(Manually. From a root shell. That switch is not effective from within the drbd 
configuration; for reasons).

I tried this, after finding what the command should look like in /var/log/messages:

DRBD_BACKING_DEV_0=/dev/mapper/centos-drbd DRBD_CONF=/etc/drbd.conf DRBD_LL_DISK=/dev/mapper/centos-drbd DRBD_MINOR=0 DRBD_MINOR_0=0 DRBD_MY_ADDRESS=172.17.5.62 DRBD_MY_AF=ipv4 DRBD_MY_NODE_ID=1 DRBD_NODE_ID_0=vmnbiaas1 DRBD_NODE_ID_1=vmnbiaas2 DRBD_PEER_ADDRESS=172.17.5.61 DRBD_PEER_AF=ipv4 DRBD_PEER_NODE_ID=0 DRBD_RESOURCE=tapas DRBD_VOLUME=0 UP_TO_DATE_NODES=0x00000002 /usr/lib/drbd/crm-fence-peer.9.sh --unreachable-peer-is-outdated
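(As an aside, the UP_TO_DATE_NODES value in that invocation appears to be a bitmask of DRBD node ids; assuming bash, checking whether one's own node-id bit is set looks like this:)

```shell
# Values copied from the handler invocation above.
UP_TO_DATE_NODES=0x00000002
DRBD_MY_NODE_ID=1

# Shift the mask right by our node id and test the low bit.
if (( (UP_TO_DATE_NODES >> DRBD_MY_NODE_ID) & 1 )); then
  echo "node $DRBD_MY_NODE_ID is flagged UpToDate in the mask"
else
  echo "node $DRBD_MY_NODE_ID is NOT flagged UpToDate in the mask"
fi
```

With the values above, node 1 (this host, vmnbiaas2) does have its bit set.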

This failed as follows:

Oct  9 14:29:42 vmnbiaas2 crm-fence-peer.9.sh[6153]: WARNING Found <cib crm_feature_set="3.0.14" validate-with="pacemaker-2.10" epoch="48" num_updates="23" admin_epoch="0" cib-last-written="Wed Oct  9 14:07:22 2019" update-origin="vmnbiaas1" update-client="cibadmin" update-user="root" have-quorum="0" dc-uuid="1"

Oct  9 14:29:42 vmnbiaas2 crm-fence-peer.9.sh[6153]: WARNING I don't have quorum; did not place the constraint!

OK, while experimenting, I quick-hacked the script to set

fail_if_no_quorum=false

After which the error changes to

Oct  9 14:38:13 vmnbiaas2 crm-fence-peer.9.sh[7579]: WARNING some peer is UNCLEAN, my disk is not UpToDate, did not place the constraint!
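Since the complaint is now about the UNCLEAN peer, the next thing I may try (assuming pcs; and only safe if the peer really is down for good, since this tells pacemaker the node was successfully fenced) is to confirm the fencing by hand:

```shell
# DANGEROUS if vmnbiaas1 might still be running:
# this declares the node safely fenced/powered off.
pcs stonith confirm vmnbiaas1
```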

Cheers!

     Rob


resource tapas {
  protocol C;

  startup {
    wfc-timeout          0;    ## Infinite!
    outdated-wfc-timeout 120;
    degr-wfc-timeout     120;  ## 2 minutes.
  }

  disk {
    on-io-error detach;
  }

  handlers {
    split-brain "/opt/sol/tapas/bin/split-brain-helper.sh";

    fence-peer "/usr/lib/drbd/crm-fence-peer.9.sh";
    unfence-peer "/usr/lib/drbd/crm-unfence-peer.9.sh";
  }

  net {
    fencing resource-only;

#    after-sb-0pri       discard-least-changes;
  }

  device    /dev/drbd0;
  disk      /dev/mapper/centos-drbd;
  meta-disk internal;

  on vmnbiaas1 {
    address    172.17.5.61:7789;
  }

  on vmnbiaas2 {
    address    172.17.5.62:7789;
  }
}

_______________________________________________
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
[email protected]
https://lists.linbit.com/mailman/listinfo/drbd-user
