On Tue, Sep 20, 2016 at 12:25:55PM +0000, Auer, Jens wrote:
> Hi,
> > Don't disable fencing!
> > You need to configure and test stonith in pacemaker. Once that's
> > working, then you set DRBD's fencing to 'resource-and-stonith;' and
> > configure the 'crm-{un,}fence-peer.sh' un/fence handlers.
> > With this, if a node fails (and no, redundant network links are not
> > enough, nodes can die in many ways), then drbd will block when the peer
> > is lost, call the fence handler and wait for pacemaker to report back
> > that the fence action was completed. This way, you will never get a
> > split-brain and you will get reliable recovery.
> While we will configure fencing eventually (and I know that nodes can
> fail in many ways), it should not influence the test I am doing
> because the nodes are not in any unknown state. I have three
> independent network connections, one for DRBD, one for corosync
> heartbeats and one for data. In the test, I stop the cluster node
> manually with 'pcs cluster stop'. I don't think this should trigger
> STONITH or fencing, but DRBD permanently fails to get promoted.

The fencing constraint has been created at some point in time,
probably correctly.

But apparently it has never been removed, possibly for good reasons,
possibly by accident (not enough information to guess that).

The fencing constraint is supposed to be removed
once the DRBD resource is fully synced up again.
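
For reference, a minimal sketch of how the fence and unfence handlers
are usually wired up in a DRBD 8.4 resource definition. The resource
name "r0" and the /usr/lib/drbd path are assumptions, not taken from
your setup, and DRBD 9 places the fencing option in the net section
instead of the disk section.

```
resource r0 {
  disk {
    # block I/O when the peer is lost until the fence handler returns
    fencing resource-and-stonith;
  }
  handlers {
    # places the fencing constraint in the CIB
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    # removes the constraint again after a completed resync
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
```

If the unfence handler is not configured, or its invocation fails,
the constraint stays in place and keeps blocking promotion.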

Go over your logs, find the invocation of the "unfence",
and figure out why it did not work at that time.
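
A rough sketch of how that search could look. The helper name
find_fence_calls is hypothetical, the log path depends on your syslog
setup, and I am assuming the handlers log under their script names.

```shell
# Hypothetical helper: print fence/unfence handler invocations found in
# a log file. Assumes the handlers log under their script names
# (crm-fence-peer.sh, crm-unfence-peer.sh); adjust the pattern if not.
find_fence_calls() {
    grep -E 'crm-(un)?fence-peer' "$1"
}

# Usage on a cluster node (the log path is an assumption):
#   find_fence_calls /var/log/messages
#
# To see whether the constraint is still in the CIB:
#   cibadmin -Q -o constraints | grep drbd-fence-by-handler
```

If the second command still prints a constraint after the resync has
finished, the unfence either was never invoked or did not succeed.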

: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
please don't Cc me, but send to list -- I'm subscribed