[ClusterLabs] Antw: SBD & Failed Peer

Ulrich Windl Sun, 06 Sep 2015 23:17:42 -0700

>>> Jorge Fábregas <[email protected]> wrote on 06.09.2015 at 22:23
in
message <[email protected]>:
> Hi,
> 
> I was reading one of the latest posts [1] from Andrew Beekhof on SBD and
> it got me thinking...
> 
> Assume an active/active cluster using OCFS2 and SBD with shared storage.
> Then one node explodes (the hardware watchdog is gone as well
> obviously).  At this point my guess is that the remaining node will
> notice that its partner hasn't updated its mailbox slot on the SBD
> shared-storage.
> 
> My question:  Is this enough proof (confirmation) that the other node
> isn't capable of causing corruption? And so...will DLM/OCFS2 resume
> operation?


IMHO it will work differently: If the node goes down, the network layer
(corosync) will notice that (sooner or later, depending on some settings). The
remaining node will then try a fencing operation. After some time (also
configurable) the remaining nodes will assume the other node was fenced
successfully. It does not mean that anything actually happened, but that's the
way it's designed. You'll have to make sure things work as configured.
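
To make the sequence concrete, here is a rough sketch of the timeline in Python. It is only a simplified model, not how corosync, Pacemaker or sbd are implemented. The names `token_s` and `msgwait_s` are hypothetical stand-ins for "the time until the membership layer declares the peer lost" and "the time the fencing operation waits before it is reported as successful"; the numbers in the usage example are made up.

```python
from dataclasses import dataclass


@dataclass
class Timeouts:
    # Hypothetical parameters for this sketch, both in seconds.
    token_s: float    # time until the membership layer declares the peer lost
    msgwait_s: float  # time the fencing operation waits before reporting success


def fencing_timeline(failure_at: float, t: Timeouts) -> dict:
    """Return the points in time (seconds) of the simplified sequence.

    1. The peer fails at `failure_at`.
    2. The surviving node notices the loss after `token_s`.
    3. It requests fencing right away (assumption: no extra delay).
    4. After `msgwait_s` the peer is *assumed* to be fenced, whether or
       not anything actually happened on the peer.
    """
    detected = failure_at + t.token_s
    fencing_requested = detected
    assumed_fenced = fencing_requested + t.msgwait_s
    return {
        "peer_loss_detected": detected,
        "fencing_requested": fencing_requested,
        "peer_assumed_fenced": assumed_fenced,
    }


if __name__ == "__main__":
    timeline = fencing_timeline(0.0, Timeouts(token_s=5.0, msgwait_s=10.0))
    for event, when in timeline.items():
        print(f"{event}: t+{when:.0f}s")
```

In this model the survivor never gets positive proof that the peer is dead; it only waits long enough that a correctly configured peer must have been reset by then. That is why the timeouts have to match what the hardware and storage can actually guarantee.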

> 
> Thanks,
> Jorge
> 
> [1]: http://blog.clusterlabs.org/blog/2015/sbd-fun-and-profit/ 
> 
> _______________________________________________
> Users mailing list: [email protected] 
> http://clusterlabs.org/mailman/listinfo/users 
> 
> Project Home: http://www.clusterlabs.org 
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
> Bugs: http://bugs.clusterlabs.org 




