On 05/28/2018 09:43 AM, Klaus Wenninger wrote:
> On 05/26/2018 07:23 AM, Andrei Borzenkov wrote:
>> On 25.05.2018 14:44, Klaus Wenninger wrote:
>>> On 05/25/2018 12:44 PM, Andrei Borzenkov wrote:
>>>> On Fri, May 25, 2018 at 10:08 AM, Klaus Wenninger <kwenn...@redhat.com> wrote:
>>>>> On 05/25/2018 07:31 AM, 井上 和徳 wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am checking the watchdog function of SBD (without a shared block device).
>>>>>> In a two-node cluster, if the cluster is stopped on one node, the watchdog is triggered on the remaining node.
>>>>>> Is this the designed behavior?
>>>>> SBD without a shared block device doesn't really make sense in a two-node cluster.
>>>>> The basic idea is - e.g. in the case of a networking problem - that a cluster splits up into a quorate and a non-quorate partition.
>>>>> The quorate partition stays up, while SBD guarantees reliable watchdog-based self-fencing of the non-quorate partition within a defined timeout.
>>>> Does it require no-quorum-policy=suicide, or does it decide completely independently? I.e. would it also fire with no-quorum-policy=ignore?
>>> It will fire in any case eventually, but no-quorum-policy decides how long that takes. In case of suicide the inquisitor will immediately stop tickling the watchdog. In all other cases the pacemaker-servant will stop pinging the inquisitor, which makes the servant time out after a default of 4 seconds, and then the inquisitor will stop tickling the watchdog.
>>> But that is only relevant if Corosync doesn't have 2-node enabled. See the comment below for that case.
>>>
>>>>> This idea of course doesn't work with just 2 nodes.
>>>>> Taking quorum info from the 2-node feature of Corosync (which automatically switches on wait-for-all) doesn't help in this case but would instead lead to split-brain.
>>>> So what you are saying is that SBD ignores quorum information from corosync and makes its own decisions based on a pure count of nodes. Do I understand it correctly?
>>> Yes, but that is only true for this case where Corosync has 2-node enabled.
>>> In all other cases (be it clusters with more than 2 nodes, or clusters with just 2 nodes but without 2-node enabled in Corosync) the pacemaker-servant takes its quorum info from Pacemaker, which nowadays probably comes directly from Corosync.
>>> But as said, if 2-node is configured with Corosync everything is different: the node-counting is then actually done by the cluster-servant, and it is this one (instead of the pacemaker-servant) that stops pinging the inquisitor if it doesn't count more than 1 node.
>>>
>> Is it conditional on having no shared device, or does it just check the two_node value? If it always behaves this way, even with a real shared device present, it means sbd is fundamentally incompatible with two_node, and that had better be mentioned in the documentation.
> If you are referring to counting the nodes instead of taking quorum info from Pacemaker in case of 2-node configured with Corosync, that is universal.
>
> And actually the reason why it is there is to be able to use sbd with a single disk on 2 nodes that have 2-node enabled.
>
> Imagine quorum info from corosync/pacemaker being used in that case: imagine a cluster (node-a & node-b). node-a loses connection to the network and to the shared storage. node-a will still receive positive quorum from corosync, as it has already seen the other node (since it is up). This will make it ignore the loss of the disk (survive on pacemaker). node-b is quorate as well, sees the disk, uses the disk to fence node-a, and will after a timeout assume node-a to be down -> split-brain.

Seeing the disk will prevent the reboot, if that is what was missing for you.
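For reference, the "2-node enabled" setting discussed above is the two_node flag in the quorum section of corosync.conf. A minimal sketch (purely illustrative, not taken from anyone's actual configuration; note that two_node implicitly switches on wait_for_all):

    quorum {
        provider: corosync_votequorum
        # lets either node stay quorate while the other is down;
        # implicitly enables wait_for_all
        two_node: 1
    }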
>>> That all said, I've just realized that setting 2-node in Corosync
>>> shouldn't really be dangerous anymore, although it doesn't make
>>> the cluster especially useful either in the case of SBD without disk(s).
>>>
>>> Regards,
>>> Klaus
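And for anyone following along, a rough sketch of the diskless (watchdog-only) sbd setup being discussed - the file location, device path and timeout are only examples, adjust them for your own environment:

    # /etc/sysconfig/sbd (no SBD_DEVICE set -> watchdog-only mode)
    SBD_PACEMAKER=yes
    SBD_WATCHDOG_DEV=/dev/watchdog
    SBD_WATCHDOG_TIMEOUT=5

The no-quorum-policy property mentioned above can then be set e.g. via pcs:

    pcs property set no-quorum-policy=suicide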
_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org