On 18/04/17 15:02, Digimer wrote:
> On 18/04/17 10:00 AM, Digimer wrote:
>> On 18/04/17 03:47 AM, Ulrich Windl wrote:
>>>>>> Digimer <li...@alteeve.ca> wrote on 16.04.2017 at 20:17 in message <12cde13f-8bad-a2f1-6834-960ff3afc...@alteeve.ca>:
>>>> On 16/04/17 01:53 PM, Eric Robinson wrote:
>>>>> I was reading in "Clusters from Scratch" where Beekhof states, "Some would argue that two-node clusters are always pointless, but that is an argument for another time." Is there a page or thread where this argument has been fleshed out? Most of my dozen clusters are 2 nodes. I hate to think they're pointless.
>>>>>
>>>>> --
>>>>> Eric Robinson
>>>>
>>>> There is a belief that you can't build a reliable cluster without quorum. I am of the mind that you *can* build a very reliable 2-node cluster. In fact, every cluster our company has deployed, going back over five years, has been 2-node, and they have had exceptional uptimes.
>>>>
>>>> The confusion comes from the belief that quorum is required and stonith is optional. The reality is the opposite. I'll come back to this in a minute.
>>>>
>>>> In a two-node cluster, you have two concerns:
>>>>
>>>> 1. If communication between the nodes fails, but both nodes are alive, how do you avoid a split-brain?
>>>
>>> By killing one of the two parties.
>>>
>>>> 2. If you have a two-node cluster and enable cluster startup on boot, how do you avoid a fence loop?
>>>
>>> I think the problem in the question is using "you" instead of "it" ;-) Pacemaker assumes all problems that cause STONITH will be solved by STONITH. That's not always true (e.g. configuration errors). Maybe a node's failcount should not be reset if the node was fenced. So you'll avoid a fencing loop, but might end up in a state where no resources are running. IMHO I'd prefer that over a fencing loop.
>>>
>>>> Many answer #1 by saying "you need a quorum node to break the tie". In some cases this works, but only when all nodes are behaving in a predictable manner.
>>>
>>> All software relies on the fact that it behaves in a predictable manner, BTW. The problem is not "the predictable manner for all nodes", but the predictable manner for the cluster.
>>>
>>>> Many answer #2 by saying "well, with three nodes, if a node boots and can't talk to either other node, it is inquorate and won't do anything".
>>>
>>> "Won't do anything" is also wrong: it must go offline without killing others, preferably.
>>>
>>>> This is a valid mechanism, but it is not the only one.
>>>>
>>>> So let me answer these from a 2-node perspective:
>>>>
>>>> 1. You use stonith, and the faster node lives while the slower node dies.
>>>
>>> Isn't there a possibility that both nodes shoot each other? Is there a guarantee that there will always be one faster node?
>>>
>>>> From the moment of comms failure, the cluster blocks (needed with quorum, too) and doesn't restore operation until the (slower) peer is in a known state: off. You can bias this by setting a fence delay against your preferred node. So say node 1 is the node that normally hosts your services; then you add 'delay="15"' to node 1's fence method. This tells node 2 to wait 15 seconds before fencing node 1. If both nodes are alive, node 2 will be fenced before the timer expires.
>>>
>>> Can only the DC issue fencing?
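(For illustration, a minimal sketch of what such a fence delay might look like as a pair of fence devices created with pcs. The agent, host names, addresses and credentials below are placeholders, not from the thread; adapt them to your fence hardware.)

    # node1 is the preferred node: its fence device carries the delay,
    # so node2 waits 15 seconds before it is allowed to shoot node1.
    pcs stonith create fence_node1 fence_ipmilan \
        pcmk_host_list="node1" ipaddr="10.0.0.1" \
        login="admin" passwd="secret" delay="15" \
        op monitor interval=60s

    # node2's device has no delay, so node1 fences node2 immediately
    # if communication is lost while both nodes are still alive.
    pcs stonith create fence_node2 fence_ipmilan \
        pcmk_host_list="node2" ipaddr="10.0.0.2" \
        login="admin" passwd="secret" \
        op monitor interval=60s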
>>>> 2. In Corosync v2+, there is a 'wait_for_all' option that tells a node not to do anything until it is able to talk to the peer node. So in the case of a fence after a comms break, the node that reboots will come up, fail to reach the survivor node, and do nothing more. Perfect.
>>>
>>> Does "do nothing more" mean continuously polling for other nodes?
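(For reference, the corresponding corosync.conf quorum section for a two-node setup would look roughly like the sketch below. 'two_node' and 'wait_for_all' are real votequorum options; enabling two_node implies wait_for_all unless it is explicitly turned off. The rest of the file is omitted here.)

    # /etc/corosync/corosync.conf excerpt (corosync 2.x), quorum section only
    quorum {
        provider: corosync_votequorum
        two_node: 1        # two-node mode: a single surviving node stays quorate
        wait_for_all: 1    # after boot, wait until both nodes have been seen at least once
    }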
>>>> Now let me come back to quorum vs. stonith.
>>>>
>>>> Said simply: quorum is a tool for when everything is working. Fencing is a tool for when things go wrong.
>>>
>>> I'd say: quorum is the tool to decide who will live and who is going to die, and STONITH is the tool to make nodes die. If everything is working, you need neither quorum nor STONITH.
>>>
>>>> Let's assume that your cluster is working fine; then, for whatever reason, node 1 hangs hard. At the time of the freeze, it was hosting a virtual IP and an NFS service. Node 2 declares node 1 lost after a period of time and decides it needs to take over.
>>>
>>> In case node 1 is the DC, isn't an election for a new DC coming first, with the new DC doing the STONITH?
>>>
>>>> In the 3-node scenario, without stonith, node 2 reforms a cluster with node 3 (the quorum node), decides that it is quorate, starts its NFS server and takes over the virtual IP. So far, so good... until node 1 comes out of its hang.
>>>
>>> Again, if node 1 was the DC, it's not that simple.
>>>
>>>> At that moment, node 1 has no idea time has passed.
>>>
>>> You assume no fencing was done...
>>>
>>>> It has no reason to think "am I still quorate? Are my locks still valid?" It just finishes whatever it was in the middle of doing and bam, split-brain. At the least, you have two nodes claiming the same IP at the same time. At worst, you had uncoordinated writes to shared storage and you've corrupted your data.
>>>
>>> But that's no cluster; that's a mess ;-)
>>>
>>>> In the 2-node scenario, with stonith, node 2 is always quorate, so after declaring node 1 lost, it moves to fence node 1. Once node 1 is fenced, *then* it starts NFS, takes over the virtual IP and restores services.
>>>
>>> So you compare "2 nodes + fencing" to "3 nodes without fencing"?
>>>
>>>> In this case, no split-brain is possible because node 1 has rebooted and comes up with a fresh state (or it's on fire and never coming back anyway).
>>>>
>>>> This is why quorum is optional and stonith/fencing is not.
>>>
>>> You did not convince me how only one node has the ability to fence the other without a quorum: wouldn't both nodes shoot at each other? (I have quoted this so many times, but once again: in HP-UX Service Guard, a lock disk was used as a tie-breaker. Only one node succeeded in getting the lock, and the other committed suicide (via kernel watchdog timeout).)
>>>
>>>> Now, with this said, I won't say that 3+ node clusters are bad. They're fine if they suit your use-case, but even with 3+ nodes you still must use stonith.
>>>>
>>>> My *personal* argument in favour of 2-node clusters over 3+ nodes is this:
>>>
>>> Again: you compare "2 nodes with fencing" to "3 nodes without fencing". My personal vote would be "3 nodes with fencing" if there is enough work for two nodes.
>>>
>>>> A cluster is not beautiful when there is nothing left to add. It is beautiful when there is nothing left to take away.
>>>>
>>>> In availability clustering, nothing should ever be more important than availability, and availability is a product of simplicity. So in my view, a 3-node cluster adds complexity that is avoidable, and so is sub-optimal.
>>>
>>> IMHO: valid cluster software works starting at 1 node, and then, per induction, also for n+1 nodes. Complexity should grow only linearly with the number of nodes. Of course, you shouldn't add nodes just for the sake of the number of nodes, but for the actual need.
>>>
>>> Regards,
>>> Ulrich
>>
>> I was addressing the misconception that fencing was optional and quorum was not. I wrote a longer reply as an article to follow up on this down the thread.
>
> As an addendum, I will say, with clarity, that *all* clusters need stonith, period. 3+ nodes without stonith is still a disaster waiting to happen.
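(Following up on the point that stonith is mandatory: a quick sketch, assuming the pcs shell, of how one might confirm fencing is actually enabled and configured before trusting a cluster. The property and commands are standard Pacemaker/pcs of that era; the workflow itself is only illustrative.)

    pcs property set stonith-enabled=true   # require fencing before recovery (the default)
    pcs stonith show                        # list the configured fence devices
    crm_verify -L -V                        # sanity-check the live configuration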
#1 rule of clustering - if we sold clusters in boxes, it would be on the packing tape.

chrissie

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org