On 24/06/2015 18:29, Ken Gaillot wrote:
> On 06/24/2015 10:58 AM, Mathieu Valois wrote:
>> Hi everybody,
>> I'm working with Pacemaker and Stonith for High-Availability with a
>> 2-node cluster (called here A and B). Each node has one IPMI as its
>> fence device.
>>
>> The situation is:
>>
>> * A is currently running resources
>> * B is in passive mode
>>
>> Then I unplug the power supply of node A, so every eth interface AND
>> the IPMI on A are unavailable. Here comes the trick: B tries
>> unsuccessfully to bring A down, because A's IPMI is unreachable. After
>> N attempts, B gives up and puts itself into a "Block" state (called
>> IDLE in the log file).
> The behavior you describe is exactly what's intended. Since B can't
> *confirm* that A is down, it can't run resources without risking a
> split-brain situation.
>
>> Here is my question: how can I force B to bring back resources even if
>> Stonith of A fails?
> IPMI is not sufficient to be used as the only fence device. The
> preferred solution is to create a fencing topology with the IPMI as the
> first level, and a different fencing device (such as an intelligent
> power strip) as the second level.
>
>> I understand the consequences (concurrent writes, etc.), but I would
>> rather accept these than have the service unavailable altogether.
>>
>> Thanks for the help :)
> And here you get into perhaps the biggest recurring controversy in high
> availability. :) Depending on your resources, a split-brain situation
> might corrupt or lose some or all of your data. Silent corruption can be
> worse: you might have bad data and not even know it.

I can't afford another fencing device, so I'm forced to do it this way.
I've heard about using a quorum disk to manage the split-brain issue.
Could it be used in such a case, with only one IPMI device per node?
What does it involve?

> The consensus of HA professionals is that your data is not "available"
> if it is corrupted, so proper fencing is a necessity.
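
To make the fencing-topology idea quoted above concrete, here is a toy
Python model of it. It is only a sketch, not Pacemaker code: the names
fence_node, ipmi_unreachable and power_strip_ok are hypothetical, and each
fence device is reduced to a function that returns True only when it can
confirm the target is off. Levels are tried in ascending order, and a level
counts only if every device in it succeeds.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a fence agent: returns True only when it can
# confirm that the target node has been powered off.
FenceDevice = Callable[[str], bool]


def fence_node(target: str, levels: Dict[int, List[FenceDevice]]) -> bool:
    """Return True if some topology level confirms that `target` is fenced.

    Levels are tried in ascending order. A level succeeds only if it is
    non-empty and every device in it reports success.
    """
    for index in sorted(levels):
        devices = levels[index]
        if devices and all(device(target) for device in devices):
            return True
    # No level could confirm the fence: resources must stay blocked.
    return False


def ipmi_unreachable(target: str) -> bool:
    """Models the IPMI of a node whose power supply has been unplugged."""
    return False


def power_strip_ok(target: str) -> bool:
    """Models a second-level device (e.g. a power strip) that still works."""
    return True


if __name__ == "__main__":
    # IPMI only: the fence of A cannot be confirmed, so B stays blocked.
    print(fence_node("A", {1: [ipmi_unreachable]}))
    # IPMI as level 1, power strip as level 2: level 2 confirms the fence.
    print(fence_node("A", {1: [ipmi_unreachable], 2: [power_strip_ok]}))
```

With only the IPMI level, the model reports the fence as unconfirmed, which
corresponds to the blocked state I am seeing on B. Adding a second level
that can still reach the node's power is what lets recovery proceed.
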
>
> That said, some people do drive without their seat belts on :) so it is
> possible to do what you describe. Dummy/null fence agents can always
> return success. It's playing Russian roulette with your data though.
>
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

Once again, thanks a lot for your quick and detailed answer :)
----
Mathieu Valois