Hi there! I have set up a two-node heartbeat cluster running apache and drbd.
Everything went fine until we tested a "split brain" scenario. When we detach both network cables from one host, we end up with two primaries. I read in the thread "methods of dealing with network failover" that setting up stonith and a quorum node might be a good workaround. Well... I don't think it is in our situation.

Let's assume the following scenario:

- The two nodes, with two interfaces each, monitor each other via unicast queries over both interfaces.
- We do not have any dedicated cross-over or serial connections, because the servers reside in buildings a few kilometers apart.
- Only the two Linux nodes in our network are part of our cluster (well, there are a few more, to be honest, but those are the two we may fiddle around with).
- We won't be able to set up a (dedicated) quorum server.
- We do not have a network-enabled power socket that we could switch off for the node we want to "shoot in the head".

Now someone stumbles over the network cables of, let's say, node-b, detaching it from the network. node-b and node-a no longer receive any unicast replies from their peer, but node-a can still ping its ping host, while node-b can't. node-b should now assume that it is very likely dead. node-a can't be sure, because it can't reach its peer but can still reach the rest of the network (or at least its ping node).

Actually, I'd like to see the following happen:

- If a node is secondary and assumes that it is very likely dead, it should not be allowed to take over any resources.
- If a node is primary and isn't sure about its peer, it should "freeze" its state at least until its peer is reachable over one interface.

Can that be done? Without a quorum server? If yes, how? If not, why not?

As a workaround I thought about a "virtual resource" that is in fact a "suicide" or "self-stonith" script. That's still better than running into an inconsistent state.

Any help would be appreciated.
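
To make the wished-for behaviour concrete, here is a rough Python sketch of the decision each node would take. It is only an illustration of the policy described above, not anything heartbeat provides: the names `Role`, `Action` and `decide` are made up, and the two branches marked "assumption" are not spelled out in my description and are simply the most conservative guess.

```python
from enum import Enum


class Role(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


class Action(Enum):
    CONTINUE = "continue"        # normal operation, peer is visible
    FREEZE = "freeze"            # keep current state, change nothing
    NO_TAKEOVER = "no_takeover"  # stay secondary, do not acquire resources
    SELF_FENCE = "self_fence"    # the "suicide" / self-stonith workaround


def decide(role: Role, peer_reachable: bool, ping_host_reachable: bool) -> Action:
    """Hypothetical policy: pick an action from what this node can still see."""
    if peer_reachable:
        # At least one interface still reaches the peer: nothing to do.
        return Action.CONTINUE

    if not ping_host_reachable:
        # Neither peer nor ping host answers: this node is very likely
        # the one that is cut off ("very likely dead").
        if role is Role.SECONDARY:
            return Action.NO_TAKEOVER
        # Assumption: an isolated primary gives up via the self-stonith script.
        return Action.SELF_FENCE

    # Peer is gone but the ping host still answers: the node can't be sure.
    if role is Role.PRIMARY:
        return Action.FREEZE
    # Assumption: an unsure secondary stays put rather than risk two primaries.
    return Action.NO_TAKEOVER


if __name__ == "__main__":
    # The cable scenario, assuming node-a is primary and node-b is secondary.
    print("node-a:", decide(Role.PRIMARY, False, True).value)
    print("node-b:", decide(Role.SECONDARY, False, False).value)
```

In the cable scenario, and assuming node-a is the primary, node-a would freeze and node-b would refuse to take over, so there would never be two primaries at once.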

Thanks,
Andreas

--
Andreas Stallmann, Senior Berater
CONET Solutions GmbH

Linux-HA mailing list
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
