Thanks for your reply.

First of all, I did not understand whether the VIP will also migrate if Tomcat or the load balancer fails. It will, right?

Also, if I understand this correctly, I can end up with the VIP on both nodes if corosync fails because of a network failure, and you suggest redundant communication paths to avoid this. But suppose the VIP runs on my linux-1 and Pacemaker on linux-2 is ready, via corosync, to take over on failure. If there is a network failure (despite the redundant communication paths, unless you recommend some specific topology to Pacemaker users that is 100% foolproof), how can linux-2 detect whether the other node has actually crashed or only corosync has failed? In that case, won't linux-2 also "wake up" and take the VIP?

Could you please help me understand this?
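The scenario in question can be sketched as a toy model. This is only an illustration in Python, not Pacemaker or corosync code: the names `Node`, `peer_seems_dead`, and `vip_holders` are made up for the example, only linux-2's point of view is modelled, and the fencing step is a stand-in for whatever fencing device a cluster would actually use. It shows that linux-2 observes the same silence whether linux-1 crashed or only the corosync link failed, and that without fencing the VIP then ends up on both nodes:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    alive: bool = True      # is the machine actually running?
    has_vip: bool = False   # does it currently hold the virtual IP?


def peer_seems_dead(peer: Node, link_up: bool) -> bool:
    """All a node can observe: heartbeats stop if the peer crashed OR the link failed."""
    return not (peer.alive and link_up)


def vip_holders(linux1_alive: bool, link_up: bool, fencing: bool) -> List[str]:
    """Return the names of the nodes that end up holding the VIP (toy model)."""
    linux1 = Node("linux-1", alive=linux1_alive, has_vip=linux1_alive)
    linux2 = Node("linux-2")

    if peer_seems_dead(linux1, link_up):
        if fencing:
            # Hypothetical fencing step: power off linux-1 before taking over,
            # so it cannot keep the VIP even if it was still running.
            linux1.alive = False
            linux1.has_vip = False
        linux2.has_vip = True

    return [n.name for n in (linux1, linux2) if n.alive and n.has_vip]


if __name__ == "__main__":
    print("normal operation:       ", vip_holders(True, True, fencing=False))
    print("linux-1 crashed:        ", vip_holders(False, True, fencing=False))
    print("link down, no fencing:  ", vip_holders(True, False, fencing=False))
    print("link down, with fencing:", vip_holders(True, False, fencing=True))
```

The "link down, no fencing" case is the split brain described in the quoted reply: both nodes hold the VIP. With the fencing step, linux-2 only takes over after linux-1 has been powered off, so it no longer matters which of the two failures really happened.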
Thank you!

________________________________
From: David Coulson <[email protected]>
To: Hermes Flying <[email protected]>; General Linux-HA mailing list <[email protected]>
Cc: Digimer <[email protected]>
Sent: Saturday, December 1, 2012 2:46 PM
Subject: Re: [Linux-HA] Some help on understanding how HA issues are addressed by pacemaker

On 12/1/12 5:46 AM, Hermes Flying wrote:
> Thank you for this!
>
> One last thing I need to clear out before digging into your configuration
> specs etc.
> Since the pacemaker is a fail-over system rather than a load-balancing system
> (like Red Hat) as you say, my understanding is that one of my nodes will have
> the VIP until:
> 1) Tomcat crashes and can not restart (dead for some reason) --> Pacemaker
> migrates VIP
>
> 2) The network communication with the outside network is cut off. -->
> Pacemaker migrates VIP
>
> If these (2) are valid (are they?) then that means that there is no
> primary/backup concept using pacemaker (since I will assign to one of my
> nodes to have the VIP and my installed Load Balancer will distribute the load
> among my 2 Tomcats) and as a result there can not be a split-brain.

In the event of a split brain with Pacemaker, if you don't have any fencing configured, you will end up with your VIP running on both systems. Chances are that in your configuration it won't be a big deal, since your router/firewall/whatever will learn the ARP of one system, so you'll end up routing traffic properly. But it will be unpredictable and difficult to troubleshoot.

> Yet you imply that split-brain can occur even with Pacemaker if I don't have
> fencing properly set.
> But how? Since it seems to me that Pacemaker does not have a notion of
> primary/backup. Or you mean something else with "fail-over" system?

For each resource, Pacemaker knows there is a node where it is running, and 'other' nodes where it is not running (but could, if the node running it failed). So from a resource perspective, there is an active node and one or more backups.

> Additionally you say that the "coordination" of Pacemaker instances is done
> via corosync which is over network messages right?
> So what happens in the event of communication/network failure but only in the
> communication paths used for corosync coordination and not the communication
> path with the clients? Hope this question makes sense as I am new in your
> facilities.

Split brain. That's why you need a redundant communications network, plus you need fencing.

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
