Re: [Linux-HA] Pacemaker & AWS elastic IPs

Andrew Miklas Fri, 26 Nov 2010 00:36:58 -0800

Hi,

On 25-Nov-10, at 11:37 AM, Andrew Beekhof wrote:


> Given what you've described, you could probably remove the while loop
> during stop.
> It should be safe because Amazon is ensuring that it will only "run"
> in exactly one location.

I'll give that a try -- thanks.


I noticed something else interesting during my testing today -- I'm  
curious if it's related to my testing method or is a sign of a  
configuration error.  To test Pacemaker's response to a node failure,  
I usually use iptables to cut off all network traffic from one node to  
the rest of the cluster.  (I'm doing this instead of the typical  
"unplug the network line" method because I don't have physical access  
to the machines).

For example, I would run this on node test2 of a 3-node test
environment:

    iptables -A INPUT -s test1 -j DROP
    iptables -A INPUT -s test3 -j DROP
    iptables -A OUTPUT -d test1 -j DROP
    iptables -A OUTPUT -d test3 -j DROP
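
The same test can be sketched as a small shell helper. This is only an
illustration under stated assumptions: the function name isolate_cmds is
hypothetical, the peer names test1 and test3 are taken from the example
above, and the helper only prints the iptables commands rather than
running them, so nothing changes until the output is piped to a root
shell. Passing -D instead of -A produces commands that delete exactly
the rules that were added, which is a narrower way to undo the test
than flushing every rule.

```shell
# Hypothetical helper: print (do not run) the iptables commands that
# drop all traffic between this node and each listed peer.
# $1 is the iptables action: -A appends the rules, -D deletes them.
isolate_cmds() {
    action="$1"
    shift
    for peer in "$@"; do
        echo "iptables $action INPUT -s $peer -j DROP"
        echo "iptables $action OUTPUT -d $peer -j DROP"
    done
}

# Commands that would cut this node off from test1 and test3.
isolate_cmds -A test1 test3

# Commands that would remove those same rules again.
isolate_cmds -D test1 test3
```

To apply the rules for real, the output could be piped to a root shell,
for example: isolate_cmds -A test1 test3 | sh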

As expected, Pacemaker detects the node failure and starts up all the  
resources that were running on that node elsewhere.  However, when I  
remove the rules with "iptables -F", there is a brief period where
Pacemaker (or Heartbeat, I suppose) becomes very confused as to which  
nodes are up and which are down.  For example, crm_mon will suddenly  
indicate that test3 is offline, and then show that it is back online  
ten seconds later, even though test3 was always part of the partition  
that had quorum.

The problem here is that these spurious node failures cause Pacemaker  
to initiate unnecessary resource migrations.  Is it normal for the  
cluster to become confused for a while when the network connection to  
a node is suddenly restored?  Or is this happening because using  
iptables is not a fair test of how the system will respond during a  
network split?


Thanks,


Andrew
