I’ve been running nova-network in VLAN mode as an active/passive cluster 
resource (corosync + rgmanager) on my OpenStack Havana and Folsom controller 
pairs for a good long while.  This week I found an oddity that I hadn’t noticed 
before, and I’d like to ask the community about it.

When nova-network starts up, it of course launches a dnsmasq process for each 
network, which listens on the .1 address of the assigned network and acts as 
the gateway for that network.   When the nova-network service is moved to the 
passive node, nova-network starts up dnsmasq processes on that node as well, 
again listening on the .1 addresses.   However, since both nodes now have the 
.1 addresses configured, both answer ARP requests for them and take turns 
stealing the traffic from each other.  VMs will route through the “active” node 
for a minute or so and then suddenly start routing through the “passive” node.  
Then the cycle repeats.   Among other things, this results in only one 
controller at a time being able to reach the VMs and adds latency to VM traffic 
when the shift happens.

To stop this, I had to manually remove the VLAN interfaces from the bridges, 
bring down the bridges, then delete the bridges from the now-passive node.  
Things then returned to normal, with all traffic flowing through the “active” 
controller and both controllers being able to reach the VMs.
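
For reference, the manual cleanup above could be scripted roughly as follows. 
This is only a sketch: it assumes the bridges and VLAN interfaces are named 
br<vlan> and vlan<vlan> (e.g. br100 / vlan100), that brctl and ip are 
installed, and the VLAN IDs shown are made up.  The helper names are 
hypothetical.  It defaults to a dry run that just prints the commands.

```python
import subprocess


def teardown_commands(vlan_ids):
    """Build the commands that remove each VLAN's bridge from this node.

    Assumption: bridges are named br<vlan> and VLAN interfaces vlan<vlan>.
    Adjust the naming if your deployment differs.
    """
    cmds = []
    for vlan in vlan_ids:
        bridge = "br%d" % vlan
        vlan_if = "vlan%d" % vlan
        # 1. Remove the VLAN interface from the bridge.
        cmds.append(["brctl", "delif", bridge, vlan_if])
        # 2. Bring the bridge down.
        cmds.append(["ip", "link", "set", bridge, "down"])
        # 3. Delete the bridge (and with it the .1 gateway address).
        cmds.append(["brctl", "delbr", bridge])
    return cmds


def run(cmds, dry_run=True):
    """Print the commands, or execute them as root when dry_run is False."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.check_call(cmd)


# Dry run for two hypothetical VLANs; nothing is changed on the system.
run(teardown_commands([100, 101]))
```

Something like this could be called from the cluster resource's stop action 
on the node that is giving up nova-network, with dry_run=False.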

I have not seen anything in the HA guides about how people are preventing this 
situation from occurring - nothing about killing off dnsmasq or tearing down 
these network interfaces to prevent the ARP wars.  Anybody else out there 
experienced this?   How are people handling the situation?

I am considering bringing up arptables to block ARP for the gateway addresses 
when cluster failover happens, or alternatively automating the tear-down of 
these gateway addresses.  Am I missing something here?
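
To make the arptables idea concrete, here is a rough sketch of the rules I 
have in mind.  It assumes the gateway is the first host address (.1) of each 
network, that the arptables chains are named INPUT and OUTPUT (some older 
builds use IN and OUT), and the CIDR shown is made up.  The helper names are 
hypothetical, and I have not verified that these rules behave as intended for 
traffic crossing the bridges.  It only prints the commands.

```python
import ipaddress


def gateway_ip(cidr):
    """Return the first host address of the network, e.g. 10.0.100.1."""
    return str(ipaddress.ip_network(cidr)[1])


def arp_block_commands(cidrs, action="-A"):
    """Build arptables commands for the gateway address of each network.

    action="-A" appends the rules (node becomes passive);
    action="-D" deletes them again (node becomes active).
    """
    cmds = []
    for cidr in cidrs:
        gw = gateway_ip(cidr)
        # Drop incoming ARP packets whose target is the gateway address.
        cmds.append(["arptables", action, "INPUT", "-d", gw, "-j", "DROP"])
        # Drop outgoing ARP packets whose source is the gateway address.
        cmds.append(["arptables", action, "OUTPUT", "-s", gw, "-j", "DROP"])
    return cmds


# Print the rules for one hypothetical network.
for cmd in arp_block_commands(["10.0.100.0/24"]):
    print(" ".join(cmd))
```

The same function with action="-D" would lift the block when the node takes 
over nova-network again.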

Thanks,

Mike Smith
Principal Engineer, Website Systems
Overstock.com





_______________________________________________
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
