Hi,

We're currently running Grizzly (going to Havana soon) and we're running into an issue where if the active controller is ungracefully killed then nova-compute on the compute node doesn't properly connect to the new rabbitmq server on the newly-active controller node.

I saw a bugfix in Folsom (https://bugs.launchpad.net/nova/+bug/718869) to retry the connection to rabbitmq if it's lost, but it doesn't seem to be properly handling this case.

Interestingly, killing and restarting nova-compute on the compute node seems to work, which implies that the retry code is doing something less effective than the initial startup.

Has anyone doing HA controller setups run into something similar?

Chris

_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Reply via email to