On 03/24/2014 10:41 AM, Chris Friesen wrote:

We've been stress-testing our system doing controlled switchover of
the controller.  Normally this works okay, but we've run into a
situation that seems to show a flaw in the reconnection logic.

On the compute node, nova-compute has managed to get into a state
where it shows as "down" in "nova service-list", and the
nova-compute.log seems to show it never managing to reconnect with
the AMQP server.

I've included logs below showing what seems to be the beginning of
the problem and then showing it transitioning to the periodic logs
without a successful reconnection.  The periodic logs have now been
going for roughly seven hours...

Any ideas on what might be going on would be appreciated.


I suppose I should mention that we're running Havana, not the current code.

Chris

_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
