On 03/24/2014 10:41 AM, Chris Friesen wrote:
We've been stress-testing our system doing controlled switchover of the controller. Normally this works okay, but we've run into a situation that seems to show a flaw in the reconnection logic.

On the compute node, nova-compute has managed to get into a state where it shows as "down" in "nova service-list", and the nova-compute.log seems to show it never managing to reconnect with the AMQP server.

I've included logs below showing what seems to be the beginning of the problem and then showing it transitioning to the periodic logs without a successful reconnection. The periodic logs have now been going for roughly seven hours...

Any ideas on what might be going on would be appreciated.
I suppose I should mention that we're running Havana, not the current code.

Chris

_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
