On 03/24/2014 11:31 AM, Chris Friesen wrote:

It looks like we're raising

RecoverableConnectionError: connection already closed

down in /usr/lib64/python2.7/site-packages/amqp/abstract_channel.py, but
nothing handles it.

It looks like the most likely place that should be handling it is
nova.openstack.common.rpc.impl_kombu.Connection.ensure().


In the current oslo.messaging code the ensure() routine explicitly
handles connection errors (which RecoverableConnectionError is) and
socket timeouts--the ensure() routine in Havana doesn't do this.

I misread the code: ensure() in Havana does in fact handle socket timeouts, but it doesn't handle connection errors.

It looks like support for handling connection errors was added to oslo.messaging just recently in git commit 0400cbf. The commit message talks about clustered rabbit nodes and mirrored queues, which don't apply to our scenario, but I suspect it would fix the problem that we're seeing as well.
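
To make the idea concrete, here is a minimal sketch of a retry wrapper that treats connection errors the same way as socket timeouts. It is not the actual Havana or oslo.messaging code. The exception class below is a local stand-in for the one the amqp library raises, and the ensure() signature, the reconnect callback, max_retries and delay are all hypothetical names chosen for illustration.

```python
import socket
import time


class RecoverableConnectionError(Exception):
    """Local stand-in for the amqp library's exception of the same name."""


def ensure(method, reconnect, max_retries=3, delay=0.0):
    """Call method(), reconnecting and retrying on recoverable failures.

    Hypothetical helper: both connection errors and socket timeouts
    trigger a reconnect, up to max_retries times, then the last error
    is re-raised to the caller.
    """
    attempt = 0
    while True:
        try:
            return method()
        except (RecoverableConnectionError, socket.timeout):
            attempt += 1
            if attempt > max_retries:
                raise
            time.sleep(delay)
            reconnect()


# Usage: a publish call that fails once with a closed connection.
calls = {"publish": 0, "reconnect": 0}


def flaky_publish():
    calls["publish"] += 1
    if calls["publish"] == 1:
        raise RecoverableConnectionError("connection already closed")
    return "sent"


def reconnect():
    calls["reconnect"] += 1


result = ensure(flaky_publish, reconnect)
```

With only socket.timeout in the except clause, the first failure above would propagate to the caller unhandled, which matches what we appear to be seeing.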

Chris

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
