On 03/24/2014 11:31 AM, Chris Friesen wrote:
> It looks like we're raising
> RecoverableConnectionError: connection already closed
> down in /usr/lib64/python2.7/site-packages/amqp/abstract_channel.py, but
> nothing handles it.
> It looks like the most likely place that should be handling it is
> nova.openstack.common.rpc.impl_kombu.Connection.ensure().
> In the current oslo.messaging code the ensure() routine explicitly
> handles connection errors (which RecoverableConnectionError is) and
> socket timeouts--the ensure() routine in Havana doesn't do this.
Correction: I misread the code. ensure() in Havana does in fact monitor
socket timeouts, but it doesn't handle connection errors.
It looks like support for handling connection errors was added to
oslo.messaging just recently in git commit 0400cbf. The commit message
talks about clustered rabbit nodes and mirrored queues, which don't
apply to our scenario, but I suspect it would probably fix the problem
that we're seeing as well.
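
To illustrate the kind of handling I mean, here is a rough sketch of a retry loop that treats connection errors the same way as socket timeouts. This is not the actual oslo.messaging or Havana code. The RecoverableConnectionError class below is a local stand-in for the amqp exception, and the ensure() signature, the reconnect callback and max_retries are all made up for illustration.

```python
import socket


class RecoverableConnectionError(Exception):
    """Local stand-in for the amqp exception of the same name (assumption)."""


def ensure(method, reconnect, max_retries=3):
    """Call method(); on a connection error or socket timeout,
    reconnect and try again, giving up after max_retries retries.

    Hypothetical helper, not the real impl_kombu Connection.ensure().
    """
    attempt = 0
    while True:
        try:
            return method()
        except (RecoverableConnectionError, socket.timeout):
            attempt += 1
            if attempt > max_retries:
                # Out of retries: let the caller see the last error.
                raise
            # Re-establish the connection before retrying the call.
            reconnect()
```

The point is only that RecoverableConnectionError sits in the same except clause as the timeout, so a "connection already closed" error triggers a reconnect instead of propagating unhandled.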
Chris