** Description changed:

  In the logs, the first traceback that appears is this:
  
  [-] Unexpected exception occurred 1 time(s)... retrying.
  Traceback (most recent call last):
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/excutils.py",
 line 62, in inner_func
-     return infunc(*args, **kwargs)
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py",
 line 741, in _consumer_thread
-      
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py",
 line 732, in consume
-     @excutils.forever_retry_uncaught_exceptions
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py",
 line 660, in iterconsume
-     try:
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py",
 line 590, in ensure
-     def close(self):
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py",
 line 531, in reconnect
-     # to return an error not covered by its transport
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py",
 line 513, in _connect
-     Will retry up to self.max_retries number of times.
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py",
 line 150, in reconnect
-     use the callback passed during __init__()
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/kombu/entity.py", 
line 508, in declare
-     self.queue_bind(nowait)
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/kombu/entity.py", 
line 541, in queue_bind
-     self.binding_arguments, nowait=nowait)
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/kombu/entity.py", 
line 551, in bind_to
-     nowait=nowait)
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/amqp/channel.py", 
line 1003, in queue_bind
-     (50, 21),  # Channel.queue_bind_ok
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/amqp/abstract_channel.py",
 line 68, in wait
-     return self.dispatch_method(method_sig, args, content)
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/amqp/abstract_channel.py",
 line 86, in dispatch_method
-     return amqp_method(self, args)
-   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/amqp/channel.py", 
line 241, in _close
-     reply_code, reply_text, (class_id, method_id), ChannelError,
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/excutils.py",
 line 62, in inner_func
+     return infunc(*args, **kwargs)
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py",
 line 741, in _consumer_thread
+ 
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py",
 line 732, in consume
+     @excutils.forever_retry_uncaught_exceptions
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py",
 line 660, in iterconsume
+     try:
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py",
 line 590, in ensure
+     def close(self):
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py",
 line 531, in reconnect
+     # to return an error not covered by its transport
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py",
 line 513, in _connect
+     Will retry up to self.max_retries number of times.
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py",
 line 150, in reconnect
+     use the callback passed during __init__()
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/kombu/entity.py", 
line 508, in declare
+     self.queue_bind(nowait)
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/kombu/entity.py", 
line 541, in queue_bind
+     self.binding_arguments, nowait=nowait)
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/kombu/entity.py", 
line 551, in bind_to
+     nowait=nowait)
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/amqp/channel.py", 
line 1003, in queue_bind
+     (50, 21),  # Channel.queue_bind_ok
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/amqp/abstract_channel.py",
 line 68, in wait
+     return self.dispatch_method(method_sig, args, content)
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/amqp/abstract_channel.py",
 line 86, in dispatch_method
+     return amqp_method(self, args)
+   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/amqp/channel.py", 
line 241, in _close
+     reply_code, reply_text, (class_id, method_id), ChannelError,
  NotFound: Queue.bind: (404) NOT_FOUND - no exchange 
'reply_8f19344531b448c89d412ee97ff11e79' in vhost '/'
  
+ Then an RPC Timeout is raised every second in all the agents
  
- Than an RPC Timeout is raised each second in all the agents 
- 
- ERROR neutron.agent.l3_agent [-] Failed synchronizing routers 
+ ERROR neutron.agent.l3_agent [-] Failed synchronizing routers
  TRACE neutron.agent.l3_agent Traceback (most recent call last):
  TRACE neutron.agent.l3_agent   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/agent/l3_agent.py",
 line 702, in _rpc_loop
- TRACE neutron.agent.l3_agent     self.context, router_ids)    
+ TRACE neutron.agent.l3_agent     self.context, router_ids)
  TRACE neutron.agent.l3_agent   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/agent/l3_agent.py",
 line 79, in get_routers
  TRACE neutron.agent.l3_agent     topic=self.topic)
  TRACE neutron.agent.l3_agent   File 
"/opt/cloudbau/neutron-virtualenv/lib/python2.7/site-packages/neutron/openstack/common/rpc/proxy.py",
 line 130, in call
  TRACE neutron.agent.l3_agent     exc.info, real_topic, msg.get('method'))
  TRACE neutron.agent.l3_agent Timeout: Timeout while waiting on RPC response - 
topic: "q-l3-plugin", RPC method: "sync_routers" info: "<unknown>"
  
  This actually makes the agents useless until they are all restarted.
  
  An analysis of what's going on is coming soon :)
+ 
+ 
+ ---------------------------
+ 
+ [Impact]
+ 
+ This patch addresses an issue that occurs when a RabbitMQ cluster node
+ goes down: OpenStack services reconnect to another RabbitMQ node and
+ re-create everything from scratch. Because the 'auto-delete' flag is set,
+ a race condition occurs between the re-creation and the deletion of
+ exchanges, queues, and bindings, which leaves nova-compute and the
+ neutron agents down.
+ 
+ [Test Case]
+ 
+ Note: these steps are for trusty-icehouse, including the latest
+ oslo.messaging library (1.3.0-0ubuntu1.2 at the time of this writing).
+ 
+ Deploy an OpenStack cloud with multiple rabbit nodes and then abruptly
+ kill one of the rabbit nodes (e.g. sudo service rabbitmq-server stop).
+ Observe that the nova services and neutron agents do detect that the
+ node went down and report that they are reconnected, but messages are
+ still reported as timed out, and nova service-list/neutron agent-list
+ still report compute and agents as down.
+ 
+ [Regression Potential]
+ 
+ None.
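
The race described under [Impact] can be illustrated with a toy, in-memory model. This is only a sketch under stated assumptions: ToyBroker, reconnect_race, and the exchange/queue names are hypothetical and are not part of RabbitMQ, kombu, amqp, or oslo.messaging, and the interleaving shown is just one possible ordering that ends in the same kind of "no exchange" error as the traceback above.

```python
class NotFound(Exception):
    """Stands in for the broker's 404 NOT_FOUND channel error."""


class ToyBroker:
    """In-memory stand-in for a broker (hypothetical, not a real client API)."""

    def __init__(self):
        # exchange name -> {"auto_delete": bool, "queues": set of bound queues}
        self.exchanges = {}

    def declare_exchange(self, name, auto_delete):
        # Declaring an exchange that already exists is a no-op.
        self.exchanges.setdefault(
            name, {"auto_delete": auto_delete, "queues": set()})

    def bind(self, queue, exchange):
        if exchange not in self.exchanges:
            raise NotFound(
                "Queue.bind: (404) NOT_FOUND - no exchange %r" % exchange)
        self.exchanges[exchange]["queues"].add(queue)

    def unbind(self, queue, exchange):
        entry = self.exchanges.get(exchange)
        if entry is None:
            return
        entry["queues"].discard(queue)
        # An auto-delete exchange disappears once its last binding is gone.
        if entry["auto_delete"] and not entry["queues"]:
            del self.exchanges[exchange]


def reconnect_race(auto_delete):
    """Replay one assumed interleaving of re-create and delete."""
    broker = ToyBroker()
    name = "reply_example"  # hypothetical exchange name

    # Before the failover: the exchange exists with the old queue bound.
    broker.declare_exchange(name, auto_delete)
    broker.bind("old_queue", name)

    # Reconnect step 1: the client re-declares the exchange (still exists).
    broker.declare_exchange(name, auto_delete)
    # Meanwhile, cleanup of the dead connection removes the old binding.
    broker.unbind("old_queue", name)
    # Reconnect step 2: the client binds its new queue.
    try:
        broker.bind("new_queue", name)
    except NotFound as exc:
        return str(exc)
    return "bound"


if __name__ == "__main__":
    print(reconnect_race(auto_delete=True))
    print(reconnect_race(auto_delete=False))
```

In this toy model, the auto-delete exchange vanishes between the re-declare and the bind, so the bind fails. The same sequence with a non-auto-delete exchange succeeds. The model does not claim to reproduce the actual timing inside the services or the change made by the patch.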

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1318721

Title:
  RPC timeout in all neutron agents

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1318721/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
