Thanks anyway, Xav, but that's not the problem I'm seeing here. This is on a fresh connection, with no stateful firewall timeouts involved. If you do a send/receive on a topic that was used in a previous run, the second run always times out waiting for the reply. Odd. I'm digging into kombu and rabbit now.
--
Noel

On Sun, Jul 6, 2014 at 1:03 AM, Xav Paice <[email protected]> wrote:

> Just in case you haven't seen https://bugs.launchpad.net/nova/+bug/856764,
> this behaviour sounds very much like it could be related. It's worth
> reading the history since there's workarounds, and further explanation of
> the nature of the problem.
>
> On 06/07/14 12:02, Noel Burton-Krahn wrote:
>
> Icehouse
> oslo-messaging 1.3.0
> rabbitmq-server 3.1.3
>
> We've noticed that nova rpc calls fail often after rabbit restarts.
> I've tracked it down to oslo/rabbit/kombu timing out if it's forced to
> reconnect to rabbit. The code below times out waiting for a reply if the
> topic has been used in a previous run. The reply always arrives the first
> time a topic is used, or if the topic is none. But, the second run with
> the same topic will hang with this error:
>
> MessagingTimeout: Timed out waiting for a reply to message ID ...
>
> This problem seems too basic to not be caught earlier in oslo, but the
> program below does really reproduce the same symptoms we see in nova when
> run against a live rabbit server. What's wrong with this picture?
>
> Cheers
> --
> Noel
>
> #! /usr/bin/python
>
> from oslo.config import cfg
> import threading
> from oslo import messaging
> import logging
> import time
> log = logging.getLogger(__name__)
>
> class OsloTest():
>     def test(self):
>         # The code below times out waiting for a reply if the topic
>         # has been used in a previous run. The reply always arrives
>         # the first time a topic is used, or if the topic is none.
>         # But, the second run with the same topic will hang with this
>         # error:
>         #
>         # MessagingTimeout: Timed out waiting for a reply to message ID ...
>         #
>         topic = 'will_hang_on_second_usage'
>         #topic = None # never hangs
>
>         url = "%(proto)s://%(user)s:%(password)s@%(host)s/" % dict(
>             proto = 'rabbit',
>             host = '1.2.3.4',
>             password = 'xxxxxxxx',
>             user = 'rabbit-mq-user',
>             )
>         transport = messaging.get_transport(cfg.CONF, url)
>         driver = transport._driver
>
>         target = messaging.Target(topic=topic)
>         listener = driver.listen(target)
>         ctxt={"context": True}
>         timeout = 10
>
>         def send_main():
>             log.debug("sending msg")
>             reply = driver.send(target,
>                                 ctxt,
>                                 {'send': 1},
>                                 wait_for_reply=True,
>                                 timeout=timeout)
>
>             # times out if topic was not None and used before
>             log.debug("received reply=%r" % (reply,))
>
>         send_thread = threading.Thread(target=send_main)
>         send_thread.daemon = True
>         send_thread.start()
>
>         msg = listener.poll()
>         log.debug("received msg=%r" % (msg,))
>
>         msg.reply({'reply': 1})
>
>         log.debug("sent reply")
>
>         send_thread.join()
>
> if __name__ == '__main__':
>     FORMAT = '%(asctime)-15s %(process)5d %(thread)5d %(filename)s %(funcName)s %(message)s'
>     logging.basicConfig(level=logging.DEBUG, format=FORMAT)
>     OsloTest().test()
>
> _______________________________________________
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to     : [email protected]
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
