-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34560/#review84948
-----------------------------------------------------------

trunk/qpid/python/qpid/messaging/driver.py
<https://reviews.apache.org/r/34560/#comment136405>

    The whole point of the reconnect is to allow the application to keep
    using the connection and associated sessions and links in the face of
    a network failure.


- Gordon Sim


On May 22, 2015, 4:52 p.m., Ernie Allen wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34560/
> -----------------------------------------------------------
>
> (Updated May 22, 2015, 4:52 p.m.)
>
>
> Review request for qpid, Alan Conway and Kenneth Giusti.
>
>
> Repository: qpid
>
>
> Description
> -------
>
> Calling receiver.fetch(timeout=10) in a loop while the network drops packets
> for a while causes an uncaught KeyError exception in python-qpid-0.22. It
> causes semi-infinite recursion in python-qpid-0.30.
>
> The recursion problem was solved independently.
>
> The attached patch does two things:
> 1) session.close() checks to see if the session is already closed. If so, it
> just returns. This prevents an exception from being displayed when the
> session is already closed.
> 2) In driver.py, if we get a do_session_detached() event, check to see if the
> channel is in our list of sessions before using it. If it isn't, close the
> session.
>
> Here is my understanding of what happens when the network drops:
> - The driver detects the socket error, closes the engine and goes into its
> retry loop.
> - Once the network comes back, the engine is restarted and all the sessions
> on the connection are re-attached.
> - However, the broker sees the attempt to attach using a channel that it
> thinks is already attached.
> - The broker logs the following:
> 2015-05-21 14:51:35 [Broker] error Channel exception: session-busy: Session already attached: anonymous.5c6f079c-571e-46f8-8ce6-72997da200a3:0 (/home/eallen/workspace/32/rh-qpid/qpid/cpp/src/qpid/broker/SessionManager.cpp:55)
> 2015-05-21 14:51:35 [Broker] error Channel exception: not-attached: Channel 0 is not attached (/home/eallen/workspace/32/rh-qpid/qpid/cpp/src/qpid/amqp_0_10/SessionHandler.cpp:39)
> - This results in a do_session_detached() event in the engine.
> - However, since the engine was closed when the socket error occurred and
> reopened when it cleared, it doesn't know about the old session.
>
> If I check whether the channel number being detached is associated with a
> session and simply return when it is not, the client hangs. So when I see an
> event to detach an unknown session, I'm closing the engine and raising a
> ConnectionError back to the client.
>
> Ideally the driver/engine would recover, but I don't see how we can get the
> broker and client back into agreement.
>
>
> Diffs
> -----
>
> trunk/qpid/python/qpid/messaging/driver.py 1680941
>
> Diff: https://reviews.apache.org/r/34560/diff/
>
>
> Testing
> -------
>
> 1. Run this script against a qpidd broker:
> #!/usr/bin/env python
> from qpid.messaging import *
> import datetime
>
> conn = Connection("localhost:5672", reconnect=10)
> timeout=10
>
> try:
>   conn.open()
>   sess = conn.session()
>
>   recv = sess.receiver("testQueue;{create:always}")
>
>   # Fetch in a loop until the connection is reported as lost.
>   while (1):
>     print "%s: before fetch, timeout=%s" %(datetime.datetime.now(), timeout)
>     msg = Message()
>     try:
>       msg = recv.fetch(timeout=timeout)
>     except ReceiverError, e:
>       print e
>     except ConnectError, e:
>       print "ConnectError", str(e)
>       break
>     print "%s: after fetch, msg=%s" % (datetime.datetime.now(), msg)
>
>   print "about to close session"
>   sess.close()
>
> except ReceiverError, e:
>   print e
> except KeyboardInterrupt:
>   pass
>
> print "about to close connection"
> conn.close()
>
> 2. Simulate network outage:
> iptables -A OUTPUT -p tcp --dport 5672 -j REJECT; date
>
> 3. Once the python script writes "No handlers could be found for logger
> "qpid.messaging"", flush iptables (iptables -F)
>
> 4. Wait up to 10 seconds
>
> The ConnectError is received by the client and the loop can be exited.
>
>
> Thanks,
>
> Ernie Allen
>
>
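
For readers following the thread, the two guards described in the quoted patch summary can be sketched roughly as below. This is only an illustration under stated assumptions: it does not use the qpid library, and `Session`, `Driver`, `attach` and `ConnectionLost` are hypothetical stand-ins rather than the real classes in driver.py. It shows the behaviour the review request describes (ignore a repeated close, and raise to the client when a detach arrives for an unknown channel), not a recommendation to adopt it.

```python
class ConnectionLost(Exception):
    """Hypothetical stand-in for the ConnectionError raised to the client."""


class Session:
    """Hypothetical stand-in for a messaging session (not qpid's class)."""

    def __init__(self, name):
        self.name = name
        self.closed = False
        self.close_calls = 0  # counts how often the real close work ran

    def close(self):
        # Guard 1: closing an already-closed session is a no-op.
        if self.closed:
            return
        self.close_calls += 1
        self.closed = True


class Driver:
    """Hypothetical stand-in for the driver's channel bookkeeping."""

    def __init__(self):
        self.sessions = {}  # channel number -> Session
        self.engine_open = True

    def attach(self, channel, session):
        self.sessions[channel] = session

    def do_session_detached(self, channel):
        # Guard 2: look the channel up instead of indexing blindly;
        # blind indexing is what would raise KeyError for an unknown channel.
        session = self.sessions.pop(channel, None)
        if session is None:
            # Unknown channel: close the engine and report to the client.
            self.engine_open = False
            raise ConnectionLost("detach for unknown channel %d" % channel)
        session.close()
```

With this sketch, a second detach for the same channel raises `ConnectionLost` instead of `KeyError`, and calling `close()` twice on a session runs the close work only once.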
