----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/34560/#review87025 -----------------------------------------------------------
What's there will work for the case you are trying to fix, where the original connection has failed. There is a missing piece however. The spec says of "force": "If set then a busy session will be forcibly detached from its other transport and reattached to the current transport. A session MUST NOT be attached to more than one transport at a time." So there willl be a problem if the original session has NOT failed and another connection forces one of its sessions. In that case we will have 2 connections with the same named session. The SessionManager is not designed to handle this, it does not have any way to find the connection owns the original session to detach it. I'll have a think and send you a sketch of a fix. - Alan Conway On June 5, 2015, 2:46 p.m., Ernie Allen wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/34560/ > ----------------------------------------------------------- > > (Updated June 5, 2015, 2:46 p.m.) > > > Review request for qpid, Alan Conway and Kenneth Giusti. > > > Repository: qpid > > > Description > ------- > > Calling receiver.fetch(timeout=10) in a loop, when network drops packages for > a while causes uncaught exception KeyError in python-qpid-0.22. It causes on > semi-infinite recursion on python-qpid-0.30. > > The recursion problem was solved independently. > > The attached patch does two things: > 1) session.close() checks to see if the session is already closed. If so, it > just returns. This prevents an exception from being displayed when the > session is already closed. > 2) In driver.py, if we get a do_session_detached() event, check to see if the > channel is in our list of sessions before using it. If it isn't, close the > session. > > Here is my estimation on what is happening when the network drops: > - The driver detects the socket error, closes the engine and goes into its > retry loop. > - Once the network comes back, the engine is restarted and all the sessions > on the connection are re-attached. > - However, the broker sees the attempt to attach using a channel that it > thinks is already attached. > - The broker logs the following: 2015-05-21 14:51:35 [Broker] error Channel > exception: session-busy: Session already attached: > anonymous.5c6f079c-571e-46f8-8ce6-72997da200a3:0 > (/home/eallen/workspace/32/rh-qpid/qpid/cpp/src/qpid/broker/SessionManager.cpp:55) > 2015-05-21 14:51:35 [Broker] error Channel exception: not-attached: Channel 0 > is not attached > (/home/eallen/workspace/32/rh-qpid/qpid/cpp/src/qpid/amqp_0_10/SessionHandler.cpp:39) > - This results in a do_session_detached() event in the engine. > - However, since the engine was closed when the socket error occurred and > reopened when it cleared, it doesn't know about the old session. > > If I test to see if the channel number being detached is associated with a > session, and just return, then the client is hung. So.. when I see an event > to detach an unknown session, I'm closing the engine and raising a > ConnectionError back to the client. > > Ideally the driver/engine would recover, but I don't see how we can get the > broker and client back into agreement. > > > Diffs > ----- > > trunk/qpid/cpp/src/qpid/broker/SessionManager.cpp 1680941 > trunk/qpid/python/qpid/messaging/driver.py 1680941 > > Diff: https://reviews.apache.org/r/34560/diff/ > > > Testing > ------- > > 1. Run this script against a qpidd broker: > #!/usr/bin/env python > from qpid.messaging import * > import datetime > > conn = Connection("localhost:5672", reconnect=10) > timeout=10 > > try: > conn.open() > sess = conn.session() > > recv = sess.receiver("testQueue;{create:always}") > > while (1): > print "%s: before fetch, timeout=%s" %(datetime.datetime.now(), timeout) > msg = Message() > try: > msg = recv.fetch(timeout=timeout) > except ReceiverError, e: > print e > except ConnectError, e: > print "ConnectError", str(e) > break > print "%s: after fetch, msg=%s" (datetime.datetime.now(), msg) > > print "about to close session" > sess.close() > > except ReceiverError, e: > print e > except KeyboardInterrupt: > pass > > print "about to close connection" > conn.close() > > 2. Simulate network outage: > iptables -A OUTPUT -p tcp --dport 5672 -j REJECT; date > > 3. Once python script writes "No handlers could be found for logger > "qpid.messaging"", flush iptables (iptables -F) > > 4. Wait up to 10 seconds > > The ConnectError is received by the client and the loop can be exited. > > > Thanks, > > Ernie Allen > >
