> On May 21, 2015, 8:51 p.m., Alan Conway wrote:
> > There is something more going on here. If the client is re-connecting then 
> > it should be re-connecting on an entirely new connection so sessions should 
> > not be able to clash, the new connection should have no session on it.
> > 
> > It sounds to me like somehow the client is re-using the original connection 
> > after you unblock it at the firewall, which is definitely not right - there 
> > could be all kinds of invalid state in that connection's sessions. If the 
> > client decides a connection is faulty it should definitively close it and 
> > forget it before re-connecting and re-establishing sessions. You need to 
> > track down how/why it is managing to use the old connection after it has 
> > failed.
> 
> Gordon Sim wrote:
>     'If the client is re-connecting then it should be re-connecting on an 
> entirely new connection so sessions should not be able to clash' - this is 
> true in the sense that it is a new connection as far as the broker is 
> concerned, however the client will keep using the qpid.message.Connection and 
> will reattach all the corresponding sessions. If the broker hasn't determined 
> that the old connection is now dead, it won't have closed the old sessions, 
> which could indeed result in a naming clash as the text above hypothesises.

Ah yes - because 0-10 session names are not scoped to the connection, I had 
forgotten that. So we either have to rename the session on the client or force 
the broker to allow the new session. 0-10 has a "force" flag on attach for 
exactly that but looking at the broker/SessionManager code I see "// FIXME 
aconway 2008-04-29: implement force " :( It probably wouldn't be hard to 
implement though.
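
To illustrate the clash and what "force" would buy us, here is a toy Python 
model. It is only a sketch, not the C++ SessionManager: ToySessionManager, 
SessionBusy, attach() and connection_closed() are made-up names, and the only 
things taken from 0-10 are that session names are not scoped to the connection 
and that attach carries a force flag.

```python
class SessionBusy(Exception):
    """Models the 0-10 session-busy error (hypothetical class name)."""


class ToySessionManager:
    """Toy broker-side registry: session names are global, not per connection."""

    def __init__(self):
        self._attached = {}  # session name -> id of the connection that owns it

    def attach(self, conn_id, name, force=False):
        """Attach `name` for `conn_id`; return the previous owner or None."""
        owner = self._attached.get(name)
        if owner is not None and not force:
            # The old connection has not been declared dead yet, so the
            # name is still taken even though the client reconnected.
            raise SessionBusy("Session already attached: %s" % name)
        # With force=True the stale attachment is simply replaced.
        self._attached[name] = conn_id
        return owner

    def connection_closed(self, conn_id):
        """Release every session name owned by a connection that went away."""
        stale = [n for n, c in self._attached.items() if c == conn_id]
        for name in stale:
            del self._attached[name]
```

In this model a client that reconnects before the broker has noticed the dead 
connection gets SessionBusy unless it attaches with force=True, or picks a new 
session name.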


- Alan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34560/#review84786
-----------------------------------------------------------


On May 22, 2015, 4:52 p.m., Ernie Allen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34560/
> -----------------------------------------------------------
> 
> (Updated May 22, 2015, 4:52 p.m.)
> 
> 
> Review request for qpid, Alan Conway and Kenneth Giusti.
> 
> 
> Repository: qpid
> 
> 
> Description
> -------
> 
> Calling receiver.fetch(timeout=10) in a loop while the network drops packets 
> for a while causes an uncaught KeyError exception in python-qpid-0.22. It 
> causes semi-infinite recursion in python-qpid-0.30.
> 
> The recursion problem was solved independently.
> 
> The attached patch does two things:
> 1) session.close() checks to see if the session is already closed. If so, it 
> just returns. This prevents an exception from being displayed when the 
> session is already closed.
> 2) In driver.py, if we get a do_session_detached() event, check to see if the 
> channel is in our list of sessions before using it. If it isn't, close the 
> session.
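
The two guards described above can be sketched in isolation as follows. This 
is a toy model and not the code in driver.py: ToySession, ToyEngine and 
ToyConnectionError are hypothetical names, only do_session_detached() is taken 
from the description, and the unknown-channel branch follows the behaviour 
described further down (close the engine and raise an error to the client).

```python
class ToyConnectionError(Exception):
    """Stand-in for the ConnectionError raised back to the client."""


class ToySession:
    def __init__(self, name):
        self.name = name
        self.closed = False
        self.close_calls = 0  # how many times real teardown work ran

    def close(self):
        if self.closed:
            return  # guard 1: already closed, nothing to do and nothing to raise
        self.close_calls += 1
        self.closed = True


class ToyEngine:
    def __init__(self):
        self.sessions = {}  # channel number -> ToySession
        self.open = True

    def attach(self, channel, session):
        self.sessions[channel] = session

    def do_session_detached(self, channel):
        # guard 2: look the channel up before using it
        session = self.sessions.pop(channel, None)
        if session is None:
            # Detach for a channel we do not know about: give up on this
            # connection rather than hitting a KeyError.
            self.open = False
            raise ToyConnectionError("detach for unknown channel %d" % channel)
        session.close()
```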
> 
> Here is my estimate of what is happening when the network drops:
> - The driver detects the socket error, closes the engine and goes into its 
> retry loop.
> - Once the network comes back, the engine is restarted and all the sessions 
> on the connection are re-attached.
> - However, the broker sees the attempt to attach using a channel that it 
> thinks is already attached.
> - The broker logs the following:
>     2015-05-21 14:51:35 [Broker] error Channel exception: session-busy: Session already attached: anonymous.5c6f079c-571e-46f8-8ce6-72997da200a3:0 (/home/eallen/workspace/32/rh-qpid/qpid/cpp/src/qpid/broker/SessionManager.cpp:55)
>     2015-05-21 14:51:35 [Broker] error Channel exception: not-attached: Channel 0 is not attached (/home/eallen/workspace/32/rh-qpid/qpid/cpp/src/qpid/amqp_0_10/SessionHandler.cpp:39)
> - This results in a do_session_detached() event in the engine.
> - However, since the engine was closed when the socket error occurred and 
> reopened when it cleared, it doesn't know about the old session.
> 
> If I only test whether the channel number being detached is associated with 
> a session, and just return when it is not, then the client hangs. So when I 
> see an event to detach an unknown session, I close the engine and raise a 
> ConnectionError back to the client.
> 
> Ideally the driver/engine would recover, but I don't see how we can get the 
> broker and client back into agreement.
> 
> 
> Diffs
> -----
> 
>   trunk/qpid/python/qpid/messaging/driver.py 1680941 
> 
> Diff: https://reviews.apache.org/r/34560/diff/
> 
> 
> Testing
> -------
> 
> 1. Run this script against a qpidd broker:
> #!/usr/bin/env python
> from qpid.messaging import *
> import datetime
> 
> conn = Connection("localhost:5672", reconnect=10)
> timeout=10
> 
> try:
>   conn.open()
>   sess = conn.session()
> 
>   recv = sess.receiver("testQueue;{create:always}")
>   
>   while (1):
>     print "%s: before fetch, timeout=%s" %(datetime.datetime.now(), timeout)
>     msg = Message()
>     try:
>       msg = recv.fetch(timeout=timeout)
>     except ReceiverError, e:
>       print e
>     except ConnectError, e:
>       print "ConnectError", str(e)
>       break
>     print "%s: after fetch, msg=%s" % (datetime.datetime.now(), msg)
> 
>   print "about to close session"
>   sess.close()
> 
> except ReceiverError, e:
>   print e
> except KeyboardInterrupt:
>   pass
> 
> print "about to close connection"
> conn.close()
> 
> 2. Simulate network outage:
> iptables -A OUTPUT -p tcp --dport 5672 -j REJECT; date
> 
> 3. Once python script writes "No handlers could be found for logger 
> "qpid.messaging"", flush iptables (iptables -F)
> 
> 4. Wait up to 10 seconds
> 
> The ConnectError is received by the client and the loop can be exited.
> 
> 
> Thanks,
> 
> Ernie Allen
> 
>
