[ 
https://issues.apache.org/jira/browse/QPID-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14269405#comment-14269405
 ] 

Gordon Sim commented on QPID-6297:
----------------------------------

Heartbeats would perhaps cause the connection to be aborted before whatever it 
is that causes this error goes wrong, but the cause of this error isn't clear. 
If TCP is doing what I think it should be, dropping packets should not corrupt 
the stream (either by skipping parts or by duplicating parts). What else might 
cause such an exception? It does seem like an unanticipated internal state 
inconsistency within the client (whether due to bad stream or some other input).
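
To illustrate the kind of inconsistency I mean: the traceback in the report ends in self._sessions.pop(dtc.channel), which can only fail if the driver no longer tracks the channel named in the incoming session-detached. Below is a minimal sketch of that situation. FakeDriver, attach, reset and the tolerant variant are hypothetical stand-ins invented for illustration, not the qpid.messaging code, and the idea that a reset empties the table is an assumption rather than a diagnosis.

```python
class FakeDriver(object):
    """Hypothetical stand-in for the part of a driver that tracks sessions."""

    def __init__(self):
        self._sessions = {}  # channel number -> session state

    def attach(self, channel, state):
        self._sessions[channel] = state

    def reset(self):
        # Assumption for illustration: some other code path (for example
        # reconnect handling) clears the table before the detach arrives.
        self._sessions.clear()

    def do_session_detached(self, channel):
        # Mirrors the failing line: raises KeyError for an unknown channel.
        return self._sessions.pop(channel)

    def do_session_detached_tolerant(self, channel):
        # Defensive variant: returns None instead of raising.
        return self._sessions.pop(channel, None)
```

The tolerant variant would only hide the symptom; the interesting question remains why the table and the incoming frames disagree.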

If it is easy to reproduce it would be interesting to get a wireshark trace 
and/or debug logs of the bytes the python client believes it is being sent.
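
A minimal sketch of how such debug logs might be captured with the standard logging module follows. The logger names are recalled from qpid/messaging/driver.py and should be treated as an assumption that may not hold for every version; enable_qpid_debug and the default file name are made up for this example.

```python
import logging

# Assumption: logger names as recalled from the python client's driver;
# verify them against the installed version.
PARENT_LOGGER = "qpid.messaging"
CHILD_LOGGERS = ("qpid.messaging.io.raw", "qpid.messaging.io.ops")


def enable_qpid_debug(path="qpid-debug.log"):
    """Send DEBUG output of the client loggers to a file (hypothetical helper)."""
    handler = logging.FileHandler(path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s"))
    parent = logging.getLogger(PARENT_LOGGER)
    parent.setLevel(logging.DEBUG)
    # Attach the handler to the parent only; child records propagate up,
    # so attaching it to the children as well would duplicate every line.
    parent.addHandler(handler)
    for name in CHILD_LOGGERS:
        logging.getLogger(name).setLevel(logging.DEBUG)
    return handler
```

Calling enable_qpid_debug() before opening the connection should be enough, and the resulting file can then be compared against the wireshark trace.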

Does the application use the same session from multiple threads?



> Python client (qpid.messaging) raises KeyError instead of reconnecting
> ----------------------------------------------------------------------
>
>                 Key: QPID-6297
>                 URL: https://issues.apache.org/jira/browse/QPID-6297
>             Project: Qpid
>          Issue Type: Bug
>          Components: Python Client
>    Affects Versions: 0.22
>         Environment: EL6
>            Reporter: Jeff Ortel
>
> Description of problem:
> When a temporary network outage causes gofer to lose its TCP connection to 
> the AMQP broker, it does not try to reconnect.
> How reproducible:
> 100%
> Steps to Reproduce:
> 1. To speed up the reproducer, lower the kernel tunable 
> net.ipv4.tcp_retries2 to e.g. 4:
> echo 4 > /proc/sys/net/ipv4/tcp_retries2
> 2. Have consumer connected (with auto-reconnect enabled and heartbeats not 
> enabled) and receiver open on a queue address and check its TCP connections 
> to AMQP broker:
> netstat -anp | grep 5671
> (there should be 2 TCP connections)
> 3. Emulate network outage via iptables:
> iptables -A OUTPUT -p tcp --dport 5671 -j REJECT
> 4. Monitor /var/log/messages; once it logs WARNING "recoverable error", flush 
> iptables (iptables -F).
> 5. Wait a few seconds.
> 6. Check gofer TCP connections:
> netstat -anp | grep 5671
> Actual results:
> 6. shows just 1 TCP connection
> /var/log/messages repeatedly logs:
> Dec  1 16:39:02 pmoravec-rhel6-3 goferd: 
> [ERROR][pulp.agent.a726580c-5f1e-4a79-9f11-de0adc52c1e9] 
> gofer.transport.qpid.consumer:117 - 046d2084-b0f1-4de4-a039-89499d9e680d
> Dec  1 16:39:02 pmoravec-rhel6-3 goferd: 
> [ERROR][pulp.agent.a726580c-5f1e-4a79-9f11-de0adc52c1e9] 
> gofer.transport.qpid.consumer:117 - Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/gofer/transport/qpid/consumer.py", line 113, in get
>     return self.__receiver.fetch(timeout=timeout)
>   File "<string>", line 6, in fetch
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 1030, in fetch
>     self._ecwait(lambda: self.linked)
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 50, in _ecwait
>     result = self._ewait(lambda: self.closed or predicate(), timeout)
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 993, in _ewait
>     result = self.session._ewait(lambda: self.error or predicate(), timeout)
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 580, in _ewait
>     result = self.connection._ewait(lambda: self.error or predicate(), timeout)
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 219, in _ewait
>     self.check_error()
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 212, in check_error
>     raise e
> InternalError: Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/driver.py", line 660, in write
>     op.dispatch(self)
>   File "/usr/lib/python2.6/site-packages/qpid/ops.py", line 84, in dispatch
>     getattr(target, handler)(self, *args)
>   File "/usr/lib/python2.6/site-packages/qpid/messaging/driver.py", line 877, in do_session_detached
>     sst = self._sessions.pop(dtc.channel)
> KeyError: 'pop(): dictionary is empty'
> Expected results:
> 2nd TCP connection re-established, no errors in /var/log/messages


