[jira] [Commented] (QPID-6297) Python client (qpid.messaging) raises KeyError instead of reconnecting
[ https://issues.apache.org/jira/browse/QPID-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580823#comment-14580823 ]

ASF subversion and git services commented on QPID-6297:
-------------------------------------------------------

Commit 1684716 from [~eallen] in branch 'qpid/trunk'
[ https://svn.apache.org/r1684716 ]

QPID-6297: Python client should reconnect after network glitch

Python client (qpid.messaging) raises KeyError instead of reconnecting
----------------------------------------------------------------------

Key: QPID-6297
URL: https://issues.apache.org/jira/browse/QPID-6297
Project: Qpid
Issue Type: Bug
Components: Python Client
Affects Versions: 0.22
Environment: EL6
Reporter: Jeff Ortel
Attachments: goferBug.cap

Description of problem:
When a temporary network outage causes gofer to lose its TCP connection to the AMQP broker, it does not try to reconnect.

How reproducible:
100%

Steps to Reproduce:
1. Just to speed up the reproducer, lower the kernel tunable net.ipv4.tcp_retries2 to e.g. 4:
   echo 4 > /proc/sys/net/ipv4/tcp_retries2
2. Have a consumer connected (with auto-reconnect enabled and heartbeats not enabled) and a receiver open on a queue address, and check its TCP connections to the AMQP broker:
   netstat -anp | grep 5671
   (there should be 2 TCP connections)
3. Emulate a network outage via iptables:
   iptables -A OUTPUT -p tcp --dport 5671 -j REJECT
4. Monitor /var/log/messages; once it logs WARNING recoverable error, flush iptables (iptables -F).
5. Wait a few seconds.
6. Check gofer TCP connections:
   netstat -anp | grep 5671

Actual results:
6. shows just 1 TCP connection.
/var/log/messages repeatedly logs:

Dec 1 16:39:02 pmoravec-rhel6-3 goferd: [ERROR][pulp.agent.a726580c-5f1e-4a79-9f11-de0adc52c1e9] gofer.transport.qpid.consumer:117 - 046d2084-b0f1-4de4-a039-89499d9e680d
Dec 1 16:39:02 pmoravec-rhel6-3 goferd: [ERROR][pulp.agent.a726580c-5f1e-4a79-9f11-de0adc52c1e9] gofer.transport.qpid.consumer:117 - Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/gofer/transport/qpid/consumer.py", line 113, in get
    return self.__receiver.fetch(timeout=timeout)
  File "<string>", line 6, in fetch
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 1030, in fetch
    self._ecwait(lambda: self.linked)
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 50, in _ecwait
    result = self._ewait(lambda: self.closed or predicate(), timeout)
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 993, in _ewait
    result = self.session._ewait(lambda: self.error or predicate(), timeout)
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 580, in _ewait
    result = self.connection._ewait(lambda: self.error or predicate(), timeout)
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 219, in _ewait
    self.check_error()
  File "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 212, in check_error
    raise e
InternalError: Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/qpid/messaging/driver.py", line 660, in write
    op.dispatch(self)
  File "/usr/lib/python2.6/site-packages/qpid/ops.py", line 84, in dispatch
    getattr(target, handler)(self, *args)
  File "/usr/lib/python2.6/site-packages/qpid/messaging/driver.py", line 877, in do_session_detached
    sst = self._sessions.pop(dtc.channel)
KeyError: 'pop(): dictionary is empty'

Expected results:
2nd TCP connection re-established, no errors in /var/log/messages

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org
[jira] [Commented] (QPID-6297) Python client (qpid.messaging) raises KeyError instead of reconnecting
[ https://issues.apache.org/jira/browse/QPID-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14272940#comment-14272940 ]

Pavel Moravec commented on QPID-6297:
-------------------------------------

Backtrace in human-readable form:

  File "/usr/lib/python2.7/site-packages/gofer/transport/qpid/consumer.py", line 116, in get
    return self.__receiver.fetch(timeout=timeout)
  File "<string>", line 6, in fetch
  File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 1041, in fetch
    self._ecwait(lambda: not self.draining)
  File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 50, in _ecwait
    result = self._ewait(lambda: self.closed or predicate(), timeout)
  File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 993, in _ewait
    result = self.session._ewait(lambda: self.error or predicate(), timeout)
  File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 580, in _ewait
    result = self.connection._ewait(lambda: self.error or predicate(), timeout)
  File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 219, in _ewait
    self.check_error()
  File "/usr/lib/python2.7/site-packages/qpid/messaging/endpoints.py", line 212, in check_error
    raise e
InternalError: Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/qpid/messaging/driver.py", line 663, in write
    op.dispatch(self)
  File "/usr/lib/python2.7/site-packages/qpid/ops.py", line 84, in dispatch
    getattr(target, handler)(self, *args)
  File "/usr/lib/python2.7/site-packages/qpid/messaging/driver.py", line 888, in do_session_detached
    sst = self._sessions.pop(dtc.channel)
KeyError: 0

Potential cause:
1) The client calls receiver.fetch with a high timeout (here 10 seconds); no message is available, so the library waits for the broker or for the timeout.
2) The library detects the connection drop, so it detaches the session (with traceback:
[('/usr/lib64/python2.7/threading.py', 784, '__bootstrap', 'self.__bootstrap_inner()'),
 ('/usr/lib64/python2.7/threading.py', 811, '__bootstrap_inner', 'self.run()'),
 ('/usr/lib64/python2.7/threading.py', 764, 'run', 'self.__target(*self.__args, **self.__kwargs)'),
 ('/usr/lib/python2.7/site-packages/qpid/selector.py', 141, 'run', 'sel.readable()'),
 ('<string>', 6, 'readable', None),
 ('/usr/lib/python2.7/site-packages/qpid/messaging/driver.py', 422, 'readable', 'self.engine.write(data)'),
 ('/usr/lib/python2.7/site-packages/qpid/messaging/driver.py', 664, 'write', 'op.dispatch(self)'),
 ('/usr/lib/python2.7/site-packages/qpid/ops.py', 84, 'dispatch', 'getattr(target, handler)(self, *args)'),
 ('/usr/lib/python2.7/site-packages/qpid/messaging/driver.py', 886, 'do_session_detached', 'sss = "removing dtc.channel=" + str(dtc.channel) + "\\n" + str(traceback.extract_stack()) + "\\n"')]
).
3) On the fetch timeout, an internal exception is raised about the session being detached, so the connection driver is asked to remove the session (although it has already been removed).

This should have a trivial reproducer (I hope), something like:

  qpid-receive.py --timeout=10 -a "testQueue; {create:always}" -m10

and blocking iptables after a while (the receiver should cycle). (I will test it later on.)
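The double removal described in the potential cause above can be illustrated with a toy session table. This is only a sketch: SessionTable, detach_strict and detach_tolerant are hypothetical names, not the qpid.messaging driver, and the tolerant variant is one possible way to make a late detach harmless, not necessarily what the upstream commit does.

```python
# Illustrative only: a toy session table, NOT the qpid.messaging driver.


class SessionTable:
    """Maps channel numbers to session state, as a driver might."""

    def __init__(self):
        self._sessions = {}

    def attach(self, channel, state):
        self._sessions[channel] = state

    def detach_strict(self, channel):
        # Same pattern as the failing line in the backtrace: pop() without
        # a default raises KeyError if the channel was already removed.
        return self._sessions.pop(channel)

    def detach_tolerant(self, channel):
        # Assumption: a second detach for the same channel is harmless,
        # so a missing entry yields None instead of raising.
        return self._sessions.pop(channel, None)


def demo():
    table = SessionTable()
    table.attach(0, "session-state")
    first = table.detach_strict(0)       # connection drop removes channel 0
    try:
        table.detach_strict(0)           # late detach for the same channel
        second = "no error"
    except KeyError as exc:
        second = "KeyError: %s" % exc
    third = table.detach_tolerant(0)     # tolerant variant just returns None
    return first, second, third


if __name__ == "__main__":
    print(demo())
```

The second strict detach fails with the same KeyError: 0 as in the backtrace, because channel 0 is no longer in the dictionary.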
[jira] [Commented] (QPID-6297) Python client (qpid.messaging) raises KeyError instead of reconnecting
[ https://issues.apache.org/jira/browse/QPID-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273060#comment-14273060 ]

Pavel Moravec commented on QPID-6297:
-------------------------------------

Trivial reproducer:

0) Decrease TCP retries:

  echo 2 > /proc/sys/net/ipv4/tcp_retries2

1) Run this script, which calls receiver.fetch(timeout=10) in a loop:

  #!/usr/bin/env python
  # Python 2 script (qpid.messaging as shipped with qpid 0.22)
  from qpid.messaging import *
  import datetime

  conn = Connection("localhost:5672", reconnect=1)
  timeout = 10
  try:
      conn.open()
      sess = conn.session()
      recv = sess.receiver("testQueue;{create:always}")
      # fetch in an endless loop; stop with Ctrl-C
      while (1):
          print "%s: before fetch, timeout=%s" % (datetime.datetime.now(), timeout)
          msg = Message()
          try:
              msg = recv.fetch(timeout=timeout)
          except ReceiverError, e:
              print e
          print "%s: after fetch, msg=%s" % (datetime.datetime.now(), msg)
      sess.close()
  except ReceiverError, e:
      print e
  except KeyboardInterrupt:
      pass
  conn.close()

2) Simulate a network outage:

  iptables -A OUTPUT -p tcp --dport 5672 -j REJECT; date

3) Once the script logs: No handlers could be found for logger "qpid.messaging", flush iptables.

4) Wait a few seconds for the backtrace.
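To see why the loop in the reproducer survives fetch timeouts but still dies with a backtrace, here is a small Python 3 sketch that runs without a broker. The exception classes and StubReceiver are stand-ins defined locally; that a timeout surfaces as a ReceiverError subclass while InternalError does not derive from ReceiverError is an assumption about the library, made only for this illustration.

```python
# Stand-in classes for illustration; nothing here is imported from qpid.messaging.


class ReceiverError(Exception):
    pass


class Empty(ReceiverError):
    """Assumed to be what a fetch timeout raises."""


class InternalError(Exception):
    """Assumed NOT to derive from ReceiverError."""


class StubReceiver:
    """Hypothetical receiver that replays a scripted list of outcomes."""

    def __init__(self, outcomes):
        self._outcomes = list(outcomes)

    def fetch(self, timeout=None):
        outcome = self._outcomes.pop(0)
        if isinstance(outcome, Exception):
            raise outcome
        return outcome


def run_loop(receiver, iterations):
    """Same shape as the reproducer loop: only ReceiverError is handled."""
    log = []
    for _ in range(iterations):
        try:
            msg = receiver.fetch(timeout=10)
            log.append("got %s" % msg)
        except ReceiverError as exc:
            log.append("handled %s" % type(exc).__name__)
    return log


if __name__ == "__main__":
    print(run_loop(StubReceiver(["m1", Empty(), "m2"]), 3))
```

With an InternalError in the scripted outcomes, run_loop propagates it to the caller, which matches the reproducer ending in a backtrace rather than a printed error.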
[jira] [Commented] (QPID-6297) Python client (qpid.messaging) raises KeyError instead of reconnecting
[ https://issues.apache.org/jira/browse/QPID-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14273068#comment-14273068 ]

Pavel Moravec commented on QPID-6297:
-------------------------------------

That was a test on qpid 0.22. Upstream qpid does not work much better:

Traceback (most recent call last):
  File "/root/Python_MRG/test_gofer_like.py", line 18, in <module>
    msg = recv.fetch(timeout=timeout)
  File "<string>", line 6, in fetch
  File "/data_xfs/qpid/cpp/BLD/src/tests/python/qpid/messaging/endpoints.py", line 1067, in fetch
    self._ecwait(lambda: not self.draining)
  File "/data_xfs/qpid/cpp/BLD/src/tests/python/qpid/messaging/endpoints.py", line 50, in _ecwait
    result = self._ewait(lambda: self.closed or predicate(), timeout)
  File "/data_xfs/qpid/cpp/BLD/src/tests/python/qpid/messaging/endpoints.py", line 1019, in _ewait
    result = self.session._ewait(lambda: self.error or predicate(), timeout)
  File "/data_xfs/qpid/cpp/BLD/src/tests/python/qpid/messaging/endpoints.py", line 595, in _ewait
    result = self.connection._ewait(lambda: self.error or predicate(), timeout)
  File "/data_xfs/qpid/cpp/BLD/src/tests/python/qpid/messaging/endpoints.py", line 234, in _ewait
    self.check_error()
  File "/data_xfs/qpid/cpp/BLD/src/tests/python/qpid/messaging/endpoints.py", line 226, in check_error
    self.close()
  File "<string>", line 6, in close
  File "/data_xfs/qpid/cpp/BLD/src/tests/python/qpid/messaging/endpoints.py", line 345, in close
    ssn.close(timeout=timeout)
  File "<string>", line 6, in close
  File "/data_xfs/qpid/cpp/BLD/src/tests/python/qpid/messaging/endpoints.py", line 777, in close
    self.sync(timeout=timeout)
  File "<string>", line 6, in sync
  File "/data_xfs/qpid/cpp/BLD/src/tests/python/qpid/messaging/endpoints.py", line 768, in sync
    if not self._ewait(lambda: not self.outgoing and not self.acked, timeout=timeout):
  File "/data_xfs/qpid/cpp/BLD/src/tests/python/qpid/messaging/endpoints.py", line 595, in _ewait
    result = self.connection._ewait(lambda: self.error or predicate(), timeout)
  File "/data_xfs/qpid/cpp/BLD/src/tests/python/qpid/messaging/endpoints.py", line 234, in _ewait
    self.check_error()
  File "/data_xfs/qpid/cpp/BLD/src/tests/python/qpid/messaging/endpoints.py", line 226, in check_error
    self.close()
  File "<string>", line 6, in close
  File "/data_xfs/qpid/cpp/BLD/src/tests/python/qpid/messaging/endpoints.py", line 345, in close
    ssn.close(timeout=timeout)
  File "<string>", line 6, in close
  File "/data_xfs/qpid/cpp/BLD/src/tests/python/qpid/messaging/endpoints.py", line 777, in close
    self.sync(timeout=timeout)
..
RuntimeError: maximum recursion depth exceeded
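The cycle visible in the upstream traceback (check_error calls close, close waits, and the wait calls check_error again) can be modelled in a few lines. This is a toy with hypothetical classes NaiveEndpoint and GuardedEndpoint, collapsing the real call chain into three methods; the re-entrancy flag is only one possible way to break such a cycle and is not claimed to be the upstream fix.

```python
# Toy model of the recursion; NOT the qpid.messaging endpoints code.


class NaiveEndpoint:
    """check_error() closes, and close() waits, which checks errors again."""

    def __init__(self):
        self.error = RuntimeError("connection lost")

    def check_error(self):
        if self.error:
            self.close()
            raise self.error

    def close(self):
        self._ewait()

    def _ewait(self):
        self.check_error()


class GuardedEndpoint(NaiveEndpoint):
    """Same cycle, but close() refuses to re-enter itself (an assumption)."""

    def __init__(self):
        super().__init__()
        self._closing = False

    def close(self):
        if self._closing:
            return  # already closing: do not wait again
        self._closing = True
        try:
            self._ewait()
        finally:
            self._closing = False


def outcome(endpoint):
    """Report which exception check_error() ends with."""
    try:
        endpoint.check_error()
    except RecursionError:
        return "recursion"
    except RuntimeError as exc:
        return str(exc)
    return "no error"


if __name__ == "__main__":
    print(outcome(NaiveEndpoint()), "/", outcome(GuardedEndpoint()))
```

On Python 3 the naive variant ends in RecursionError (Python 2 reports it as RuntimeError: maximum recursion depth exceeded, as in the traceback above), while the guarded variant surfaces the original connection error.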
[jira] [Commented] (QPID-6297) Python client (qpid.messaging) raises KeyError instead of reconnecting
[ https://issues.apache.org/jira/browse/QPID-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14269405#comment-14269405 ]

Gordon Sim commented on QPID-6297:
----------------------------------

Heartbeats would perhaps cause the connection to be aborted before whatever it is that causes this error goes wrong, but the cause of this error isn't clear. If TCP is doing what I think it should be, dropping packets should not corrupt the stream (either by skipping parts or by duplicating parts).

What else might cause such an exception? It does seem like an unanticipated internal state inconsistency within the client (whether due to a bad stream or some other input). If it is easy to reproduce, it would be interesting to get a wireshark trace and/or debug logs of the bytes the python client believes it is being sent.

Does the application use the same session from multiple threads?
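For the debug logs asked about above, a minimal sketch using only the standard logging module is shown below. The logger name "qpid.messaging" is taken from the "No handlers could be found for logger" message reported elsewhere in this thread; enable_debug_logging is a hypothetical helper, and how much detail the client actually emits at DEBUG level is not verified here.

```python
import io
import logging


def enable_debug_logging(stream, logger_name="qpid.messaging"):
    """Attach a DEBUG-level stream handler to the named logger.

    The default logger name comes from the warning quoted in this thread;
    the helper itself is hypothetical.
    """
    logger = logging.getLogger(logger_name)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    return logger, handler


if __name__ == "__main__":
    buf = io.StringIO()
    log, _ = enable_debug_logging(buf)
    log.debug("example record")
    print(buf.getvalue(), end="")
```

Calling this before opening the connection would also silence the "No handlers could be found" warning, since the logger then has a handler.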
[jira] [Commented] (QPID-6297) Python client (qpid.messaging) raises KeyError instead of reconnecting
[ https://issues.apache.org/jira/browse/QPID-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14269433#comment-14269433 ]

Jeff Ortel commented on QPID-6297:
----------------------------------

No, gofer is very careful to not share any messaging library objects between threads.
[jira] [Commented] (QPID-6297) Python client (qpid.messaging) raises KeyError instead of reconnecting
[ https://issues.apache.org/jira/browse/QPID-6297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268935#comment-14268935 ]

Pavel Moravec commented on QPID-6297:
-------------------------------------

Why must the problem be in the python library and not in goferd? Because the client raises an unhandled exception?

Isn't using heartbeats a workaround?
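On the heartbeat question, the general idea is that an idle link is declared dead after a bounded time instead of waiting for TCP retransmissions to give up. The sketch below shows that idea only: heartbeat_expired and first_abort_time are hypothetical helpers, and the rule of aborting after two missed intervals is an assumption for illustration, not a statement about what the qpid client implements.

```python
def heartbeat_expired(last_traffic, now, interval, missed_allowed=2):
    """Return True once no traffic has been seen for more than
    `missed_allowed` heartbeat intervals (all values in seconds).

    The factor of two is an assumption made for this sketch.
    """
    return (now - last_traffic) > interval * missed_allowed


def first_abort_time(last_traffic, interval, step=1, missed_allowed=2, limit=3600):
    """Poll every `step` seconds and report when the link would be aborted.

    Returns None if nothing expires within `limit` seconds.
    """
    now = last_traffic
    while now - last_traffic <= limit:
        if heartbeat_expired(last_traffic, now, interval, missed_allowed):
            return now
        now += step
    return None


if __name__ == "__main__":
    # Last traffic at t=0 with a 10 second heartbeat interval.
    print(first_abort_time(0, 10))
```

Under these assumptions, the client would give up on the dead connection after a time set by the heartbeat interval and could start reconnecting, which is why heartbeats are a plausible workaround even if they do not address the KeyError itself.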