[ https://issues.apache.org/jira/browse/DISPATCH-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642461#comment-16642461 ]
Chuck Rolke commented on DISPATCH-1110:
---------------------------------------

The title for this issue, "Intermittent router hang while running QIT's AMQP large content test", is a little misleading. The router never hangs. The hang is in the test Receiver which is waiting for 8 messages and only 7 messages arrive before the test times out.

As designed, the test Sender sends 8 unacknowledged messages into the router network and then disconnects the AMQP connection after 2, 3, 4, or 5 messages are confirmed through the on_tracker_accept callback. What is the expectation for the remainder of the unconfirmed messages?

I added instrumentation to the QIT Sender and Receiver to get timestamps and more internal information about test progress. In particular I added an 'on_tracker_release' callback. When this callback is invoked it means that for some reason the router could not send a message to its destination.

During one of the Receiver hang events the Sender is being notified of a message being released:

{{stderr=
1539028405.939654 on_sendable: sent 8 messages
1539028405.969454 on_sendable: doing nothing. Already sent 8 messages
1539028405.982049 on_sendable: doing nothing. Already sent 8 messages
1539028405.982106 on_tracker_release: msgsConfirmed 0
amqp_large_content_test::Sender::on_connection_error: amqp:session:invalid-field: sequencing error, expected delivery-id 5, got 4
1539028405.982304 on_container_stop: msgsConfirmed 0
amqp_large_content_test: Sender error: on_connection_error
}}

When the Sender is notified of a tracker release then the Receiver is guaranteed not to receive all the messages.

> Intermittent router hang while running QIT's AMQP large content test
> --------------------------------------------------------------------
>
> Key: DISPATCH-1110
> URL: https://issues.apache.org/jira/browse/DISPATCH-1110
> Project: Qpid Dispatch
> Issue Type: Bug
> Environment: Standard QIT environment.
> Once QIT is built and installed, the environment is set using the config.sh
> file. See QUICKSTART for details.
> Reporter: Kim van der Riet
> Assignee: Ganesh Murthy
> Priority: Major
> Attachments: qdrouterd.conf
>
>
> When running the Qpid Interop Test's AMQP large content test, a stand-alone
> router will intermittently hang and cause the test to time out.
> The failure appears to be limited to either the AMQP list or map types, and
> usually with the C++ client as the message sender. The C++, Python2 and
> Python3 as receiver clients have all seen this failure, but the Python2
> receiver client seems to reproduce more readily on my hardware.
> In all cases, the test fails when the router sends what I suppose is the
> final transfer of a large message (I have not added up/counted the bytes of
> the many preceding transfers) to the consumer. The consumer then sends a
> disposition, but the router does not respond again until the test times out.
> The consumer can be seen to send heartbeats to the router, but the router
> does not send any of its own.
> {noformat}
> ... (plenty of 65550-sized frames R->C)
> R->C 5976 3.454766 ::1 ::1 AMQP 65550
> R->C 5977 3.454775 ::1 ::1 AMQP 65550
> R->C 5978 3.454783 ::1 ::1 AMQP 48171
> C->R 5982 3.529881 ::1 ::1 AMQP 115 disposition
> C->R 5984 7.530704 ::1 ::1 AMQP 94 (empty)
> C->R 5986 11.532306 ::1 ::1 AMQP 94 (empty)
> ...{noformat}
> There are no errors to be seen in the router logs other than when the
> consuming client is killed owing to the test timeout.
> {noformat}
> ...
> 2018-08-29 12:50:23.191754 -0400 SERVER (info) [14]: Accepted connection to
> ::1:amqp from ::1:37262
> 2018-08-29 12:51:19.562695 -0400 SERVER (info) [14]: Connection from
> ::1:37262 (to ::1:amqp) failed: amqp:connection:framing-error connection
> aborted
> {noformat}
> The reproducer is not very tight on this, and the error occurs about 50% of
> the time on my hardware.
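
For triaging further runs, the Sender stderr excerpt in the comment above can be checked mechanically. The following is only a minimal sketch, assuming the instrumented Sender keeps the "<timestamp> <callback>: <detail>" line format shown in that excerpt. The helper name `summarize`, the two regular expressions, and the result keys are hypothetical and are not part of QIT or Qpid Proton. It relies on the comment's own rule that any tracker release means the Receiver cannot get every message.

```python
import re

# Sender stderr excerpt copied from the comment above.
SENDER_STDERR = """\
1539028405.939654 on_sendable: sent 8 messages
1539028405.969454 on_sendable: doing nothing. Already sent 8 messages
1539028405.982049 on_sendable: doing nothing. Already sent 8 messages
1539028405.982106 on_tracker_release: msgsConfirmed 0
amqp_large_content_test::Sender::on_connection_error: amqp:session:invalid-field: sequencing error, expected delivery-id 5, got 4
1539028405.982304 on_container_stop: msgsConfirmed 0
amqp_large_content_test: Sender error: on_connection_error
"""

# Assumed line format: "<epoch seconds> <callback name>: <detail>".
EVENT_RE = re.compile(r"^(\d+\.\d+) (\w+): (.*)$")
# Matches the session error text seen in the excerpt.
SEQ_ERR_RE = re.compile(r"expected delivery-id (\d+), got (\d+)")


def summarize(stderr_text):
    """Hypothetical helper: count callbacks and extract the sequencing error."""
    callbacks = {}
    first_ts = last_ts = None
    sequencing_error = None
    for line in stderr_text.splitlines():
        event = EVENT_RE.match(line)
        if event:
            ts = float(event.group(1))
            if first_ts is None:
                first_ts = ts
            last_ts = ts
            name = event.group(2)
            callbacks[name] = callbacks.get(name, 0) + 1
        seq = SEQ_ERR_RE.search(line)
        if seq:
            # (expected delivery-id, delivery-id actually seen)
            sequencing_error = (int(seq.group(1)), int(seq.group(2)))
    released = callbacks.get("on_tracker_release", 0)
    return {
        "callbacks": callbacks,
        "released": released,
        "sequencing_error": sequencing_error,
        "elapsed": None if first_ts is None else last_ts - first_ts,
        # Per the comment: a release means the Receiver cannot get all messages.
        "receiver_can_complete": released == 0,
    }


if __name__ == "__main__":
    for key, value in summarize(SENDER_STDERR).items():
        print(key, "=", value)
```

Running the same helper over the stderr of a passing run and a timed-out run would show whether release notifications line up with the Receiver timeouts, which is the correlation suggested in the comment.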