[ https://issues.apache.org/jira/browse/DISPATCH-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640371#comment-16640371 ]
Chuck Rolke commented on DISPATCH-1110:
---------------------------------------

Thanks Robbie. That looks very similar.

I have observed that the QIT send test sends eight messages and then closes the connection. I used PN_TRACE_EVT on both the QIT Sender and the router and see the same pattern: the Sender client closes the connection immediately after sending and after only 2, 3, or 4 accept dispositions have been received. A typical Sender trace (with some extra application printf commentary) is appended.

In the QIT test I don't see how to make the sender wait around for all the messages to be accepted before closing the connection.

> PN_TRACE_EVT=1
> /opt/local/libexec/qpid_interop_test/shims/qpid-proton-cpp/amqp_large_content_test/Sender
> [::1]:5672 qit.amqp_large_content_test.map.ProtonCpp.ProtonCpp map '[[1, [1,
> 16, 256, 4096]], [10, [1, 16, 256, 4096]]]'
[0x22f3560]:(PN_CONNECTION_INIT, pn_connection<0x22ee070>)
[0x22f3560]:(PN_SESSION_INIT, pn_session<0x22ef600>)
[0x22f3560]:(PN_LINK_INIT, pn_link<0x22f0a70>)
[0x22f3560]:(PN_CONNECTION_BOUND, pn_connection<0x22ee070>)
[0x22f3560]:(PN_CONNECTION_REMOTE_OPEN, pn_connection<0x22ee070>)
[0x22f3560]:(PN_TRANSPORT, pn_transport<0x22f3560>)
[0x22f3560]:(PN_SESSION_REMOTE_OPEN, pn_session<0x22ef600>)
[0x22f3560]:(PN_LINK_REMOTE_OPEN, pn_link<0x22f0a70>)
[0x22f3560]:(PN_LINK_FLOW, pn_link<0x22f0a70>)
on_sendable: sent 8 messages
[0x22f3560]:(PN_TRANSPORT, pn_transport<0x22f3560>)
[0x22f3560]:(PN_LINK_FLOW, pn_link<0x22f0a70>)
on_sendable: doing nothing. Already sent 8 messages
[0x22f3560]:(PN_LINK_FLOW, pn_link<0x22f0a70>)
on_sendable: doing nothing. Already sent 8 messages
[0x22f3560]:(PN_DELIVERY, pn_delivery<0x2304440>{sending, tag=b"\x01\x00\x00\x00\x00\x00\x00\x00", local=unknown, remote=accepted})
on_tracker_accept: msgsConfirmed 1
[0x22f3560]:(PN_TRANSPORT, pn_transport<0x22f3560>)
[0x22f3560]:(PN_LINK_FLOW, pn_link<0x22f0a70>)
on_sendable: doing nothing. Already sent 8 messages
[0x22f3560]:(PN_DELIVERY, pn_delivery<0x2325490>{sending, tag=b"\x02\x00\x00\x00\x00\x00\x00\x00", local=unknown, remote=accepted})
on_tracker_accept: msgsConfirmed 2
[0x22f3560]:(PN_CONNECTION_LOCAL_CLOSE, pn_connection<0x22ee070>) <-- QIT Sender closes connection
[0x22f3560]:(PN_TRANSPORT, pn_transport<0x22f3560>)
[0x22f3560]:(PN_LINK_FLOW, pn_link<0x22f0a70>)
on_sendable: doing nothing. Already sent 8 messages
[0x22f3560]:(PN_DELIVERY, pn_delivery<0x2301e70>{sending, tag=b"\x03\x00\x00\x00\x00\x00\x00\x00", local=unknown, remote=accepted})
on_tracker_accept: msgsConfirmed 3
[0x22f3560]:(PN_TRANSPORT, pn_transport<0x22f3560>)
[0x22f3560]:(PN_TRANSPORT_HEAD_CLOSED, pn_transport<0x22f3560>)
[0x22f3560]:(PN_LINK_FLOW, pn_link<0x22f0a70>)
on_sendable: doing nothing. Already sent 8 messages
[0x22f3560]:(PN_DELIVERY, pn_delivery<0x252aa60>{sending, tag=b"\x04\x00\x00\x00\x00\x00\x00\x00", local=unknown, remote=accepted})
on_tracker_accept: msgsConfirmed 4
[0x22f3560]:(PN_TRANSPORT, pn_transport<0x22f3560>)
[0x22f3560]:(PN_CONNECTION_REMOTE_CLOSE, pn_connection<0x22ee070>)
[0x22f3560]:(PN_TRANSPORT_TAIL_CLOSED, pn_transport<0x22f3560>)
[0x22f3560]:(PN_TRANSPORT_CLOSED, pn_transport<0x22f3560>)
on_transport_close sent: 8 , confirmed: 4
on_container_stop: msgsConfirmed 4
amqp_large_content_test Sender container.run() exited

> Intermittent router hang while running QIT's AMQP large content test
> --------------------------------------------------------------------
>
> Key: DISPATCH-1110
> URL: https://issues.apache.org/jira/browse/DISPATCH-1110
> Project: Qpid Dispatch
> Issue Type: Bug
> Environment: Standard QIT environment.
> Once QIT is built and installed, the environment is set using the config.sh
> file. See QUICKSTART for details.
> Reporter: Kim van der Riet
> Assignee: Ganesh Murthy
> Priority: Major
> Attachments: qdrouterd.conf
>
>
> When running the Qpid Interop Test's AMQP large content test, a stand-alone
> router will intermittently hang and cause the test to time out.
> The failure appears to be limited to either the AMQP list or map types, and
> usually with the C++ client as the message sender. The C++, Python2 and
> Python3 as receiver clients have all seen this failure, but the Python2
> receiver client seems to reproduce more readily on my hardware.
> In all cases, the test fails when the router sends what I suppose is the
> final transfer of a large message (I have not added up/counted the bytes of
> the many preceding transfers) to the consumer. The consumer then sends a
> disposition, but the router does not respond again until the test times out.
> The consumer can be seen to send heartbeats to the router, but the router
> does not send any of its own.
> {noformat}
> ... (plenty of 65550-sized frames R->C)
> R->C 5976 3.454766 ::1 ::1 AMQP 65550
> R->C 5977 3.454775 ::1 ::1 AMQP 65550
> R->C 5978 3.454783 ::1 ::1 AMQP 48171
> C->R 5982 3.529881 ::1 ::1 AMQP 115 disposition
> C->R 5984 7.530704 ::1 ::1 AMQP 94 (empty)
> C->R 5986 11.532306 ::1 ::1 AMQP 94 (empty)
> ...{noformat}
> There are no errors to be seen in the router logs other than when the
> consuming client is killed owing to the test timeout.
> {noformat}
> ...
> 2018-08-29 12:50:23.191754 -0400 SERVER (info) [14]: Accepted connection to
> ::1:amqp from ::1:37262
> 2018-08-29 12:51:19.562695 -0400 SERVER (info) [14]: Connection from
> ::1:37262 (to ::1:amqp) failed: amqp:connection:framing-error connection
> aborted
> {noformat}
> The reproducer is not very tight on this, and the error occurs about 50% of
> the time on my hardware.
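
On the question in the comment of how to make the sender wait for every accept before closing: one common approach is to count accepted dispositions and close only once that count matches the number of messages sent. The following is a minimal, Proton-free sketch of that bookkeeping, not the QIT shim's code. `ConfirmationGate` and `would_close_after` are hypothetical names introduced here; the callback names in the comments follow the printf commentary in the Sender trace.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bookkeeping for a sender that must not close early.
// This models only the counting logic; it does not use Qpid Proton.
class ConfirmationGate {
public:
    explicit ConfirmationGate(std::size_t total) : total_(total) {}

    // Would be called from on_sendable for each message handed to the link.
    void on_sent() {
        assert(sent_ < total_);
        ++sent_;
    }

    // Would be called from on_tracker_accept. Returns true only when the
    // last outstanding accept has arrived, i.e. when closing is safe.
    bool on_accept() {
        assert(confirmed_ < sent_);
        ++confirmed_;
        return ready_to_close();
    }

    bool ready_to_close() const {
        return sent_ == total_ && confirmed_ == total_;
    }

    std::size_t sent() const { return sent_; }
    std::size_t confirmed() const { return confirmed_; }

private:
    std::size_t total_;
    std::size_t sent_ = 0;
    std::size_t confirmed_ = 0;
};

// Replays a run of `total` sends followed by `accepts` accept dispositions.
// Returns true if the gate would have allowed the connection to close.
bool would_close_after(std::size_t total, std::size_t accepts) {
    ConfirmationGate gate(total);
    for (std::size_t i = 0; i < total; ++i) {
        gate.on_sent();
    }
    bool close = false;
    for (std::size_t i = 0; i < accepts; ++i) {
        close = gate.on_accept();
    }
    return close;
}
```

With the numbers from the trace, `would_close_after(8, 4)` is false and `would_close_after(8, 8)` is true, so a sender gated this way would still be connected at the point where the traced Sender issued PN_CONNECTION_LOCAL_CLOSE. In a Proton C++ handler the check would presumably sit in `on_tracker_accept`, closing the connection only when the gate reports ready; whether that fits the QIT shim's structure is not something the trace settles.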