[
https://issues.apache.org/jira/browse/DISPATCH-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270783#comment-17270783
]
Charles E. Rolke commented on DISPATCH-1927:
--------------------------------------------
The test setup was modified with
* more logging. See
[https://github.com/ChugR/qpid-dispatch/tree/DISPATCH-1927]. The logs focus on
when items are added and deleted from a link's undelivered list and when
delivery ownership is handed off between links.
* Router running B.conf is modified to use log DEFAULT=trace+
* Test load reduced to just three processes opening and closing TCP adaptor
ports and never sending any traffic.
Within a few hundred connections the router fails with the log excerpts shown
here.
* Delivery 977 is the fail case and shows only abut 20 lines of log.
* Delivery 974 is a success case and shows twice as much log.
* The difference is when a loop in transfer.c titled "DISPATCH-1302 race hack
fix" exits
+ If the loop runs and exits before the results of calling deliver_handler
take effect then everything works
+ If the effects of deliver_handler call kick in before the loop exits then
the router crashes
The delivery cadence appears to be four deliveries per connection. 977 fails
but 973, 969, 965, and so on are successful.
{code:java}
This is the failure pattern:
chug@unused D-1777-tcp-echo> grep D977 8.log
2021-01-23 18:09:13.666524 -0500 ROUTER_CORE (debug) [C1][L1][D977] Delivery
created qdr_forward_new_delivery_CT
(/home/chug/git/qpid-dispatch/src/router_core/forwarder.c:159)
2021-01-23 18:09:13.666538 -0500 ROUTER_CORE (trace) [C2][L53][D976] :in
qdr_delivery_link_peers_CT out: [C1][L1][D977]
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:509)
2021-01-23 18:09:13.666549 -0500 ROUTER_CORE (debug) [C1][L1][D977] Delivery
incref: rc:1 qdr_delivery_link_peers_CT - linked to peer (out delivery)
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:113)
2021-01-23 18:09:13.666570 -0500 ROUTER_CORE (debug) [C1][L1][D977] DLV++
qdr_forward_deliver_CT delivery added to undelivered. link [L1]
(/home/chug/git/qpid-dispatch/src/router_core/forwarder.c:264)
2021-01-23 18:09:13.666581 -0500 ROUTER_CORE (debug) [C1][L1][D977] Delivery
incref: rc:2 qdr_forward_deliver_CT - add to undelivered list
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:113)
2021-01-23 18:09:13.666699 -0500 ROUTER_CORE (debug) [C1][L1][D977] Delivery
incref: rc:3 qdr_link_process_deliveries - holding the undelivered delivery
locally (/home/chug/git/qpid-dispatch/src/router_core/delivery.c:113)
2021-01-23 18:09:13.666713 -0500 ROUTER_CORE (debug) [C1][L1][D977]
process_deliveries loop settled=False, link [L1]
(/home/chug/git/qpid-dispatch/src/router_core/transfer.c:178)
2021-01-23 18:09:13.666726 -0500 TCP_ADAPTOR (debug) [C1][L1][D977]
qdr_tcp_deliver Delivery event
(/home/chug/git/qpid-dispatch/src/adaptors/tcp_adaptor.c:1070)
2021-01-23 18:09:13.666737 -0500 TCP_ADAPTOR (debug) [C1][L1][D977] tcp_adaptor
initiating egress connection
(/home/chug/git/qpid-dispatch/src/adaptors/tcp_adaptor.c:1072)
2021-01-23 18:09:13.666789 -0500 ROUTER_CORE (debug) [C1][L1][D977] Delivery
incref: rc:4 qdr_tcp_connection_egress - held initial delivery
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:113)
2021-01-23 18:09:13.667098 -0500 ROUTER_CORE (debug) [C1][L1][D977] Delivery
incref: rc:5 qdr_link_first_attach - protect delivery in action list
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:113)
2021-01-23 18:09:13.667109 -0500 TCP_ADAPTOR (debug) [C1][L1][D977]
initial_delivery ownership passed to [C301][L624][D977]
(/home/chug/git/qpid-dispatch/src/adaptors/tcp_adaptor.c:674)
2021-01-23 18:09:13.667121 -0500 ROUTER_CORE (debug) [C1][L1][D977] Delivery
decref: rc:4 tcp-adaptor - passing initial_delivery into new link
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:147)
2021-01-23 18:09:13.667268 -0500 ROUTER_CORE (debug) [C1][L1][D977]
process_deliveries loop exit: new_disp is TO_NEW_LINK
(/home/chug/git/qpid-dispatch/src/router_core/transfer.c:188)
2021-01-23 18:09:13.667287 -0500 ROUTER_CORE (debug) [C1][L1][D977] DLV
MOVED_TO_NEW_LINK from [C1][L1]
(/home/chug/git/qpid-dispatch/src/router_core/transfer.c:232)
2021-01-23 18:09:13.667303 -0500 ROUTER_CORE (debug) [C1][L1][D977] DLV--
qdr_link_process_initial_delivery_CT delivery removed from undelivered. link
[L624], old_link [L1]
(/home/chug/git/qpid-dispatch/src/router_core/connections.c:1626)
2021-01-23 18:09:13.667330 -0500 ROUTER_CORE (debug) [C1][L1][D977] DLV++
qdr_forward_deliver_CT delivery added to undelivered. link [L624]
(/home/chug/git/qpid-dispatch/src/router_core/forwarder.c:264)
This is the success pattern
chug@unused D-1777-tcp-echo> grep D973 8.log
2021-01-23 18:09:13.663836 -0500 ROUTER_CORE (debug) [C1][L1][D973] Delivery
created qdr_forward_new_delivery_CT
(/home/chug/git/qpid-dispatch/src/router_core/forwarder.c:159)
2021-01-23 18:09:13.663849 -0500 ROUTER_CORE (trace) [C2][L24][D972] :in
qdr_delivery_link_peers_CT out: [C1][L1][D973]
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:509)
2021-01-23 18:09:13.663860 -0500 ROUTER_CORE (debug) [C1][L1][D973] Delivery
incref: rc:1 qdr_delivery_link_peers_CT - linked to peer (out delivery)
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:113)
2021-01-23 18:09:13.663883 -0500 ROUTER_CORE (debug) [C1][L1][D973] DLV++
qdr_forward_deliver_CT delivery added to undelivered. link [L1]
(/home/chug/git/qpid-dispatch/src/router_core/forwarder.c:264)
2021-01-23 18:09:13.663895 -0500 ROUTER_CORE (debug) [C1][L1][D973] Delivery
incref: rc:2 qdr_forward_deliver_CT - add to undelivered list
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:113)
2021-01-23 18:09:13.664003 -0500 ROUTER_CORE (debug) [C1][L1][D973] Delivery
incref: rc:3 qdr_link_process_deliveries - holding the undelivered delivery
locally (/home/chug/git/qpid-dispatch/src/router_core/delivery.c:113)
2021-01-23 18:09:13.664017 -0500 ROUTER_CORE (debug) [C1][L1][D973]
process_deliveries loop settled=False, link [L1]
(/home/chug/git/qpid-dispatch/src/router_core/transfer.c:178)
2021-01-23 18:09:13.664030 -0500 TCP_ADAPTOR (debug) [C1][L1][D973]
qdr_tcp_deliver Delivery event
(/home/chug/git/qpid-dispatch/src/adaptors/tcp_adaptor.c:1070)
2021-01-23 18:09:13.664042 -0500 TCP_ADAPTOR (debug) [C1][L1][D973] tcp_adaptor
initiating egress connection
(/home/chug/git/qpid-dispatch/src/adaptors/tcp_adaptor.c:1072)
2021-01-23 18:09:13.664057 -0500 ROUTER_CORE (debug) [C1][L1][D973] Delivery
incref: rc:4 qdr_tcp_connection_egress - held initial delivery
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:113)
2021-01-23 18:09:13.664304 -0500 ROUTER_CORE (debug) [C1][L1][D973]
process_deliveries loop exit: new_disp is TO_NEW_LINK
(/home/chug/git/qpid-dispatch/src/router_core/transfer.c:188)
2021-01-23 18:09:13.664326 -0500 ROUTER_CORE (debug) [C1][L1][D973] DLV
MOVED_TO_NEW_LINK from [C1][L1]
(/home/chug/git/qpid-dispatch/src/router_core/transfer.c:232)
2021-01-23 18:09:13.664340 -0500 ROUTER_CORE (debug) [C1][L1][D973] Delivery
decref: rc:3 qdr_link_process_deliveries - moved from undelivered list to
some other link (/home/chug/git/qpid-dispatch/src/router_core/delivery.c:147)
2021-01-23 18:09:13.664353 -0500 ROUTER_CORE (debug) [C1][L1][D973] Delivery
decref: rc:2 qdr_link_process_deliveries - release local reference - done
processing (/home/chug/git/qpid-dispatch/src/router_core/delivery.c:147)
2021-01-23 18:09:13.664534 -0500 ROUTER_CORE (debug) [C1][L1][D973] Delivery
incref: rc:3 qdr_link_first_attach - protect delivery in action list
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:113)
2021-01-23 18:09:13.664547 -0500 TCP_ADAPTOR (debug) [C1][L1][D973]
initial_delivery ownership passed to [C300][L622][D973]
(/home/chug/git/qpid-dispatch/src/adaptors/tcp_adaptor.c:674)
2021-01-23 18:09:13.664558 -0500 ROUTER_CORE (debug) [C1][L1][D973] Delivery
decref: rc:2 tcp-adaptor - passing initial_delivery into new link
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:147)
2021-01-23 18:09:13.664676 -0500 ROUTER_CORE (debug) [C1][L1][D973] DLV++
qdr_forward_deliver_CT delivery added to undelivered. link [L622]
(/home/chug/git/qpid-dispatch/src/router_core/forwarder.c:264)
2021-01-23 18:09:13.664688 -0500 ROUTER_CORE (debug) [C1][L1][D973] Delivery
incref: rc:3 qdr_forward_deliver_CT - add to undelivered list
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:113)
2021-01-23 18:09:13.664720 -0500 ROUTER_CORE (debug) [C300][L622][D973]
Delivery decref: rc:2 qdr_link_process_initial_delivery_CT
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:147)
2021-01-23 18:09:13.664959 -0500 ROUTER_CORE (debug) [C300][L622][D973]
Delivery incref: rc:3 qdr_link_process_deliveries - holding the undelivered
delivery locally (/home/chug/git/qpid-dispatch/src/router_core/delivery.c:113)
2021-01-23 18:09:13.664977 -0500 ROUTER_CORE (debug) [C300][L622][D973]
process_deliveries loop settled=False, link [L622]
(/home/chug/git/qpid-dispatch/src/router_core/transfer.c:178)
2021-01-23 18:09:13.664995 -0500 TCP_ADAPTOR (debug) [C300][L622][D973]
qdr_tcp_deliver Delivery event
(/home/chug/git/qpid-dispatch/src/adaptors/tcp_adaptor.c:1070)
2021-01-23 18:09:13.665015 -0500 ROUTER_CORE (debug) [C300][L622][D973]
Delivery incref: rc:4 tcp_adaptor - new outstream
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:113)
2021-01-23 18:09:13.665221 -0500 ROUTER_CORE (debug) [C300][L622][D973] DLV--
qdr_link_process_deliveries delivery removed from undelivered. link [L622]
(/home/chug/git/qpid-dispatch/src/router_core/transfer.c:216)
2021-01-23 18:09:13.665235 -0500 ROUTER_CORE (debug) [C300][L622][D973] DLV--
Delivery transfer: qdr_link_process_deliveries: undelivered-list ->
unsettled-list. link [L622]
(/home/chug/git/qpid-dispatch/src/router_core/transfer.c:227)
2021-01-23 18:09:13.665249 -0500 ROUTER_CORE (debug) [C300][L622][D973]
Delivery decref: rc:3 qdr_link_process_deliveries - release local reference
- done processing (/home/chug/git/qpid-dispatch/src/router_core/delivery.c:147)
2021-01-23 18:09:13.665806 -0500 ROUTER_CORE (debug) [C300][L622][D973]
Delivery decref: rc:2 tcp-adaptor.handle_disconnected - outstream
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:147)
2021-01-23 18:09:13.666258 -0500 ROUTER_CORE (trace) [C300][L622][D973] :in
qdr_delivery_unlink_peers_CT out: [C2][L24][D972]
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:544)
2021-01-23 18:09:13.666283 -0500 ROUTER_CORE (debug) [C300][L622][D973]
Delivery decref_CT: rc:1 qdr_delivery_unlink_peers_CT - unlinked from peer
(delivery) (/home/chug/git/qpid-dispatch/src/router_core/delivery.c:642)
2021-01-23 18:09:13.666314 -0500 ROUTER_CORE (debug) [C300][L622][D973]
Delivery outcome : is unknown (0x0)
(/home/chug/git/qpid-dispatch/src/router_core/delivery.c:393)
2021-01-23 18:09:13.666329 -0500 ROUTER_CORE (debug) [C300][L622][D973]
Delivery decref_CT: rc:0 qdr_link_cleanup_deliveries_CT - remove from unsettled
list (/home/chug/git/qpid-dispatch/src/router_core/delivery.c:642)
{code}
> TCP adaptor Assertion `(link->undelivered).head' failed.
> --------------------------------------------------------
>
> Key: DISPATCH-1927
> URL: https://issues.apache.org/jira/browse/DISPATCH-1927
> Project: Qpid Dispatch
> Issue Type: Bug
> Components: Protocol Adaptors, Router Node
> Environment: Fedora 31
> Reporter: Charles E. Rolke
> Priority: Major
>
> Test setup:
> * Proton git: branch master @ 5e7d7af8f
> * Dispatch git: branch master @ ec39e5e
> * Debug build
> * Running the configuration described in DISPATCH-1777
> Seven clients run TCP_echo_client continuously against the server:
> * Four run --size 0
> * One runs --size 100
> * One runs --size 100000
> * One runs --size 1000000
> Eventually the router running the tcpConnector fails with the assertion.
> Tail of the router log:
> {code:java}
> Two router setup with a TCP listener on one router and a TCP connector on the
> other router.
> This is the log from the second tcpConnector router
> Here comes a message from core to TCP needing a new connection to egress
> server
> 2021-01-22 09:37:57.979054 -0500 TCP_ADAPTOR (debug) [C1] on_activate
> 2021-01-22 09:37:57.979594 -0500 TCP_ADAPTOR (debug) [C1][L1] qdr_tcp_push
> 2021-01-22 09:37:57.979619 -0500 TCP_ADAPTOR (debug) [C1][L1][D46793]
> qdr_tcp_deliver Delivery event
> 2021-01-22 09:37:57.979641 -0500 TCP_ADAPTOR (debug) [C1][L1][D46793]
> tcp_adaptor initiating egress connection
> 2021-01-22 09:37:57.979671 -0500 TCP_ADAPTOR (info) [C11768] Connecting to:
> 127.0.0.1:9090
> 2021-01-22 09:37:57.979789 -0500 TCP_ADAPTOR (debug) [C11767]
> qdr_tcp_activate: waking raw connection
> 2021-01-22 09:37:57.979974 -0500 TCP_ADAPTOR (info) [C11768]
> PN_RAW_CONNECTION_CONNECTED Egress connected to 127.0.0.1:9090
> 2021-01-22 09:37:57.980003 -0500 TCP_ADAPTOR (info) [C11768] Opening
> server-side core connection 127.0.0.1:9090
> 2021-01-22 09:37:57.980057 -0500 ROUTER_CORE (info) [C11768] Connection
> Opened: dir=out host=127.0.0.1:9090 vhost= encrypted=no auth=no user=
> container_id=TcpAdaptor props=
> Delivery is handed off to new connection/link
> 2021-01-22 09:37:57.980153 -0500 TCP_ADAPTOR (debug) [C1][L1][D46793]
> initial_delivery ownership passed to [C11768][L23564][D46793]
> 2021-01-22 09:37:57.980191 -0500 TCP_ADAPTOR (debug) [C11768]
> PN_RAW_CONNECTION_NEED_WRITE_BUFFERS
> 2021-01-22 09:37:57.980215 -0500 TCP_ADAPTOR (debug) [C11768]
> PN_RAW_CONNECTION_NEED_READ_BUFFERS
> 2021-01-22 09:37:57.980243 -0500 TCP_ADAPTOR (debug) [C11768][L23564] Waiting
> for credit to initiate message
> POOF! link_process_delivery has no deliveries yet.
> This is happening too soon. Probably.
> qdrouterd: /home/chug/git/qpid-dispatch/src/router_core/transfer.c:231:
> qdr_link_process_deliveries: Assertion `(link->undelivered).head' failed.
> from gdb: delivery_id = 46793, link_id = 23564, conn_id = 11768 or:
> [C11768][L23564][D46793] for failed process
> Moments later the rest of the connection/link setup happens
> 2021-01-22 09:37:57.980312 -0500 ROUTER_CORE (info) [C11768][L23564] Link
> attached: dir=out source={foo expire:link} target={<none> expire:link}
> 2021-01-22 09:37:57.980332 -0500 TCP_ADAPTOR (debug) [C11768]
> qdr_tcp_activate: waking raw connection
> 2021-01-22 09:37:57.980356 -0500 TCP_ADAPTOR (debug) [C11767]
> qdr_tcp_activate: waking raw connection
> 2021-01-22 09:37:57.980386 -0500 TCP_ADAPTOR (debug) [C11768]
> PN_RAW_CONNECTION_WAKE
> 2021-01-22 09:37:57.980404 -0500 TCP_ADAPTOR (debug) [C11768][L23564]
> qdr_tcp_second_attach
> 2021-01-22 09:37:57.980430 -0500 TCP_ADAPTOR (debug) [C11768][L23564]
> qdr_tcp_get_credit: NOOP
> 2021-01-22 09:37:57.980453 -0500 TCP_ADAPTOR (debug) [C11768][L23564]
> qdr_tcp_push
> 2021-01-22 09:37:57.980471 -0500 TCP_ADAPTOR (debug) [C11768][L23564][D46793]
> qdr_tcp_deliver Delivery event
> 2021-01-22 09:37:57.980563 -0500 TCP_ADAPTOR (debug) [C11768][L23565] Create
> Link to amqp:/_topo/0/router-b/temp.tsMXbvjL9M01_l8
> 2021-01-22 09:37:57.980591 -0500 TCP_ADAPTOR (debug) [C11768][L23564] Waiting
> for credit to initiate message
> 2021-01-22 09:37:57.980621 -0500 TCP_ADAPTOR (debug) [C11768] Writing 2040
> bytes
> 2021-01-22 09:37:57.980640 -0500 TCP_ADAPTOR (debug) [C11768] Writing 2048
> bytes
> 2021-01-22 09:37:57.980657 -0500 TCP_ADAPTOR (debug) [C11768] Writing 2048
> bytes
> 2021-01-22 09:37:57.980674 -0500 TCP_ADAPTOR (debug) [C11768] Writing 1104
> bytes
> 2021-01-22 09:37:57.980691 -0500 TCP_ADAPTOR (debug) [C11768][L23564]
> qdr_tcp_get_credit: NOOP
> Aborted (core dumped){code}
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]