[ 
https://issues.apache.org/jira/browse/PROTON-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17445250#comment-17445250
 ] 

Ken Giusti commented on PROTON-2466:
------------------------------------

This is a difficult issue to reproduce.  In my experience it can take a few 
hours and the resulting log files are huge.

To reproduce:
 # check out head of the qdrouter 1.18.x branch
 # back out the pointer clear patch that prevents the crash from occurring:
 ## commit 6734891419fcafdbc87d40eca269d07821c1b813 DISPATCH-2286: reset the raw conn context when handling disconnect
 # run two routers using the configurations attached to this issue (qdrouterd-A.conf and qdrouterd-B.conf):
 ## rm -f qdrouterd-A-log.txt ; qdrouterd -c qdrouterd-A.conf & rm -f qdrouterd-B-log.txt ; qdrouterd -c qdrouterd-B.conf &
 # install iperf3
 # spawn an iperf3 server for the router to connect to:
 ## iperf3 -s -p 8080 &
 # run iperf3 clients to generate traffic in a loop:
 ## while iperf3 -c 127.0.0.1 -p 8000 -t 5 -P 8; do echo "OK"; sleep 2; done
 # wait for crash

> raw connection posts wake events after disconnect event is handled
> ------------------------------------------------------------------
>
>                 Key: PROTON-2466
>                 URL: https://issues.apache.org/jira/browse/PROTON-2466
>             Project: Qpid Proton
>          Issue Type: Bug
>          Components: proton-c
>    Affects Versions: proton-c-0.36.0
>            Reporter: Ken Giusti
>            Priority: Major
>         Attachments: qdrouterd-A.conf, qdrouterd-B.conf
>
>
> While running tcp stress tests against qdrouterd a crash occurred.  The crash 
> was due to a stale pointer dereference.
> qdrouterd code has been patched to properly clear the pointer and check for 
> null in the affected code path.  However...
> ... the access occurred while processing a PN_RAW_CONNECTION_WAKE event that 
> arrived on a raw connection *after* a PN_RAW_CONNECTION_DISCONNECTED event 
> previously arrived on the raw connection.
> IIUC the PN_RAW_CONNECTION_DISCONNECTED event is supposed to be the last 
> event generated on a raw connection, and once that event has been handled the 
> raw connection is released.   If that is correct then the arrival of the 
> following WAKE event is a bug.
> Here is the log output from the router just prior to the crash (filtered on 
> the affected connection):
> $ tail C140.txt
> 2021-11-16 17:11:10.925728 -0500 TCP_ADAPTOR (debug) [C140] PN_RAW_CONNECTION_WAKE connector
> 2021-11-16 17:11:10.926990 -0500 TCP_ADAPTOR (debug) [C140] PN_RAW_CONNECTION_WAKE connector
> 2021-11-16 17:11:10.927001 -0500 TCP_ADAPTOR (debug) [C140] PN_RAW_CONNECTION_READ connector Event
> 2021-11-16 17:11:10.927034 -0500 TCP_ADAPTOR (debug) [C140] PN_RAW_CONNECTION_READ Read 0 bytes. Total read 0 bytes
> 2021-11-16 17:11:10.927596 -0500 TCP_ADAPTOR (debug) [C140] PN_RAW_CONNECTION_WRITTEN connector pn_raw_connection_take_written_buffers wrote 32768 bytes. Total written 36929573 bytes
> 2021-11-16 17:11:10.928207 -0500 TCP_ADAPTOR (debug) [C140][L322] PN_RAW_CONNECTION_CLOSED_READ connector
> 2021-11-16 17:11:10.928591 -0500 TCP_ADAPTOR (debug) [C140] PN_RAW_CONNECTION_CLOSED_WRITE connector
> 2021-11-16 17:11:10.929160 -0500 TCP_ADAPTOR (debug) [C140] PN_RAW_CONNECTION_WRITTEN connector pn_raw_connection_take_written_buffers wrote 32768 bytes. Total written 36962341 bytes
> *2021-11-16 17:11:10.929410 -0500 TCP_ADAPTOR (info) [C140] PN_RAW_CONNECTION_DISCONNECTED connector*
> *2021-11-16 17:11:10.929915 -0500 TCP_ADAPTOR (debug) [C140] PN_RAW_CONNECTION_WAKE connector*
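
The workaround mentioned in the quoted description (clear the per-connection pointer when the disconnect is handled, then check for null before using it) can be sketched roughly as below. This is only a self-contained model: raw_conn_t, conn_state_t, handle_event and the EV_* constants are hypothetical stand-ins invented for illustration. They are not the Proton API and this is not the actual qdrouterd patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; NOT the Proton API. */
typedef enum { EV_WAKE, EV_READ, EV_WRITTEN, EV_DISCONNECTED } ev_type_t;

/* Application state attached to one raw connection (assumed shape). */
typedef struct {
    int wakes_handled;
} conn_state_t;

/* Model of a raw connection that carries an application context pointer. */
typedef struct {
    void *context;
} raw_conn_t;

/*
 * Handle one event for a connection.
 * Returns true if the event was processed, false if it was ignored
 * because the context had already been cleared by an earlier disconnect.
 */
bool handle_event(raw_conn_t *rc, ev_type_t type)
{
    assert(rc != NULL);

    conn_state_t *state = rc->context;
    if (state == NULL) {
        /* Late event (for example a WAKE) after DISCONNECTED: nothing to touch. */
        return false;
    }

    switch (type) {
    case EV_DISCONNECTED:
        /* Clear the pointer so later events cannot reach stale state.
         * The caller remains responsible for releasing the state itself. */
        rc->context = NULL;
        return true;
    case EV_WAKE:
        state->wakes_handled++;
        return true;
    default:
        /* Other events are accepted but not modelled here. */
        return true;
    }
}
```

With this guard, a WAKE delivered after DISCONNECTED makes handle_event return false instead of dereferencing released state. It only masks the symptom on the application side; it does not explain why the late WAKE is posted.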



