[ 
https://issues.apache.org/jira/browse/DISPATCH-963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16439506#comment-16439506
 ] 

Chuck Rolke commented on DISPATCH-963:
--------------------------------------

Using today's proton master (commit a80d54e6), the same test crashes in a
different way:
{quote}{{Core was generated by `qdrouterd -c B.conf -I 
/home/chug/git/qpid-dispatch/python'.}}
{{Program terminated with signal SIGSEGV, Segmentation fault.}}
{{[Current thread is 1 (Thread 0x7f0f57d11700 (LWP 20186))]}}
{{(gdb) p buf}}
{{$1 = (pn_buffer_t *) 0x7f0f00000003}}
{{(gdb) p *buf}}
{{Cannot access memory at address 0x7f0f00000003}}
{{(gdb) bt}}
{{#0  0x00007f0f661d069e in pn_buffer_clear (buf=0x7f0f00000003) at 
/home/chug/git/qpid-proton/c/src/core/buffer.c:257}}
{{#1  0x00007f0f661d1e35 in pn_data_clear (data=0x7f0f38123ef0) at 
/home/chug/git/qpid-proton/c/src/core/codec.c:414}}
{{#2  0x00007f0f661d659e in pn_data_copy (data=0x7f0f38123ef0, 
src=0x7f0f40172860) at /home/chug/git/qpid-proton/c/src/core/codec.c:1987}}
{{#3  0x00007f0f664671f4 in qdr_terminus_copy (from=0x7f0f40093ca0, 
to=0x7f0f38117fd8) at 
/home/chug/git/qpid-dispatch/src/router_core/terminus.c:118}}
{{#4  0x00007f0f6646e17c in CORE_link_second_attach (context=0x26089d0, 
link=0x7f0f380f0a20, source=0x7f0f40093ca0, target=0x7f0f40093ba0) at 
/home/chug/git/qpid-dispatch/src/router_node.c:1177}}
{{#5  0x00007f0f66452cb3 in qdr_connection_process (conn=0x2789a60) at 
/home/chug/git/qpid-dispatch/src/router_core/connections.c:231}}
{{#6  0x00007f0f6646b781 in AMQP_writable_conn_handler (type_context=0x26089d0, 
conn=0x276da60, context=0x0) at 
/home/chug/git/qpid-dispatch/src/router_node.c:167}}
{{#7  0x00007f0f66431b6b in writable_handler (container=0x24810e0, 
conn=0x2763040, qd_conn=0x276da60) at 
/home/chug/git/qpid-dispatch/src/container.c:326}}
{{#8  0x00007f0f66432577 in qd_container_handle_event (container=0x24810e0, 
event=0x7f0f38145710) at /home/chug/git/qpid-dispatch/src/container.c:548}}
{{#9  0x00007f0f664730a2 in handle (qd_server=0x25eb820, e=0x7f0f38145710) at 
/home/chug/git/qpid-dispatch/src/server.c:940}}
{{#10 0x00007f0f66473125 in thread_run (arg=0x25eb820) at 
/home/chug/git/qpid-dispatch/src/server.c:958}}
{{#11 0x00007f0f65d9950b in start_thread (arg=0x7f0f57d11700) at 
pthread_create.c:465}}
{{#12 0x00007f0f6505d16f in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95}}{quote}
This seems to be very repeatable. It fails between tests 21 and 22:
{quote}{{30: test_21_linkroute_mesh_nonlocal 
(system_tests_distribution.DistributionTests) ... ok}}
{{30/46 Test #30: system_tests_distribution .........................***Timeout 
600.12 sec}}{quote}
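
The garbage {{buf}} pointer above and the valgrind reports in the issue description quoted below both look like an object being touched after its last reference was dropped. As a minimal sketch of that general pattern only (not Proton or Dispatch code, and not a diagnosis of the actual bug), the C below uses hypothetical names ({{endpoint_t}}, {{handle_table_t}}, {{demo_shutdown}}) to show a holder that takes its own reference when it stores a pointer, so that a later unbind at shutdown releases a live object exactly once:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object; illustrative only, not a Proton type. */
typedef struct endpoint {
    int  refcount;
    int *finalized;   /* counter incremented when the object is freed */
} endpoint_t;

static endpoint_t *endpoint_new(int *finalized) {
    endpoint_t *ep = calloc(1, sizeof(*ep));
    if (!ep) return NULL;
    ep->refcount  = 1;          /* the creator owns the first reference */
    ep->finalized = finalized;
    return ep;
}

static endpoint_t *endpoint_incref(endpoint_t *ep) {
    ep->refcount++;
    return ep;
}

static void endpoint_decref(endpoint_t *ep) {
    assert(ep->refcount > 0);   /* catches a double release in debug builds */
    if (--ep->refcount == 0) {
        ++*ep->finalized;
        free(ep);
    }
}

/* Hypothetical table that keeps a pointer to the endpoint until shutdown. */
typedef struct {
    endpoint_t *slot;
} handle_table_t;

static void table_bind(handle_table_t *t, endpoint_t *ep) {
    /* Take our own reference; storing the raw pointer without it would
     * leave the slot dangling once every other holder lets go. */
    t->slot = endpoint_incref(ep);
}

static void table_unbind(handle_table_t *t) {
    if (t->slot) {
        endpoint_decref(t->slot);
        t->slot = NULL;         /* never release the same slot twice */
    }
}

/* Returns how many times the endpoint was finalized, or -1 on failure. */
int demo_shutdown(void) {
    int finalized = 0;
    handle_table_t table = { NULL };

    endpoint_t *ep = endpoint_new(&finalized);
    if (!ep) return -1;

    table_bind(&table, ep);     /* refcount is now 2 */
    endpoint_decref(ep);        /* another path drops its reference: 1 left */
    if (finalized != 0) return -1;

    table_unbind(&table);       /* last reference released at shutdown */
    return finalized;
}
```

Calling {{demo_shutdown()}} from a {{main}} should return 1. Dropping the {{endpoint_incref}} call in {{table_bind}} would make {{table_unbind}} operate on freed memory, which is the kind of access valgrind flags as an invalid read or write inside a free'd block.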
 
> Router crash during shutdown in system_tests_distribution
> ---------------------------------------------------------
>
>                 Key: DISPATCH-963
>                 URL: https://issues.apache.org/jira/browse/DISPATCH-963
>             Project: Qpid Dispatch
>          Issue Type: Improvement
>          Components: Tests
>    Affects Versions: 1.0.1
>            Reporter: Ganesh Murthy
>            Priority: Major
>
> The router crashes during shutdown in system_tests_distribution.py.
> Here is the backtrace:
>  
> {noformat}
> (gdb) bt
> #0  0x00007f361ca5ae40 in pn_ep_decref (endpoint=0x7f35f01c2dd0) at 
> /home/gmurthy/opensource/qpid-proton-0.22.0/proton-c/src/core/engine.c:447
> #1  0x00007f361ca5b58b in pn_ep_decref (endpoint=<optimized out>) at 
> /home/gmurthy/opensource/qpid-proton-0.22.0/proton-c/src/core/engine.c:445
> #2  0x00007f361ca5f588 in pni_transport_unbind_handles 
> (handles=0x7f35f00764a0, reset_state=reset_state@entry=true) at 
> /home/gmurthy/opensource/qpid-proton-0.22.0/proton-c/src/core/transport.c:748
> #3  0x00007f361ca5f666 in pni_transport_unbind_channels (channels=0x9d1ce0) 
> at 
> /home/gmurthy/opensource/qpid-proton-0.22.0/proton-c/src/core/transport.c:761
> #4  0x00007f361ca5f777 in pn_transport_unbind (transport=0xa863d0) at 
> /home/gmurthy/opensource/qpid-proton-0.22.0/proton-c/src/core/transport.c:795
> #5  0x00007f361ca5a63e in pn_connection_driver_release_connection 
> (d=d@entry=0xa86248) at 
> /home/gmurthy/opensource/qpid-proton-0.22.0/proton-c/src/core/connection_driver.c:81
> #6  0x00007f361ca5a679 in pn_connection_driver_destroy (d=d@entry=0xa86248) 
> at 
> /home/gmurthy/opensource/qpid-proton-0.22.0/proton-c/src/core/connection_driver.c:92
> #7  0x00007f361c83a69c in pconnection_final_free (pc=0xa85ca0) at 
> /home/gmurthy/opensource/qpid-proton-0.22.0/proton-c/src/proactor/epoll.c:827
> #8  0x00007f361c83b3ac in pconnection_cleanup (pc=<optimized out>) at 
> /home/gmurthy/opensource/qpid-proton-0.22.0/proton-c/src/proactor/epoll.c:843
> #9  0x00007f361c83db37 in pconnection_forced_shutdown (pc=0xa85ca0) at 
> /home/gmurthy/opensource/qpid-proton-0.22.0/proton-c/src/proactor/epoll.c:878
> #10 pn_proactor_free (p=0x916fd0) at 
> /home/gmurthy/opensource/qpid-proton-0.22.0/proton-c/src/proactor/epoll.c:1815
> #11 0x00007f361cce7bb5 in qd_server_free (qd_server=0x919190) at 
> /home/gmurthy/opensource/qpid-dispatch/src/server.c:1176
> #12 0x00007f361cca878e in qd_dispatch_free (qd=0x6164b0) at 
> /home/gmurthy/opensource/qpid-dispatch/src/dispatch.c:318
> #13 0x0000000000401864 in main_process (config_path=0x7ffcb36d88e2 "B.conf", 
> python_pkgdir=0x7ffcb36d88ec "/home/gmurthy/opensource/qpid-dispatch/python", 
> fd=2) at /home/gmurthy/opensource/qpid-dispatch/router/src/main.c:116
> #14 0x00000000004022b0 in main (argc=5, argv=0x7ffcb36d8158) at 
> /home/gmurthy/opensource/qpid-dispatch/router/src/main.c:360
> (gdb){noformat}
>  
> Running the test under valgrind suggests that pn_proactor_free is trying to
> free an already-freed link endpoint. Here are two outputs from valgrind:
>  
> {noformat}
> Process 3493 error: exit code 42, expected 0
> qdrouterd -c B.conf -I /home/gmurthy/opensource/qpid-dispatch/python
> /home/gmurthy/opensource/qpid-dispatch/build/system_test.dir/system_tests_distribution/DistributionTests/setUpClass/B-2.cmd
> >>>>
> ==3493== Invalid write of size 8
> ==3493==    at 0x50E82B0: pn_link_unbound (engine.c:1202)
> ==3493==    by 0x50EB5D0: pni_transport_unbind_handles (transport.c:746)
> ==3493==    by 0x50EB665: pni_transport_unbind_channels (transport.c:761)
> ==3493==    by 0x50EB776: pn_transport_unbind (transport.c:795)
> ==3493==    by 0x50E663D: pn_connection_driver_release_connection 
> (connection_driver.c:81)
> ==3493==    by 0x50E6678: pn_connection_driver_destroy 
> (connection_driver.c:92)
> ==3493==    by 0x530A69B: pconnection_final_free (epoll.c:827)
> ==3493==    by 0x530DB36: pconnection_forced_shutdown (epoll.c:878)
> ==3493==    by 0x530DB36: pn_proactor_free (epoll.c:1815)
> ==3493==    by 0x4EA5BB4: qd_server_free (server.c:1176)
> ==3493==    by 0x4E6678D: qd_dispatch_free (dispatch.c:318)
> ==3493==    by 0x401863: main_process (main.c:116)
> ==3493==    by 0x4022AF: main (main.c:360)
> ==3493==  Address 0x15c06188 is 376 bytes inside a block of size 488 free'd
> ==3493==    at 0x4C2DD18: free (vg_replace_malloc.c:530)
> ==3493==    by 0x50DD938: pn_class_decref (object.c:101)
> ==3493==    by 0x50EA03F: pn_event_finalize (event.c:226)
> ==3493==    by 0x50EA03F: pn_event_finalize_cast (event.c:271)
> ==3493==    by 0x50DD928: pn_class_decref (object.c:95)
> ==3493==    by 0x50EA361: pn_collector_next (event.c:197)
> ==3493==    by 0x50E6408: batch_next (connection_driver.c:51)
> ==3493==    by 0x530C544: pconnection_batch_next (epoll.c:884)
> ==3493==    by 0x4EA50F2: thread_run (server.c:957)
> ==3493==    by 0x551950A: start_thread (in /usr/lib64/libpthread-2.26.so)
> ==3493==    by 0x629916E: clone (in /usr/lib64/libc-2.26.so)
> ==3493==  Block was alloc'd at
> ==3493==    at 0x4C2EA1E: calloc (vg_replace_malloc.c:711)
> ==3493==    by 0x50DD811: pn_object_new (object.c:202)
> ==3493==    by 0x50DD88B: pn_class_new (object.c:61)
> ==3493==    by 0x50E8163: pn_link_new (engine.c:1153)
> ==3493==    by 0x50ED3C4: pn_do_attach (transport.c:1366)
> ==3493==    by 0x50E6227: pni_dispatch_action (dispatcher.c:74)
> ==3493==    by 0x50E6227: pni_dispatch_frame (dispatcher.c:116)
> ==3493==    by 0x50E6227: pn_dispatcher_input (dispatcher.c:135)
> ==3493==    by 0x50EC96B: pn_input_read_amqp (transport.c:2561)
> ==3493==    by 0x50ECA17: transport_consume (transport.c:1817)
> ==3493==    by 0x50F0135: pn_transport_process (transport.c:2946)
> ==3493==    by 0x530C43F: pconnection_process (epoll.c:1128)
> ==3493==    by 0x530C9EA: proactor_do_epoll (epoll.c:2010)
> ==3493==    by 0x4EA50C4: thread_run (server.c:955)
> {noformat}
>  
> {noformat}
> ==3488== Invalid read of size 8
> ==3488==    at 0x50E6E11: pni_ep_get_connection (engine.c:50)
> ==3488==    by 0x50E6E11: pn_ep_decref.part.10 (engine.c:446)
> ==3488==    by 0x50EB587: pni_transport_unbind_handles (transport.c:748)
> ==3488==    by 0x50EB665: pni_transport_unbind_channels (transport.c:761)
> ==3488==    by 0x50EB776: pn_transport_unbind (transport.c:795)
> ==3488==    by 0x50E663D: pn_connection_driver_release_connection 
> (connection_driver.c:81)
> ==3488==    by 0x50E6678: pn_connection_driver_destroy 
> (connection_driver.c:92)
> ==3488==    by 0x530A69B: pconnection_final_free (epoll.c:827)
> ==3488==    by 0x530DF4A: pconnection_done (epoll.c:964)
> ==3488==    by 0x530DF4A: pn_proactor_done (epoll.c:2038)
> ==3488==    by 0x4EA5114: thread_run (server.c:960)
> ==3488==    by 0x551950A: start_thread (in /usr/lib64/libpthread-2.26.so)
> ==3488==    by 0x629916E: clone (in /usr/lib64/libc-2.26.so)
> ==3488==  Address 0x127e15d0 is 400 bytes inside a block of size 488 free'd
> ==3488==    at 0x4C2DD18: free (vg_replace_malloc.c:530)
> ==3488==    by 0x50DD938: pn_class_decref (object.c:101)
> ==3488==    by 0x50EA03F: pn_event_finalize (event.c:226)
> ==3488==    by 0x50EA03F: pn_event_finalize_cast (event.c:271)
> ==3488==    by 0x50DD928: pn_class_decref (object.c:95)
> ==3488==    by 0x50EA361: pn_collector_next (event.c:197)
> ==3488==    by 0x50E6408: batch_next (connection_driver.c:51)
> ==3488==    by 0x530C544: pconnection_batch_next (epoll.c:884)
> ==3488==    by 0x4EA50F2: thread_run (server.c:957)
> ==3488==    by 0x551950A: start_thread (in /usr/lib64/libpthread-2.26.so)
> ==3488==    by 0x629916E: clone (in /usr/lib64/libc-2.26.so)
> ==3488==  Block was alloc'd at
> ==3488==    at 0x4C2EA1E: calloc (vg_replace_malloc.c:711)
> ==3488==    by 0x50DD811: pn_object_new (object.c:202)
> ==3488==    by 0x50DD88B: pn_class_new (object.c:61)
> ==3488==    by 0x50E8163: pn_link_new (engine.c:1153)
> ==3488==    by 0x50ED3C4: pn_do_attach (transport.c:1366)
> ==3488==    by 0x50E6227: pni_dispatch_action (dispatcher.c:74)
> ==3488==    by 0x50E6227: pni_dispatch_frame (dispatcher.c:116)
> ==3488==    by 0x50E6227: pn_dispatcher_input (dispatcher.c:135)
> ==3488==    by 0x50EC96B: pn_input_read_amqp (transport.c:2561)
> ==3488==    by 0x50ECA17: transport_consume (transport.c:1817)
> ==3488==    by 0x50F0135: pn_transport_process (transport.c:2946)
> ==3488==    by 0x530C43F: pconnection_process (epoll.c:1128)
> ==3488==    by 0x530C9EA: proactor_do_epoll (epoll.c:2010)
> ==3488==    by 0x4EA50C4: thread_run (server.c:955)
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org
