[ 
https://issues.apache.org/jira/browse/DISPATCH-382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vishal Sharda updated DISPATCH-382:
-----------------------------------
    Attachment: val_crash_2.txt
                val_crash_1.txt
                Crash_in_Valgrind_3.txt
                Crash_in_Valgrind_3.png
                Crash_in_Valgrind_2.png
                Crash_in_Valgrind_1.png

Attached 3 screenshots showing the crash and 3 output files from Valgrind for 
the corresponding runs.

Here is the information about the thread that led to SIGABRT as per Valgrind:

==18841== Thread 2:
==18841== Invalid read of size 4
==18841==    at 0x52F7274: pthread_mutex_lock (pthread_mutex_lock.c:66)
==18841==    by 0x4E648E7: sys_mutex_lock (threading.c:70)
==18841==    by 0x4E70EDC: qdr_forward_deliver_CT (forwarder.c:132)
==18841==    by 0x4E71D4A: qdr_forward_closest_CT (forwarder.c:405)
==18841==    by 0x4E72A96: qdr_forward_message_CT (forwarder.c:707)
==18841==    by 0x4E7C0B9: qdr_send_to_CT (transfer.c:581)
==18841==    by 0x4E76623: router_core_thread (router_core_thread.c:71)
==18841==    by 0x52F50A3: start_thread (pthread_create.c:309)
==18841==    by 0x5F8387C: clone (clone.S:111)
==18841==  Address 0xdc42ee0 is 16 bytes inside a block of size 48 free'd
==18841==    at 0x4C28D29: free (vg_replace_malloc.c:530)
==18841==    by 0x4E648CD: sys_mutex_free (threading.c:64)
==18841==    by 0x4E6FBAF: qdr_connection_closed_CT (connections.c:972)
==18841==    by 0x4E76623: router_core_thread (router_core_thread.c:71)
==18841==    by 0x52F50A3: start_thread (pthread_create.c:309)
==18841==    by 0x5F8387C: clone (clone.S:111)
==18841==  Block was alloc'd at
==18841==    at 0x4C27C0F: malloc (vg_replace_malloc.c:299)
==18841==    by 0x4E64859: sys_mutex (threading.c:51)
==18841==    by 0x4E6D111: qdr_connection_opened (connections.c:85)
==18841==    by 0x4E7D7CA: AMQP_opened_handler (router_node.c:560)
==18841==    by 0x4E7D837: AMQP_inbound_opened_handler (router_node.c:572)
==18841==    by 0x4E5397D: notify_opened (container.c:261)
==18841==    by 0x4E53A0D: policy_notify_opened (container.c:275)
==18841==    by 0x4E61B3A: qd_policy_amqp_open (policy.c:744)
==18841==    by 0x4E81BC1: invoke_deferred_calls (server.c:720)
==18841==    by 0x4E81CE7: process_connector (server.c:766)
==18841==    by 0x4E827C0: thread_run (server.c:1024)
==18841==    by 0x52F50A3: start_thread (pthread_create.c:309)
==18841== 
==18841== Invalid read of size 4
==18841==    at 0x52F2A03: __pthread_mutex_lock_full (pthread_mutex_lock.c:177)
==18841==    by 0x4E648E7: sys_mutex_lock (threading.c:70)
==18841==    by 0x4E70EDC: qdr_forward_deliver_CT (forwarder.c:132)
==18841==    by 0x4E71D4A: qdr_forward_closest_CT (forwarder.c:405)
==18841==    by 0x4E72A96: qdr_forward_message_CT (forwarder.c:707)
==18841==    by 0x4E7C0B9: qdr_send_to_CT (transfer.c:581)
==18841==    by 0x4E76623: router_core_thread (router_core_thread.c:71)
==18841==    by 0x52F50A3: start_thread (pthread_create.c:309)
==18841==    by 0x5F8387C: clone (clone.S:111)
==18841==  Address 0xdc42ee0 is 16 bytes inside a block of size 48 free'd
==18841==    at 0x4C28D29: free (vg_replace_malloc.c:530)
==18841==    by 0x4E648CD: sys_mutex_free (threading.c:64)
==18841==    by 0x4E6FBAF: qdr_connection_closed_CT (connections.c:972)
==18841==    by 0x4E76623: router_core_thread (router_core_thread.c:71)
==18841==    by 0x52F50A3: start_thread (pthread_create.c:309)
==18841==    by 0x5F8387C: clone (clone.S:111)
==18841==  Block was alloc'd at
==18841==    at 0x4C27C0F: malloc (vg_replace_malloc.c:299)
==18841==    by 0x4E64859: sys_mutex (threading.c:51)
==18841==    by 0x4E6D111: qdr_connection_opened (connections.c:85)
==18841==    by 0x4E7D7CA: AMQP_opened_handler (router_node.c:560)
==18841==    by 0x4E7D837: AMQP_inbound_opened_handler (router_node.c:572)
==18841==    by 0x4E5397D: notify_opened (container.c:261)
==18841==    by 0x4E53A0D: policy_notify_opened (container.c:275)
==18841==    by 0x4E61B3A: qd_policy_amqp_open (policy.c:744)
==18841==    by 0x4E81BC1: invoke_deferred_calls (server.c:720)
==18841==    by 0x4E81CE7: process_connector (server.c:766)
==18841==    by 0x4E827C0: thread_run (server.c:1024)
==18841==    by 0x52F50A3: start_thread (pthread_create.c:309)
==18841== 
==18841== 
==18841== Process terminating with default action of signal 6 (SIGABRT)
==18841==    at 0x5ED0067: raise (raise.c:56)
==18841==    by 0x5ED1447: abort (abort.c:89)
==18841==    by 0x5EC9265: __assert_fail_base (assert.c:92)
==18841==    by 0x5EC9311: __assert_fail (assert.c:101)
==18841==    by 0x4E6490F: sys_mutex_lock (threading.c:71)
==18841==    by 0x4E70EDC: qdr_forward_deliver_CT (forwarder.c:132)
==18841==    by 0x4E71D4A: qdr_forward_closest_CT (forwarder.c:405)
==18841==    by 0x4E72A96: qdr_forward_message_CT (forwarder.c:707)
==18841==    by 0x4E7C0B9: qdr_send_to_CT (transfer.c:581)
==18841==    by 0x4E76623: router_core_thread (router_core_thread.c:71)
==18841==    by 0x52F50A3: start_thread (pthread_create.c:309)
==18841==    by 0x5F8387C: clone (clone.S:111)
==18841== 
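
One possible reading of the trace above: the 48-byte block that sys_mutex_free
released from qdr_connection_closed_CT (connections.c:972) is later locked by
qdr_forward_deliver_CT (forwarder.c:132), and both stacks are rooted in
router_core_thread. That would point to a stale reference kept after the
connection was closed rather than a race between two threads. The sketch below
illustrates that pattern and one way to guard against it. All names in it
(conn_t, link_t, conn_open, link_attach, conn_close, forward_deliver) are
hypothetical stand-ins, not the qpid-dispatch types or functions, and the
guard shown is an assumption for illustration, not the actual fix.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the qpid-dispatch structures. */
struct link;

typedef struct conn {
    pthread_mutex_t *work_lock;  /* heap-allocated lock, freed on close */
    struct link     *links;      /* links that still point at this connection */
} conn_t;

typedef struct link {
    conn_t      *conn;           /* back-pointer; stale if not cleared on close */
    struct link *next;
    int          delivered;
} link_t;

/* Allocate a connection together with its lock. */
conn_t *conn_open(void)
{
    conn_t *c = calloc(1, sizeof *c);
    if (!c)
        return NULL;
    c->work_lock = malloc(sizeof *c->work_lock);
    if (!c->work_lock) {
        free(c);
        return NULL;
    }
    pthread_mutex_init(c->work_lock, NULL);
    return c;
}

/* Register a link so that conn_close() can find and detach it later. */
void link_attach(link_t *l, conn_t *c)
{
    assert(l && c);
    l->conn      = c;
    l->next      = c->links;
    l->delivered = 0;
    c->links     = l;
}

/* Detach every link BEFORE freeing the lock, so no link keeps a dangling
 * pointer to freed memory. */
void conn_close(conn_t *c)
{
    link_t *l = c->links;
    while (l) {
        link_t *next = l->next;
        l->conn = NULL;
        l->next = NULL;
        l = next;
    }
    pthread_mutex_destroy(c->work_lock);
    free(c->work_lock);
    free(c);
}

/* Forwarding path: refuse to deliver on a link whose connection is gone
 * instead of locking a freed mutex. */
bool forward_deliver(link_t *l)
{
    if (!l->conn)
        return false;
    pthread_mutex_lock(l->conn->work_lock);
    l->delivered++;
    pthread_mutex_unlock(l->conn->work_lock);
    return true;
}
```

In this sketch, calling forward_deliver() on a link after conn_close() has run
returns false. Without the detach loop in conn_close(), the same call would
lock freed memory, which is the kind of access Valgrind reports above.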


> Intermittent router crash when starting 50 receivers/0 senders and doing 
> qdstat
> -------------------------------------------------------------------------------
>
>                 Key: DISPATCH-382
>                 URL: https://issues.apache.org/jira/browse/DISPATCH-382
>             Project: Qpid Dispatch
>          Issue Type: Bug
>          Components: Routing Engine
>    Affects Versions: 0.6.0
>         Environment: Debian 8.3, Apache Qpid Proton 0.13.0-RC for drivers and 
> dependencies, Hardware: 2 CPUs, 15 GB RAM, 60 GB HDD each on 3 separate 
> machines
>            Reporter: Vishal Sharda
>            Priority: Blocker
>         Attachments: Crash_in_Valgrind_1.png, Crash_in_Valgrind_2.png, 
> Crash_in_Valgrind_3.png, Crash_in_Valgrind_3.txt, val_crash_1.txt, 
> val_crash_2.txt
>
>
> Network: A network of 3 interior routers built from trunk and connected to 
> each other using 2-way SSL.
> We ran a Proton-J Reactor API based client to start 50 receivers and 0 
> senders on one of the above 3 routers.  After that we ran "qdstat -c".  This 
> leads to an intermittent crash in the router.  This crash could not be 
> reproduced while running the routers independently or inside gdb.  When we 
> run the routers inside Valgrind, this crash is frequent.  I was able to 
> reproduce the crash 3 times using Valgrind (Screenshots and Valgrind output 
> files are attached).
> This intermittent crash becomes permanent in our instrumented build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
