[ https://issues.apache.org/jira/browse/DISPATCH-382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vishal Sharda updated DISPATCH-382:
-----------------------------------
    Attachment: val_crash_2.txt
                val_crash_1.txt
                Crash_in_Valgrind_3.txt
                Crash_in_Valgrind_3.png
                Crash_in_Valgrind_2.png
                Crash_in_Valgrind_1.png

Attached 3 screenshots showing the crash and 3 output files from Valgrind for the corresponding runs.

Here is the information about the thread that led to SIGABRT as per Valgrind:

==18841== Thread 2:
==18841== Invalid read of size 4
==18841==    at 0x52F7274: pthread_mutex_lock (pthread_mutex_lock.c:66)
==18841==    by 0x4E648E7: sys_mutex_lock (threading.c:70)
==18841==    by 0x4E70EDC: qdr_forward_deliver_CT (forwarder.c:132)
==18841==    by 0x4E71D4A: qdr_forward_closest_CT (forwarder.c:405)
==18841==    by 0x4E72A96: qdr_forward_message_CT (forwarder.c:707)
==18841==    by 0x4E7C0B9: qdr_send_to_CT (transfer.c:581)
==18841==    by 0x4E76623: router_core_thread (router_core_thread.c:71)
==18841==    by 0x52F50A3: start_thread (pthread_create.c:309)
==18841==    by 0x5F8387C: clone (clone.S:111)
==18841==  Address 0xdc42ee0 is 16 bytes inside a block of size 48 free'd
==18841==    at 0x4C28D29: free (vg_replace_malloc.c:530)
==18841==    by 0x4E648CD: sys_mutex_free (threading.c:64)
==18841==    by 0x4E6FBAF: qdr_connection_closed_CT (connections.c:972)
==18841==    by 0x4E76623: router_core_thread (router_core_thread.c:71)
==18841==    by 0x52F50A3: start_thread (pthread_create.c:309)
==18841==    by 0x5F8387C: clone (clone.S:111)
==18841==  Block was alloc'd at
==18841==    at 0x4C27C0F: malloc (vg_replace_malloc.c:299)
==18841==    by 0x4E64859: sys_mutex (threading.c:51)
==18841==    by 0x4E6D111: qdr_connection_opened (connections.c:85)
==18841==    by 0x4E7D7CA: AMQP_opened_handler (router_node.c:560)
==18841==    by 0x4E7D837: AMQP_inbound_opened_handler (router_node.c:572)
==18841==    by 0x4E5397D: notify_opened (container.c:261)
==18841==    by 0x4E53A0D: policy_notify_opened (container.c:275)
==18841==    by 0x4E61B3A: qd_policy_amqp_open (policy.c:744)
==18841==    by 0x4E81BC1: invoke_deferred_calls (server.c:720)
==18841==    by 0x4E81CE7: process_connector (server.c:766)
==18841==    by 0x4E827C0: thread_run (server.c:1024)
==18841==    by 0x52F50A3: start_thread (pthread_create.c:309)
==18841==
==18841== Invalid read of size 4
==18841==    at 0x52F2A03: __pthread_mutex_lock_full (pthread_mutex_lock.c:177)
==18841==    by 0x4E648E7: sys_mutex_lock (threading.c:70)
==18841==    by 0x4E70EDC: qdr_forward_deliver_CT (forwarder.c:132)
==18841==    by 0x4E71D4A: qdr_forward_closest_CT (forwarder.c:405)
==18841==    by 0x4E72A96: qdr_forward_message_CT (forwarder.c:707)
==18841==    by 0x4E7C0B9: qdr_send_to_CT (transfer.c:581)
==18841==    by 0x4E76623: router_core_thread (router_core_thread.c:71)
==18841==    by 0x52F50A3: start_thread (pthread_create.c:309)
==18841==    by 0x5F8387C: clone (clone.S:111)
==18841==  Address 0xdc42ee0 is 16 bytes inside a block of size 48 free'd
==18841==    at 0x4C28D29: free (vg_replace_malloc.c:530)
==18841==    by 0x4E648CD: sys_mutex_free (threading.c:64)
==18841==    by 0x4E6FBAF: qdr_connection_closed_CT (connections.c:972)
==18841==    by 0x4E76623: router_core_thread (router_core_thread.c:71)
==18841==    by 0x52F50A3: start_thread (pthread_create.c:309)
==18841==    by 0x5F8387C: clone (clone.S:111)
==18841==  Block was alloc'd at
==18841==    at 0x4C27C0F: malloc (vg_replace_malloc.c:299)
==18841==    by 0x4E64859: sys_mutex (threading.c:51)
==18841==    by 0x4E6D111: qdr_connection_opened (connections.c:85)
==18841==    by 0x4E7D7CA: AMQP_opened_handler (router_node.c:560)
==18841==    by 0x4E7D837: AMQP_inbound_opened_handler (router_node.c:572)
==18841==    by 0x4E5397D: notify_opened (container.c:261)
==18841==    by 0x4E53A0D: policy_notify_opened (container.c:275)
==18841==    by 0x4E61B3A: qd_policy_amqp_open (policy.c:744)
==18841==    by 0x4E81BC1: invoke_deferred_calls (server.c:720)
==18841==    by 0x4E81CE7: process_connector (server.c:766)
==18841==    by 0x4E827C0: thread_run (server.c:1024)
==18841==    by 0x52F50A3: start_thread (pthread_create.c:309)
==18841==
==18841==
==18841== Process terminating with default action of signal 6 (SIGABRT)
==18841==    at 0x5ED0067: raise (raise.c:56)
==18841==    by 0x5ED1447: abort (abort.c:89)
==18841==    by 0x5EC9265: __assert_fail_base (assert.c:92)
==18841==    by 0x5EC9311: __assert_fail (assert.c:101)
==18841==    by 0x4E6490F: sys_mutex_lock (threading.c:71)
==18841==    by 0x4E70EDC: qdr_forward_deliver_CT (forwarder.c:132)
==18841==    by 0x4E71D4A: qdr_forward_closest_CT (forwarder.c:405)
==18841==    by 0x4E72A96: qdr_forward_message_CT (forwarder.c:707)
==18841==    by 0x4E7C0B9: qdr_send_to_CT (transfer.c:581)
==18841==    by 0x4E76623: router_core_thread (router_core_thread.c:71)
==18841==    by 0x52F50A3: start_thread (pthread_create.c:309)
==18841==    by 0x5F8387C: clone (clone.S:111)
==18841==

> Intermittent router crash when starting 50 receivers/0 senders and doing
> qdstat
> -------------------------------------------------------------------------------
>
>                 Key: DISPATCH-382
>                 URL: https://issues.apache.org/jira/browse/DISPATCH-382
>             Project: Qpid Dispatch
>          Issue Type: Bug
>          Components: Routing Engine
>    Affects Versions: 0.6.0
>         Environment: Debian 8.3, Apache Qpid Proton 0.13.0-RC for drivers and
> dependencies, Hardware: 2 CPUs, 15 GB RAM, 60 GB HDD each on 3 separate
> machines
>            Reporter: Vishal Sharda
>            Priority: Blocker
>         Attachments: Crash_in_Valgrind_1.png, Crash_in_Valgrind_2.png,
> Crash_in_Valgrind_3.png, Crash_in_Valgrind_3.txt, val_crash_1.txt,
> val_crash_2.txt
>
>
> Network: A network of 3 interior routers built from trunk and connected to
> each other using 2-way SSL.
> We ran a Proton-J Reactor API based client to start 50 receivers and 0
> senders on one of the above 3 routers. After that we ran "qdstat -c". This
> leads to an intermittent crash in the router. This crash could not be
> reproduced while running the routers independently or inside gdb. When we
> run the routers inside Valgrind, this crash is frequent. I was able to
> reproduce the crash 3 times using Valgrind (screenshots and Valgrind output
> files are attached).
> This intermittent crash becomes permanent in our instrumented build.
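
A note on reading the Valgrind trace: both the free (via qdr_connection_closed_CT) and the later lock (via qdr_forward_deliver_CT) run on router_core_thread, so this looks like a stale pointer to an already-closed connection rather than a cross-thread race. Below is a minimal, self-contained C sketch of that pattern and of one possible guard. All names in it (conn_t, link_t, conn_open, conn_close, forward_deliver) are hypothetical stand-ins chosen for illustration; they are not the router's types, and this is not the router's actual fix.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a connection that owns a heap-allocated mutex,
 * similar in spirit to the sys_mutex() allocation seen in the trace. */
typedef struct conn_t {
    pthread_mutex_t *work_lock;
    int              deliveries;
} conn_t;

/* Hypothetical stand-in for a link that keeps a back-pointer to its
 * connection and uses it on the forwarding path. */
typedef struct link_t {
    conn_t *conn;
} link_t;

conn_t *conn_open(void)
{
    conn_t *c = calloc(1, sizeof(*c));
    assert(c != NULL);
    c->work_lock = malloc(sizeof(*c->work_lock));
    assert(c->work_lock != NULL);
    int rc = pthread_mutex_init(c->work_lock, NULL);
    assert(rc == 0);
    (void) rc;
    return c;
}

/* Close path: detach every link that still points at the connection
 * BEFORE the mutex is destroyed and freed. Skipping this loop leaves
 * dangling pointers, which is the situation Valgrind reports above. */
void conn_close(conn_t *c, link_t *links, size_t n_links)
{
    for (size_t i = 0; i < n_links; i++) {
        if (links[i].conn == c)
            links[i].conn = NULL;
    }
    pthread_mutex_destroy(c->work_lock);
    free(c->work_lock);
    free(c);
}

/* Forward path: returns 1 if the delivery was queued, 0 if the link no
 * longer has a live connection (so no freed mutex is ever touched). */
int forward_deliver(link_t *link)
{
    conn_t *c = link->conn;
    if (c == NULL)
        return 0;

    int rc = pthread_mutex_lock(c->work_lock);
    assert(rc == 0);
    (void) rc;
    c->deliveries++;
    pthread_mutex_unlock(c->work_lock);
    return 1;
}
```

To try it, open one connection from a main(), point two link_t entries at it, call forward_deliver() once, call conn_close(), and call forward_deliver() again: the first call returns 1 and the second returns 0 instead of locking freed memory. Clearing back-pointers is only one option; reference counting the connection would be another.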