michael goulish created PROTON-826:
--------------------------------------

             Summary: recent checkin causes frequent double-free or corruption crash
                 Key: PROTON-826
                 URL: https://issues.apache.org/jira/browse/PROTON-826
             Project: Qpid Proton
          Issue Type: Bug
          Components: proton-c
    Affects Versions: 0.9
            Reporter: michael goulish
            Priority: Blocker

In my dispatch testing I am seeing frequent crashes in proton library that began with proton checkin 01cb00c on 2015-02-15
"report read and write errors through the transport"

The output at crash-time says this:
---------------------------------------------------

*** Error in `/home/mick/dispatch/install/sbin/qdrouterd': double free or corruption (fasttop): 0x00000000020ee880 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3e3d875a4f]
/lib64/libc.so.6[0x3e3d87cd78]
/lib64/libqpid-proton.so.2(pn_error_clear+0x18)[0x7f4f4f4e1f18]
/lib64/libqpid-proton.so.2(pn_error_set+0x11)[0x7f4f4f4e1f41]
/lib64/libqpid-proton.so.2(pn_error_vformat+0x3e)[0x7f4f4f4e1f9e]
/lib64/libqpid-proton.so.2(pn_error_format+0x82)[0x7f4f4f4e2032]
/lib64/libqpid-proton.so.2(pn_i_error_from_errno+0x67)[0x7f4f4f4fd737]
/lib64/libqpid-proton.so.2(pn_recv+0x5a)[0x7f4f4f4fd16a]
/home/mick/dispatch/install/lib64/libqpid-dispatch.so.0(qdpn_connector_process+0xd7)[0x7f4f4f759430]

The backtrace from the core file looks like this:
------------------------------------------------------------

#0  0x0000003e3d835877 in raise () from /lib64/libc.so.6
#1  0x0000003e3d836f68 in abort () from /lib64/libc.so.6
#2  0x0000003e3d875a54 in __libc_message () from /lib64/libc.so.6
#3  0x0000003e3d87cd78 in _int_free () from /lib64/libc.so.6
#4  0x00007fbf8a59b2e8 in pn_error_clear (error=error@entry=0x1501140) at /home/mick/rh-qpid-proton/proton-c/src/error.c:56
#5  0x00007fbf8a59b311 in pn_error_set (error=error@entry=0x1501140, code=code@entry=-2, text=text@entry=0x7fbf801a69c0 "recv: Resource temporarily unavailable") at /home/mick/rh-qpid-proton/proton-c/src/error.c:65
#6  0x00007fbf8a59b36e in pn_error_vformat (error=0x1501140, code=-2, fmt=<optimized out>, ap=ap@entry=0x7fbf801a6de8) at /home/mick/rh-qpid-proton/proton-c/src/error.c:81
#7  0x00007fbf8a59b402 in pn_error_format (error=error@entry=0x1501140, code=<optimized out>, fmt=fmt@entry=0x7fbf8a5bb21e "%s: %s") at /home/mick/rh-qpid-proton/proton-c/src/error.c:89
#8  0x00007fbf8a5b6797 in pn_i_error_from_errno (error=0x1501140, msg=msg@entry=0x7fbf8a5bbe1a "recv") at /home/mick/rh-qpid-proton/proton-c/src/platform.c:119
#9  0x00007fbf8a5b61ca in pn_recv (io=0x14e77b0, socket=<optimized out>, buf=<optimized out>, size=<optimized out>) at /home/mick/rh-qpid-proton/proton-c/src/posix/io.c:271
#10 0x00007fbf8a812430 in qdpn_connector_process (c=0x7fbf7801c7f0)
---------------------------------------------------------

And I can prevent the crash from happening, apparently forever, by commenting out this line:

    free(error->text);

in the function pn_error_clear in the file proton-c/src/error.c

The error text that is being freed which causes the crash looks like this:

    $2 = {text = 0x7f66e8104e30 "recv: Resource temporarily unavailable", root = 0x0, code = -2}

My dispatch test creates a router network and then repeatedly kills and restarts a randomly-selected router. After this proton checkin it almost never gets through 5 iterations without this crash. After I commented out that line, it got through more than 500 iterations before I stopped it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
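
For readers unfamiliar with the pattern, the struct printed in the report (text, root, code) and the free(error->text) call suggest a clear-then-set ownership scheme for the error message. Below is a minimal single-threaded sketch of that scheme. It is not Proton's actual source: demo_error_t, demo_error_clear, and demo_error_set are hypothetical names, and the bodies are assumptions made for illustration. The sketch shows why the pointer is reset after free. It does not establish the cause of the crash reported here; even with the reset, two callers clearing the same error object at the same time could in principle both free the same block.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the error object printed in the report
   (fields text, root, code); not Proton's actual definition. */
typedef struct demo_error_t {
    char *text;                 /* heap-allocated message, owned by the error */
    struct demo_error_t *root;  /* unused in this sketch */
    int code;
} demo_error_t;

/* Release the message and reset the object so a second clear is a no-op. */
void demo_error_clear(demo_error_t *error)
{
    if (!error) return;
    error->code = 0;
    free(error->text);   /* free(NULL) is defined to do nothing */
    error->text = NULL;  /* without this, the next clear frees the same block again */
    error->root = NULL;
}

/* Replace the current error with a copy of `text`; returns `code`. */
int demo_error_set(demo_error_t *error, int code, const char *text)
{
    demo_error_clear(error);
    if (code) {
        size_t n = strlen(text) + 1;
        error->text = malloc(n);
        if (error->text) memcpy(error->text, text, n);
        error->code = code;
    }
    return code;
}
```

Commenting out the free, as described in the report, avoids the abort but leaks one message per error set, so it works as a diagnostic rather than a fix.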