michael goulish created PROTON-826:
--------------------------------------

             Summary: recent checkin causes frequent double-free or corruption crash
                 Key: PROTON-826
                 URL: https://issues.apache.org/jira/browse/PROTON-826
             Project: Qpid Proton
          Issue Type: Bug
          Components: proton-c
    Affects Versions: 0.9
            Reporter: michael goulish
            Priority: Blocker


In my dispatch testing I am seeing frequent crashes in the proton library that 
began with proton checkin 01cb00c on 2015-02-15, "report read and write 
errors through the transport".



The output at crash-time says this:
---------------------------------------------------

*** Error in `/home/mick/dispatch/install/sbin/qdrouterd': double free or corruption (fasttop): 0x00000000020ee880 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3e3d875a4f]
/lib64/libc.so.6[0x3e3d87cd78]
/lib64/libqpid-proton.so.2(pn_error_clear+0x18)[0x7f4f4f4e1f18]
/lib64/libqpid-proton.so.2(pn_error_set+0x11)[0x7f4f4f4e1f41]
/lib64/libqpid-proton.so.2(pn_error_vformat+0x3e)[0x7f4f4f4e1f9e]
/lib64/libqpid-proton.so.2(pn_error_format+0x82)[0x7f4f4f4e2032]
/lib64/libqpid-proton.so.2(pn_i_error_from_errno+0x67)[0x7f4f4f4fd737]
/lib64/libqpid-proton.so.2(pn_recv+0x5a)[0x7f4f4f4fd16a]
/home/mick/dispatch/install/lib64/libqpid-dispatch.so.0(qdpn_connector_process+0xd7)[0x7f4f4f759430]




The backtrace from the core file looks like this:
------------------------------------------------------------

    #0  0x0000003e3d835877 in raise () from /lib64/libc.so.6
    #1  0x0000003e3d836f68 in abort () from /lib64/libc.so.6
    #2  0x0000003e3d875a54 in __libc_message () from /lib64/libc.so.6
    #3  0x0000003e3d87cd78 in _int_free () from /lib64/libc.so.6
    #4  0x00007fbf8a59b2e8 in pn_error_clear (error=error@entry=0x1501140)
        at /home/mick/rh-qpid-proton/proton-c/src/error.c:56
    #5  0x00007fbf8a59b311 in pn_error_set (error=error@entry=0x1501140, code=code@entry=-2, text=text@entry=0x7fbf801a69c0 "recv: Resource temporarily unavailable")
        at /home/mick/rh-qpid-proton/proton-c/src/error.c:65
    #6  0x00007fbf8a59b36e in pn_error_vformat (error=0x1501140, code=-2, fmt=<optimized out>, ap=ap@entry=0x7fbf801a6de8)
        at /home/mick/rh-qpid-proton/proton-c/src/error.c:81
    #7  0x00007fbf8a59b402 in pn_error_format (error=error@entry=0x1501140, code=<optimized out>, fmt=fmt@entry=0x7fbf8a5bb21e "%s: %s")
        at /home/mick/rh-qpid-proton/proton-c/src/error.c:89
    #8  0x00007fbf8a5b6797 in pn_i_error_from_errno (error=0x1501140, msg=msg@entry=0x7fbf8a5bbe1a "recv")
        at /home/mick/rh-qpid-proton/proton-c/src/platform.c:119
    #9  0x00007fbf8a5b61ca in pn_recv (io=0x14e77b0, socket=<optimized out>, buf=<optimized out>, size=<optimized out>)
        at /home/mick/rh-qpid-proton/proton-c/src/posix/io.c:271
    #10 0x00007fbf8a812430 in qdpn_connector_process (c=0x7fbf7801c7f0)

---------------------------------------------------------

I can prevent the crash, apparently indefinitely, by commenting out this line:
  free(error->text);
in the function pn_error_clear in the file proton-c/src/error.c.

The error text that is being freed when the crash occurs looks like this:
  $2 = {text = 0x7f66e8104e30 "recv: Resource temporarily unavailable", root = 0x0, code = -2}
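
For illustration only, here is a minimal sketch of the clear-then-set pattern 
that the backtrace passes through (pn_error_set calling pn_error_clear, which 
frees the text). This is not the Proton source: demo_error_t, 
demo_error_clear and demo_error_set are hypothetical stand-ins, with fields 
copied from the gdb print above. It shows one way such a clear function can 
be made safe against being run twice in sequence on the same object, by 
resetting the pointer after freeing it. Whether a sequential repeat is what 
actually triggers this crash is an assumption that this report does not 
establish.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the error object; field names follow the
 * gdb print above ({text, root, code}). Not the real pn_error_t. */
typedef struct demo_error_t {
    char *text;
    struct demo_error_t *root;
    int code;
} demo_error_t;

/* Release the stored text and reset the pointer, so that a second clear
 * on the same object calls free(NULL), which is a no-op. */
static void demo_error_clear(demo_error_t *error)
{
    assert(error != NULL);
    error->code = 0;
    free(error->text);
    error->text = NULL;
}

/* Replace any previous error with a private copy of the new text.
 * Returns the code that was passed in. */
static int demo_error_set(demo_error_t *error, int code, const char *text)
{
    demo_error_clear(error);
    if (code != 0 && text != NULL) {
        size_t len = strlen(text) + 1;
        error->text = malloc(len);
        if (error->text != NULL) {
            memcpy(error->text, text, len);
        }
        error->code = code;
    }
    return code;
}
```

Resetting the pointer only protects against repeated clears on one thread. If 
two threads were to call a set or clear function on the same error object at 
the same time, both could still pass the same pointer to free(), and the 
object would need a lock or per-thread ownership instead.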


My dispatch test creates a router network and then repeatedly kills and 
restarts a randomly selected router.  Since this proton checkin, it almost 
never gets through 5 iterations without this crash.  With that line commented 
out, it got through more than 500 iterations before I stopped it.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
