michael goulish created PROTON-826:
--------------------------------------
Summary: recent checkin causes frequent double-free or corruption crash
Key: PROTON-826
URL: https://issues.apache.org/jira/browse/PROTON-826
Project: Qpid Proton
Issue Type: Bug
Components: proton-c
Affects Versions: 0.9
Reporter: michael goulish
Priority: Blocker
In my dispatch testing I am seeing frequent crashes in the proton library.
They began with proton checkin 01cb00c on 2015-02-15, "report read and write
errors through the transport".
The output at crash-time says this:
---------------------------------------------------
*** Error in `/home/mick/dispatch/install/sbin/qdrouterd': double free or
corruption (fasttop): 0x00000000020ee880 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3e3d875a4f]
/lib64/libc.so.6[0x3e3d87cd78]
/lib64/libqpid-proton.so.2(pn_error_clear+0x18)[0x7f4f4f4e1f18]
/lib64/libqpid-proton.so.2(pn_error_set+0x11)[0x7f4f4f4e1f41]
/lib64/libqpid-proton.so.2(pn_error_vformat+0x3e)[0x7f4f4f4e1f9e]
/lib64/libqpid-proton.so.2(pn_error_format+0x82)[0x7f4f4f4e2032]
/lib64/libqpid-proton.so.2(pn_i_error_from_errno+0x67)[0x7f4f4f4fd737]
/lib64/libqpid-proton.so.2(pn_recv+0x5a)[0x7f4f4f4fd16a]
/home/mick/dispatch/install/lib64/libqpid-dispatch.so.0(qdpn_connector_process+0xd7)[0x7f4f4f759430]
The backtrace from the core file looks like this:
------------------------------------------------------------
#0 0x0000003e3d835877 in raise () from /lib64/libc.so.6
#1 0x0000003e3d836f68 in abort () from /lib64/libc.so.6
#2 0x0000003e3d875a54 in __libc_message () from /lib64/libc.so.6
#3 0x0000003e3d87cd78 in _int_free () from /lib64/libc.so.6
#4 0x00007fbf8a59b2e8 in pn_error_clear (error=error@entry=0x1501140)
    at /home/mick/rh-qpid-proton/proton-c/src/error.c:56
#5 0x00007fbf8a59b311 in pn_error_set (error=error@entry=0x1501140,
    code=code@entry=-2,
    text=text@entry=0x7fbf801a69c0 "recv: Resource temporarily unavailable")
    at /home/mick/rh-qpid-proton/proton-c/src/error.c:65
#6 0x00007fbf8a59b36e in pn_error_vformat (error=0x1501140, code=-2,
    fmt=<optimized out>, ap=ap@entry=0x7fbf801a6de8)
    at /home/mick/rh-qpid-proton/proton-c/src/error.c:81
#7 0x00007fbf8a59b402 in pn_error_format (error=error@entry=0x1501140,
    code=<optimized out>, fmt=fmt@entry=0x7fbf8a5bb21e "%s: %s")
    at /home/mick/rh-qpid-proton/proton-c/src/error.c:89
#8 0x00007fbf8a5b6797 in pn_i_error_from_errno (error=0x1501140,
    msg=msg@entry=0x7fbf8a5bbe1a "recv")
    at /home/mick/rh-qpid-proton/proton-c/src/platform.c:119
#9 0x00007fbf8a5b61ca in pn_recv (io=0x14e77b0, socket=<optimized out>,
    buf=<optimized out>, size=<optimized out>)
    at /home/mick/rh-qpid-proton/proton-c/src/posix/io.c:271
#10 0x00007fbf8a812430 in qdpn_connector_process (c=0x7fbf7801c7f0)
---------------------------------------------------------
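A note on the error text in frame #5: "Resource temporarily unavailable" is
the usual Linux message for EAGAIN, which a non-blocking socket returns from
recv() when no data is ready. If that is what is happening here, the error
path in the backtrace may be reached on ordinary idle reads and not only on
real failures. The sketch below is not proton code; idle_recv_would_block is
a hypothetical helper that only demonstrates the EAGAIN behaviour on a local
socket pair.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of proton or dispatch.
   Returns 1 if a non-blocking recv() on an idle socket reports
   "would block", 0 if it does not, and -1 if the setup fails. */
static int idle_recv_would_block(void)
{
    int sv[2];
    char buf[16];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* Put one end into non-blocking mode. */
    int flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Nothing has been written to the peer, so there is no data to read. */
    ssize_t n = recv(sv[0], buf, sizeof buf, 0);
    int saved_errno = errno;   /* capture before close() can change it */

    close(sv[0]);
    close(sv[1]);

    return n == -1 && (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK);
}
```

A caller that treats this condition as "try again later" rather than as an
error would never need to record error text for it at all.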
I can prevent the crash, apparently indefinitely, by commenting out this line:
    free(error->text);
in the function pn_error_clear in the file proton-c/src/error.c.
The error object whose text is being freed when the crash occurs looks like
this:
$2 = {text = 0x7f66e8104e30 "recv: Resource temporarily unavailable",
  root = 0x0, code = -2}
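Commenting out the free() avoids the crash but would leak the text on every
error, so it is a diagnostic rather than a fix. For discussion, here is a
minimal sketch of the set/clear pattern the backtrace implies (set calls
clear, clear frees the text). It is not the proton source: sketch_error_t,
sketch_error_clear and sketch_error_set are hypothetical names, and only the
field names (text, root, code) come from the gdb output above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the error object; field names follow the
   gdb dump in this report, everything else is an assumption. */
typedef struct {
    char *text;
    void *root;
    int code;
} sketch_error_t;

/* Free the text and reset the pointer, so that a second clear on the
   same thread becomes a harmless free(NULL). */
static void sketch_error_clear(sketch_error_t *error)
{
    error->code = 0;
    free(error->text);
    error->text = NULL;
}

/* Replace any previous error with a private copy of the new text. */
static int sketch_error_set(sketch_error_t *error, int code, const char *text)
{
    sketch_error_clear(error);
    if (code) {
        size_t n = strlen(text) + 1;
        error->code = code;
        error->text = malloc(n);
        if (error->text)
            memcpy(error->text, text, n);
    }
    return code;
}
```

Used from a single thread, this pattern cannot free the same pointer twice.
If the real code is similar, a double free would need something else to go
wrong, for example two threads setting an error on the same object at the
same time. That is a guess at where to look, not something this report
confirms.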
My dispatch test creates a router network and then repeatedly kills and
restarts a randomly selected router. Since this proton checkin it almost never
gets through 5 iterations without this crash. After I commented out that line,
it got through more than 500 iterations before I stopped it.