[
https://issues.apache.org/jira/browse/TS-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13699652#comment-13699652
]
Leif Hedstrom commented on TS-1990:
-----------------------------------
The patch seems very reasonable. Should we commit this?
> core at CacheContinuation::handleDisposeEvent()
> -----------------------------------------------
>
> Key: TS-1990
> URL: https://issues.apache.org/jira/browse/TS-1990
> Project: Traffic Server
> Issue Type: Bug
> Components: Cache, Clustering
> Affects Versions: 3.3.4
> Reporter: Yunkai Zhang
> Labels: crash
> Fix For: 3.3.5
>
> Attachments:
> 0001-TS-1990-Fix-core-at-CacheContinuation-handleDisposeE.patch
>
>
> {code}
> Core was generated by `/usr/bin/traffic_server -M --httpport
> 8080:fd=10,80:fd=11'.
> Program terminated with signal 11, Segmentation fault.
> #0 CacheContinuation::handleDisposeEvent (event=<value optimized out>,
> cc=0x2b2a34575870) at ClusterCache.cc:1969
> 1969 cc->tunnel->vioTarget->reenable();
> Missing separate debuginfos, use: debuginfo-install
> expat-2.0.1-9.1.el6.x86_64 glibc-2.12-1.47.el6.x86_64
> keyutils-libs-1.4-3.el6.x86_64 krb5-libs-1.9-22.el6.x86_64
> libcom_err-1.41.12-11.el6.x86_64 libgcc-4.4.6-3.el6.x86_64
> libselinux-2.0.94-5.2.el6.x86_64 libstdc++-4.4.6-3.el6.x86_64
> ncurses-libs-5.7-3.20090208.el6.x86_64 openssl-1.0.0-20.el6.x86_64
> pcre-7.8-3.1.el6.x86_64 readline-6.0-3.el6.x86_64 tcl-8.5.7-6.el6.x86_64
> xz-libs-4.999.9-0.3.beta.20091007git.el6.x86_64 zlib-1.2.3-27.el6.x86_64
> (gdb) l
> 1964 // Start tunnel by reenabling source and target VCs.
> 1965 cc->tunnel->in_handleDisposeEvent = true;
> 1966
> 1967 cc->tunnel->vioSource->nbytes =
> getObjectSize(cc->tunnel->vioSource->vc_server, cc->request_opcode, 0);
> 1968 cc->tunnel->vioSource->reenable_re();
> 1969 cc->tunnel->vioTarget->reenable();
> 1970
> 1971 // Tell tunnel event we are gone
> 1972 cc->tunnel_cont->action.continuation = 0;
> 1973
> (gdb) p cc->tunnel
> $1 = (OneWayTunnel *) 0x0
> (gdb) bt
> #0 CacheContinuation::handleDisposeEvent (event=<value optimized out>,
> cc=0x2b2a34575870) at ClusterCache.cc:1969
> #1 0x0000000000607421 in CacheContinuation::disposeOfDataBuffer (d=<value
> optimized out>) at ClusterCache.cc:1946
> #2 0x0000000000619d9b in ClusterControl::free_data (this=0x2b2b201e8440) at
> ClusterHandlerBase.cc:138
> #3 0x00000000006130d8 in ClusterHandler::update_channels_written
> (this=0x2b2b101e22d0, bump_unhandled_channels=<value optimized out>)
> at ClusterHandler.cc:1570
> #4 0x00000000006182ea in ClusterHandler::process_write (this=0x2b2b101e22d0,
> now=1372650704886648000, only_write_control_msgs=false)
> at ClusterHandler.cc:3080
> #5 0x0000000000618874 in ClusterHandler::mainClusterEvent
> (this=0x2b2b101e22d0, event=<value optimized out>, e=<value optimized out>)
> at ClusterHandler.cc:2595
> #6 0x000000000061ba5c in handleEvent (this=0x2b2b101e2f90) at
> ../../iocore/eventsystem/I_Continuation.h:146
> #7 ClusterState::IOComplete (this=0x2b2b101e2f90) at
> ClusterHandlerBase.cc:585
> #8 0x000000000061bcb4 in ClusterState::doIO_write_event
> (this=0x2b2b101e2f90, event=103, d=0x2b2a38011dd0) at
> ClusterHandlerBase.cc:544
> #9 0x0000000000687f11 in handleEvent (event=<value optimized out>,
> nh=0x2b2a2ac9e268, vc=0x2b2a38011c60) at
> ../../iocore/eventsystem/I_Continuation.h:146
> #10 write_signal_and_update (event=<value optimized out>, nh=0x2b2a2ac9e268,
> vc=0x2b2a38011c60) at UnixNetVConnection.cc:153
> #11 write_signal_done (event=<value optimized out>, nh=0x2b2a2ac9e268,
> vc=0x2b2a38011c60) at UnixNetVConnection.cc:180
> #12 0x000000000068b4e7 in write_to_net_io (nh=0x2b2a2ac9e268,
> vc=0x2b2a38011c60, thread=<value optimized out>) at UnixNetVConnection.cc:479
> #13 0x00000000006826d6 in NetHandler::mainNetEvent (this=0x2b2a2ac9e268,
> event=<value optimized out>, e=<value optimized out>) at UnixNet.cc:394
> #14 0x00000000006ab654 in handleEvent (this=0x2b2a2ac9b010, e=0x2b2a14164e20,
> calling_code=5) at I_Continuation.h:146
> #15 EThread::process_event (this=0x2b2a2ac9b010, e=0x2b2a14164e20,
> calling_code=5) at UnixEThread.cc:142
> #16 0x00000000006abff3 in EThread::execute (this=0x2b2a2ac9b010) at
> UnixEThread.cc:266
> #17 0x00000000006aa5f2 in spawn_thread_internal (a=0x2b2a141cca50) at
> Thread.cc:88
> #18 0x0000003984e077f1 in start_thread () from /lib64/libpthread.so.0
> #19 0x0000003984ae570d in clone () from /lib64/libc.so.6
> {code}
> I have found the root cause: cc->tunnel->vioSource->reenable_re() may free
> the tunnel in some cases. The following stack trace is debugging info I
> added inside tunnelClosedEvent():
> {code}
> [Jul 1 11:51:44.886] Server {0x2b2a2b9a7700} NOTE: ---start---
> /usr/bin/traffic_server - STACK TRACE:
> /usr/bin/traffic_server(_ZN17CacheContinuation17tunnelClosedEventEiPv+0xf0)[0x606ce0]
> /usr/bin/traffic_server(_ZN12OneWayTunnel10startEventEiPv+0x4a1)[0x5e3f51]
> /usr/bin/traffic_server(_ZN7CacheVC8calluserEi+0x2b)[0x65db7b]
> /usr/bin/traffic_server(_ZN7CacheVC12openReadMainEiP5Event+0xceb)[0x65c1ab]
> /usr/bin/traffic_server(_ZN17CacheContinuation18handleDisposeEventEiPS_+0x98)[0x606748]
> /usr/bin/traffic_server(_ZN17CacheContinuation19disposeOfDataBufferEPv+0x41)[0x607421]
> /usr/bin/traffic_server(_ZN14ClusterControl9free_dataEv+0x2b)[0x619d9b]
> /usr/bin/traffic_server(_ZN14ClusterHandler23update_channels_writtenEb+0x208)[0x6130d8]
> /usr/bin/traffic_server[0x6182ea]
> /usr/bin/traffic_server(_ZN14ClusterHandler16mainClusterEventEiP5Event+0x1e4)[0x618874]
> /usr/bin/traffic_server(_ZN12ClusterState10IOCompleteEv+0x8c)[0x61ba5c]
> /usr/bin/traffic_server(_ZN12ClusterState16doIO_write_eventEiPv+0x144)[0x61bcb4]
> /usr/bin/traffic_server[0x687f11]
> /usr/bin/traffic_server(_Z15write_to_net_ioP10NetHandlerP18UnixNetVConnectionP7EThread+0x847)[0x68b4e7]
> /usr/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x286)[0x6826d6]
> /usr/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0xb4)[0x6ab654]
> /usr/bin/traffic_server(_ZN7EThread7executeEv+0x4d3)[0x6abff3]
> /usr/bin/traffic_server[0x6aa5f2]
> /lib64/libpthread.so.0[0x3984e077f1]
> /lib64/libc.so.6(clone+0x6d)[0x3984ae570d]
> [Jul 1 11:51:44.889] Server {0x2b2a2b9a7700} NOTE: ---end---
> {code}
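
To make the failure mode above concrete, here is a minimal, self-contained sketch of the pattern: reenabling the source side can call back into the owner, which frees the tunnel before the target side is touched. This is not the attached patch and not the real Traffic Server code. Every type and function below (Continuation, Tunnel, Vio, tunnel_closed, start_tunnel, make_cc) is a hypothetical stand-in, and the null check is only one possible way to guard the second call.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the real Traffic Server classes.
struct Tunnel;

struct Continuation {
  Tunnel *tunnel = nullptr;
};

struct Vio {
  Continuation *owner = nullptr;  // continuation told when the tunnel closes
  bool close_on_reenable = false; // simulate "reenable completes the tunnel"
  int reenabled = 0;
  void reenable();
};

struct Tunnel {
  Vio source;
  Vio target;
};

// Plays the role of tunnelClosedEvent(): frees the tunnel, clears the pointer.
void tunnel_closed(Continuation *cc) {
  delete cc->tunnel;
  cc->tunnel = nullptr;
}

void Vio::reenable() {
  ++reenabled;
  if (close_on_reenable) {
    // After this call *this is gone (the Vio lives inside the Tunnel),
    // so nothing below may touch members.
    tunnel_closed(owner);
  }
}

// Returns true if both sides were reenabled, false if the tunnel was freed
// while the source side was being reenabled.
bool start_tunnel(Continuation *cc) {
  assert(cc != nullptr && cc->tunnel != nullptr);
  cc->tunnel->source.reenable(); // may free cc->tunnel re-entrantly
  if (cc->tunnel == nullptr) {
    return false; // tunnel already closed; do not dereference it again
  }
  cc->tunnel->target.reenable();
  return true;
}

// Builds a continuation with a fresh tunnel wired back to it.
Continuation *make_cc(bool close_on_reenable) {
  Continuation *cc = new Continuation;
  cc->tunnel = new Tunnel;
  cc->tunnel->source.owner = cc;
  cc->tunnel->source.close_on_reenable = close_on_reenable;
  cc->tunnel->target.owner = cc;
  return cc;
}
```

Without the check in start_tunnel(), the second call would dereference a null tunnel pointer, which matches the `cc->tunnel == 0x0` state seen in gdb at ClusterCache.cc:1969.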