[
https://issues.apache.org/jira/browse/TS-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13144936#comment-13144936
]
weijin commented on TS-857:
---------------------------
An HttpSM can be called back on different threads. When terminating the SM, we
call do_io_close to close the net_vc, but net_vc::do_io_close is not thread
safe. Inactivity_cop and NetHandler can also close the net_vc when
net_vc::closed is set, without locking the mutex of the net_vc. I hope this
also explains TS-934.
I have recently read amc's patch for TS-934 carefully; he contributed a lot to
solving the problem. I have two questions: 1) should we lock the mutex of the
net_vc in Inactivity_cop and NetHandler, and 2) should one thread be allowed to
close a net_vc that belongs to a different thread?
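To make question 1 concrete, here is a minimal sketch of what "lock the mutex before closing" could look like, written with plain std::mutex rather than the Traffic Server locking macros. LockedVC, try_close and cop_skips_locked_connection are hypothetical names used only for this illustration; the real Inactivity_cop and NetHandler code is not shown here and may need a different retry policy.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a net_vc: 'closed' is guarded by 'mutex'.
struct LockedVC {
  std::mutex mutex;
  bool closed = false;
};

// Close the connection only if its mutex can be taken right now.
// Returns false when another thread holds the lock, so the caller
// (for example a periodic cop) can retry on its next pass.
bool try_close(LockedVC &vc) {
  std::unique_lock<std::mutex> lock(vc.mutex, std::try_to_lock);
  if (!lock.owns_lock()) {
    return false;
  }
  vc.closed = true;
  return true;
}

// Demonstration: a "cop" thread must not close a connection while a
// handler on another thread holds its mutex, but may close it later.
bool cop_skips_locked_connection() {
  LockedVC vc;
  bool closed_while_busy = true;
  {
    std::lock_guard<std::mutex> busy(vc.mutex); // a handler is running
    std::thread cop([&] { closed_while_busy = try_close(vc); });
    cop.join();
  }
  // try_lock is allowed to fail spuriously, so retry a bounded number of times.
  bool closed_later = false;
  for (int i = 0; i < 1000 && !closed_later; ++i) {
    closed_later = try_close(vc);
  }
  return !closed_while_busy && closed_later && vc.closed;
}
```

The point of the try-lock form is that the cop never blocks on a busy connection; it just skips it and comes back later.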
I am inclined to add some code to net_vc::do_io_close and net_vc::mainEvent to
make them thread safe:
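UnixNetVConnection::do_io_close
{
  // Called from a thread that does not own this vc: hand the close over
  // to the owning thread instead of closing here.
  if (thread != this_ethread()) {
    thread->schedule_imm(this, EVENT_VC_TRY_TO_CLOSE);
    return;
  }
  disable_read(this);
  disable_write(this);
  .....
  close_UnixNetVConnection(this, t);
}
UnixNetVConnection::mainEvent(int event, void *e)
{
  // Close request forwarded from another thread; we are now running on
  // the owning thread, so do_io_close takes the normal path.
  if (event == EVENT_VC_TRY_TO_CLOSE) {
    do_io_close();
    return EVENT_DONE;
  }
  ....
  // check the active and inactivity timeout
  ....
}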
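The idea above can be tried outside Traffic Server with a small self-contained sketch. Everything here is an assumption made for illustration: Worker stands in for an event thread with its own task queue, SketchVC stands in for the net_vc, and schedule() plays the role that schedule_imm plays in the proposal. It does not address the lifetime of the vc between scheduling and running the forwarded close, which the real change would have to handle.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for an event thread: runs queued tasks in order.
class Worker {
public:
  Worker() : thr_([this] { run(); }) { id_ = thr_.get_id(); }
  ~Worker() { stop(); }

  // Queue a task to run on this worker's thread.
  void schedule(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> g(m_);
      q_.push_back(std::move(task));
    }
    cv_.notify_one();
  }

  // Run everything queued so far, then stop and join the thread.
  void stop() {
    schedule(nullptr); // an empty task is the stop marker
    if (thr_.joinable()) {
      thr_.join();
    }
  }

  std::thread::id id() const { return id_; }

private:
  void run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return !q_.empty(); });
        task = std::move(q_.front());
        q_.pop_front();
      }
      if (!task) {
        return;
      }
      task();
    }
  }

  std::mutex m_;
  std::condition_variable cv_;
  std::deque<std::function<void()>> q_;
  std::thread::id id_;
  std::thread thr_; // declared last so the queue exists before the thread starts
};

// Hypothetical stand-in for a net_vc owned by one worker thread.
struct SketchVC {
  Worker *owner;
  bool closed = false;       // only touched on the owner thread
  std::thread::id closed_on; // records which thread did the close

  explicit SketchVC(Worker *w) : owner(w) {}

  void do_io_close() {
    // Not on the owning thread: forward the close and return.
    if (std::this_thread::get_id() != owner->id()) {
      owner->schedule([this] { do_io_close(); });
      return;
    }
    if (closed) {
      return; // a second forwarded close is harmless
    }
    closed = true;
    closed_on = std::this_thread::get_id();
  }
};

// Demonstration: a close requested from a foreign thread is executed
// on the owning worker thread.
bool close_from_foreign_thread_runs_on_owner() {
  Worker owner;
  const std::thread::id owner_id = owner.id();
  SketchVC vc(&owner);
  std::thread other([&vc] { vc.do_io_close(); });
  other.join();
  owner.stop(); // joining makes the fields of vc safe to read here
  return vc.closed && vc.closed_on == owner_id;
}
```

With this shape the close path itself only ever runs on one thread, so it needs no extra locking; the cost is that a cross-thread close becomes asynchronous.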
> Crash Report: HttpTunnel::chain_abort_all -> HttpServerSession::do_io_close
> -> UnixNetVConnection::do_io_close
> --------------------------------------------------------------------------------------------------------------
>
> Key: TS-857
> URL: https://issues.apache.org/jira/browse/TS-857
> Project: Traffic Server
> Issue Type: Bug
> Components: HTTP, Network
> Affects Versions: 3.1.0
> Environment: my branch, which is roughly the same as 3.0.x
> Reporter: Zhao Yongming
> Assignee: weijin
> Fix For: 3.1.2
>
>
> Here is the bt from the crash. Some of the information is missing because we
> did not build with the --enable-debug configure option.
> {code}
> [New process 7532]
> #0 ink_stack_trace_get (stack=<value optimized out>, len=<value optimized
> out>, signalhandler_frame=<value optimized out>)
> at ink_stack_trace.cc:68
> 68 fp = (void **) (*fp);
> (gdb) bt
> #0 ink_stack_trace_get (stack=<value optimized out>, len=<value optimized
> out>, signalhandler_frame=<value optimized out>)
> at ink_stack_trace.cc:68
> #1 0x00002ba641dccef1 in ink_stack_trace_dump (sighandler_frame=<value
> optimized out>) at ink_stack_trace.cc:114
> #2 0x00000000004df020 in signal_handler (sig=<value optimized out>) at
> signals.cc:225
> #3 <signal handler called>
> #4 0x00000000006a1ea9 in UnixNetVConnection::do_io_close (this=0x1cc9bd20,
> alerrno=<value optimized out>)
> at ../../iocore/eventsystem/I_Lock.h:297
> #5 0x000000000051f1d0 in HttpServerSession::do_io_close
> (this=0x2aaab0042c80, alerrno=20600) at HttpServerSession.cc:127
> #6 0x000000000056d1e9 in HttpTunnel::chain_abort_all (this=0x2aabeeffdd70,
> p=0x2aabeeffdf68) at HttpTunnel.cc:1300
> #7 0x00000000005269ca in HttpSM::tunnel_handler_ua (this=0x2aabeeffc070,
> event=104, c=0x2aabeeffdda8) at HttpSM.cc:2987
> #8 0x0000000000571dfc in HttpTunnel::consumer_handler (this=0x2aabeeffdd70,
> event=104, c=0x2aabeeffdda8) at HttpTunnel.cc:1232
> #9 0x0000000000572032 in HttpTunnel::main_handler (this=0x2aabeeffdd70,
> event=1088608784, data=<value optimized out>)
> at HttpTunnel.cc:1456
> #10 0x00000000006a6307 in write_to_net_io (nh=0x2aaaab12d688, vc=0x1cc876e0,
> thread=<value optimized out>)
> at ../../iocore/eventsystem/I_Continuation.h:146
> #11 0x000000000069ce97 in NetHandler::mainNetEvent (this=0x2aaaab12d688,
> event=<value optimized out>, e=0x171c1ed0) at UnixNet.cc:405
> #12 0x00000000006cddaf in EThread::process_event (this=0x2aaaab12c010,
> e=0x171c1ed0, calling_code=5) at I_Continuation.h:146
> #13 0x00000000006ce6bc in EThread::execute (this=0x2aaaab12c010) at
> UnixEThread.cc:262
> #14 0x00000000006cd0ee in spawn_thread_internal (a=0x171b58f0) at Thread.cc:88
> #15 0x0000003c33c064a7 in start_thread () from /lib64/libpthread.so.0
> #16 0x0000003c330d3c2d in clone () from /lib64/libc.so.6
> (gdb) info f
> Stack level 0, frame at 0x40e2b790:
> rip = 0x2ba641dccdf3 in ink_stack_trace_get(void**, int, int)
> (ink_stack_trace.cc:68); saved rip 0x2ba641dccef1
> called by frame at 0x40e2bbe0
> source language c++.
> Arglist at 0x40e2b770, args: stack=<value optimized out>, len=<value
> optimized out>, signalhandler_frame=<value optimized out>
> Locals at 0x40e2b770, Previous frame's sp is 0x40e2b790
> Saved registers:
> rbx at 0x40e2b778, rbp at 0x40e2b780, rip at 0x40e2b788
> (gdb)
> {code}