[ 
https://issues.apache.org/jira/browse/TS-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14271543#comment-14271543
 ] 

Sudheer Vinukonda edited comment on TS-3285 at 1/23/15 4:59 PM:
----------------------------------------------------------------

Further debugging showed that 100 cont implementation calls {{do_io_write()}} 
with a reader on MIOBuffer {{ua_entry->write_buffer}}. The point where this 
buffer {{ua_entry->write_buffer}} is freed seems to be 
{{VC_EVENT_READ_COMPLETE/HTTP_TUNNEL_EVENT_PRECOMPLETE}} for the POST data being
read from the client, instead of a WRITE_COMPLETE event for the 100 cont's 
{{do_io_write()}} operation. This could result in premature free'ing of the 
buffer, while the WRITE is not complete yet. 

Note how do_io_write/write_to_net (that still has a reference to the _writer of 
the 100 cont's MIOBuffer), internally may end up allocating iobuf via 
write_avail(). This piece of code (which could get executed after the 100 cont 
buffer is free'd in read_complete event above) could result in accessing the 
MIOBuffer after it's freed (and is on the free list).

{code}
/home/y/bin/traffic_server(new_IOBufferData_internal(char const*, long,
AllocType)+0x51)[0x4f5bef]
/home/y/bin/traffic_server(IOBufferBlock::alloc(long)+0x2c)[0x4f5eec]
/home/y/bin/traffic_server(MIOBuffer::append_block(long)+0x3a)[0x51dee2]
/home/y/bin/traffic_server(MIOBuffer::add_block()+0x22)[0x51df1a]
/home/y/bin/traffic_server(MIOBuffer::check_add_block()+0x4b)[0x6cbd0b]
/home/y/bin/traffic_server(MIOBuffer::write_avail()+0x18)[0x6cbd86]
/home/y/bin/traffic_server(write_to_net_io(NetHandler*, UnixNetVConnection*,
EThread*)+0x397)[0x736175]
/home/y/bin/traffic_server(write_to_net(NetHandler*, UnixNetVConnection*,
EThread*)+0x80)[0x735dd7]
/home/y/bin/traffic_server(NetHandler::mainNetEvent(int,
Event*)+0x654)[0x72f5e2]
/home/y/bin/traffic_server(Continuation::handleEvent(int,
void*)+0x6c)[0x4f5ad8]
/home/y/bin/traffic_server(EThread::process_event(Event*, int)+0xc8)[0x756046]
/home/y/bin/traffic_server(EThread::execute()+0x3dc)[0x756550]
/home/y/bin/traffic_server[0x7555c4]
/lib64/libpthread.so.0(+0x3036c079d1)[0x2aeb5b7e89d1]
/lib64/libc.so.6(clone+0x6d)[0x30364e8b6d]
{code}


was (Author: sudheerv):
Further debugging showed that 100 cont implementation calls do_io_write() with 
a reader on MIOBuffer (ua_entry->write_buffer). The point where this buffer 
(ua_entry->write_buffer) is freed seems to be 
VC_EVENT_READ_COMPLETE/HTTP_TUNNEL_EVENT_PRECOMPLETE for the POST data being
read from the client, instead of a WRITE_COMPLETE event for the 100 cont's 
do_io_write() operation. This could result in premature free'ing of the buffer, 
while the WRITE is not complete yet. 

Note how do_io_write/write_to_net (that still has a reference to the _writer of 
the 100 cont's MIOBuffer), internally may end up allocating iobuf via 
write_avail(). This piece of code (which could get executed after the 100 cont 
buffer is free'd in read_complete event above) could result in accessing the 
MIOBuffer after it's freed (and is on the free list).

{code}
/home/y/bin/traffic_server(new_IOBufferData_internal(char const*, long,
AllocType)+0x51)[0x4f5bef]
/home/y/bin/traffic_server(IOBufferBlock::alloc(long)+0x2c)[0x4f5eec]
/home/y/bin/traffic_server(MIOBuffer::append_block(long)+0x3a)[0x51dee2]
/home/y/bin/traffic_server(MIOBuffer::add_block()+0x22)[0x51df1a]
/home/y/bin/traffic_server(MIOBuffer::check_add_block()+0x4b)[0x6cbd0b]
/home/y/bin/traffic_server(MIOBuffer::write_avail()+0x18)[0x6cbd86]
/home/y/bin/traffic_server(write_to_net_io(NetHandler*, UnixNetVConnection*,
EThread*)+0x397)[0x736175]
/home/y/bin/traffic_server(write_to_net(NetHandler*, UnixNetVConnection*,
EThread*)+0x80)[0x735dd7]
/home/y/bin/traffic_server(NetHandler::mainNetEvent(int,
Event*)+0x654)[0x72f5e2]
/home/y/bin/traffic_server(Continuation::handleEvent(int,
void*)+0x6c)[0x4f5ad8]
/home/y/bin/traffic_server(EThread::process_event(Event*, int)+0xc8)[0x756046]
/home/y/bin/traffic_server(EThread::execute()+0x3dc)[0x756550]
/home/y/bin/traffic_server[0x7555c4]
/lib64/libpthread.so.0(+0x3036c079d1)[0x2aeb5b7e89d1]
/lib64/libc.so.6(clone+0x6d)[0x30364e8b6d]
{code}

> Seg fault when 100 CONT handling is enabled
> -------------------------------------------
>
>                 Key: TS-3285
>                 URL: https://issues.apache.org/jira/browse/TS-3285
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 5.0.1
>            Reporter: Sudheer Vinukonda
>            Assignee: Sudheer Vinukonda
>             Fix For: 5.3.0
>
>
> With 100 CONT handling enabled in our ats5 production hosts, we are seeing 
> the below seg fault.
> {code}
> (gdb) bt
> #0  0x000000316e432925 in raise (sig=6) at 
> ../nptl/sysdeps/unix/sysv/linux/raise.c:64
> #1  0x000000316e434105 in abort () at abort.c:92
> #2  0x00002b6869944458 in ink_die_die_die (retval=1) at ink_error.cc:43
> #3  0x00002b6869944525 in ink_fatal_va(int, const char *, typedef 
> __va_list_tag __va_list_tag *) (return_code=1, 
>     message_format=0x2b68699518d8 "%s:%d: failed assert `%s`", 
> ap=0x2b686bb1bf00) at ink_error.cc:65
> #4  0x00002b68699445ee in ink_fatal (return_code=1, 
> message_format=0x2b68699518d8 "%s:%d: failed assert `%s`") at ink_error.cc:73
> #5  0x00002b6869943160 in _ink_assert (expression=0x7a984e "buf_index_inout 
> == NULL", file=0x7a96e3 "MIME.cc", line=2676) at ink_assert.cc:37
> #6  0x000000000068212d in mime_mem_print (src_d=0x2b686bb1c090 "HTTP/1.1", 
> src_l=8, buf_start=0x0, buf_length=-1811908575, 
>     buf_index_inout=0x2b686bb1c1bc, buf_chars_to_skip_inout=0x2b686bb1c1b8) 
> at MIME.cc:2676
> #7  0x0000000000671df3 in http_version_print (version=65537, buf=0x0, 
> bufsize=-1811908575, bufindex=0x2b686bb1c1bc, dumpoffset=0x2b686bb1c1b8)
>     at HTTP.cc:415
> #8  0x00000000006724fb in http_hdr_print (heap=0x2b6881019010, 
> hdr=0x2b6881019098, buf=0x0, bufsize=-1811908575, bufindex=0x2b686bb1c1bc, 
>     dumpoffset=0x2b686bb1c1b8) at HTTP.cc:539
> #9  0x00000000004f259b in HTTPHdr::print (this=0x2b68ac06f058, buf=0x0, 
> bufsize=-1811908575, bufindex=0x2b686bb1c1bc, dumpoffset=0x2b686bb1c1b8)
>     at ./hdrs/HTTP.h:897
> #10 0x00000000005da903 in HttpSM::write_header_into_buffer 
> (this=0x2b68ac06e910, h=0x2b68ac06f058, b=0x2f163e0) at HttpSM.cc:5554
> #11 0x00000000005e5129 in HttpSM::write_response_header_into_buffer 
> (this=0x2b68ac06e910, h=0x2b68ac06f058, b=0x2f163e0) at HttpSM.h:594
> #12 0x00000000005dcef2 in HttpSM::setup_server_transfer (this=0x2b68ac06e910) 
> at HttpSM.cc:6295
> #13 0x00000000005cd336 in HttpSM::handle_api_return (this=0x2b68ac06e910) at 
> HttpSM.cc:1554
> #14 0x00000000005cd040 in HttpSM::state_api_callout (this=0x2b68ac06e910, 
> event=0, data=0x0) at HttpSM.cc:1446
> #15 0x00000000005d89b7 in HttpSM::do_api_callout_internal 
> (this=0x2b68ac06e910) at HttpSM.cc:4858
> #16 0x00000000005dfdec in HttpSM::set_next_state (this=0x2b68ac06e910) at 
> HttpSM.cc:7115
> #17 0x00000000005df0ec in HttpSM::call_transact_and_set_next_state 
> (this=0x2b68ac06e910, f=0) at HttpSM.cc:6900
> #18 0x00000000005cd1e3 in HttpSM::handle_api_return (this=0x2b68ac06e910) at 
> HttpSM.cc:1514
> #19 0x00000000005cd040 in HttpSM::state_api_callout (this=0x2b68ac06e910, 
> event=60000, data=0x0) at HttpSM.cc:1446
> #20 0x00000000005cc7d6 in HttpSM::state_api_callback (this=0x2b68ac06e910, 
> event=60000, data=0x0) at HttpSM.cc:1264
> #21 0x0000000000515bb5 in TSHttpTxnReenable (txnp=0x2b68ac06e910, 
> event=TS_EVENT_HTTP_CONTINUE) at InkAPI.cc:5554
> #22 0x00002b68806f945b in transform_plugin 
> (event=TS_EVENT_HTTP_READ_RESPONSE_HDR, edata=0x2b68ac06e910) at gzip.cc:693
> #23 0x000000000050a40c in INKContInternal::handle_event (this=0x2ea2bb0, 
> event=60006, edata=0x2b68ac06e910) at InkAPI.cc:1000
> #24 0x00000000004f597e in Continuation::handleEvent (this=0x2ea2bb0, 
> event=60006, data=0x2b68ac06e910) at 
> ../iocore/eventsystem/I_Continuation.h:146
> #25 0x000000000050ac53 in APIHook::invoke (this=0x2ea3c80, event=60006, 
> edata=0x2b68ac06e910) at InkAPI.cc:1219
> #26 0x00000000005ccda9 in HttpSM::state_api_callout (this=0x2b68ac06e910, 
> event=0, data=0x0) at HttpSM.cc:1371
> #27 0x00000000005d89b7 in HttpSM::do_api_callout_internal 
> (this=0x2b68ac06e910) at HttpSM.cc:4858
> #28 0x00000000005e54fc in HttpSM::do_api_callout (this=0x2b68ac06e910) at 
> HttpSM.cc:448
> #29 0x00000000005ce277 in HttpSM::state_read_server_response_header 
> (this=0x2b68ac06e910, event=100, data=0x2b68a802afc0) at HttpSM.cc:1861
> #30 0x00000000005d0582 in HttpSM::main_handler (this=0x2b68ac06e910, 
> event=100, data=0x2b68a802afc0) at HttpSM.cc:2507
> #31 0x00000000004f597e in Continuation::handleEvent (this=0x2b68ac06e910, 
> event=100, data=0x2b68a802afc0) at ../iocore/eventsystem/I_Continuation.h:146
> #32 0x0000000000531d7d in PluginVC::process_read_side (this=0x2b68a802aec0, 
> other_side_call=true) at PluginVC.cc:671
> #33 0x0000000000531612 in PluginVC::process_write_side (this=0x2b68a802b0a8, 
> other_side_call=false) at PluginVC.cc:567
> #34 0x00000000005303b4 in PluginVC::main_handler (this=0x2b68a802b0a8, 
> event=1, data=0x2b68a80644f0) at PluginVC.cc:212
> (gdb) f 12
> #12 0x00000000005dcef2 in HttpSM::setup_server_transfer (this=0x2b68ac06e910) 
> at HttpSM.cc:6295
> 6295  HttpSM.cc: No such file or directory.
>       in HttpSM.cc
> (gdb) info local
> __func__ = "setup_server_transfer"
> hdr_size = 7902907
> buf = 0x2f163e0
> action = TCA_PASSTHRU_DECHUNKED_CONTENT
> alloc_index = 6
> nbytes = 47727483405024
> buf_start = 0x2f163f8
> p = 0x2b6888067680
> (gdb) p *buf
> $15 = {size_index = 6, water_mark = 32768, _writer = {m_ptr = 
> 0x2b68a402eda0}, readers = {{accessor = 0x0, mbuf = 0x2f163e0, block = {
>         m_ptr = 0x2b68a402eda0}, start_offset = 0, size_limit = 
> 9223372036854775807}, {accessor = 0x0, mbuf = 0x0, block = {m_ptr = 0x0}, 
>       start_offset = 0, size_limit = 9223372036854775807}, {accessor = 0x0, 
> mbuf = 0x0, block = {m_ptr = 0x0}, start_offset = 0, 
>       size_limit = 9223372036854775807}, {accessor = 0x0, mbuf = 0x0, block = 
> {m_ptr = 0x0}, start_offset = 0, size_limit = 9223372036854775807}, {
>       accessor = 0x0, mbuf = 0x0, block = {m_ptr = 0x0}, start_offset = 0, 
> size_limit = 9223372036854775807}}, 
>   _location = 0x78c5a0 "memory/IOBuffer/HttpSM.cc:6267"}
> (gdb) p *buf->_writer->m_ptr
> $16 = {<RefCountObj> = {<ForceVFPTToTop> = {_vptr.ForceVFPTToTop = 0x761690}, 
> m_refcount = 2}, _start = 0x0, _end = 0x0, 
>   _buf_end = 0x2b6894007821 "\345v", _location = 0x78aba8 
> "memory/IOBuffer/HttpSM.cc:1909", data = {m_ptr = 0x2b68a8081650}, next = {
>     m_ptr = 0x2b6890030290}}
> (gdb) p *buf->_writer->m_ptr->next->m_ptr
> $17 = {<RefCountObj> = {<ForceVFPTToTop> = {_vptr.ForceVFPTToTop = 0x761690}, 
> m_refcount = 1}, _start = 0x2b68a013c000 "", _end = 0x2b68a013c000 "", 
>   _buf_end = 0x2b68a013d000 
> "jI0NiZhZFR5cGU9dGFnWDQ4ODgzMS0xMDEtbWJsMS1TMDImdUlQPTUuNjguNjQuMjAmYWRTaXRlTGluaz0xJmFkRmVlZElEPSUyZmQlMmZzZWFyY2glMmZwJTJmdXMlMmZwbWc2JTJmcmxiJTJmeG1sJTJmJmFkWU1rdD11ayZ1YT1Nb3ppbGxhJTI1MkY1LjAlMjUyM"...,
>  _location = 0x78c5a0 "memory/IOBuffer/HttpSM.cc:6267", data = {
>     m_ptr = 0x2b68a40a7d80}, next = {m_ptr = 0x0}}
> {code}
> Analyzing further, it seems that the issue is caused by possibly accessing a 
> MIOBuffer that's on the free list (or after it's been free'd).  
> Some relevant details from debugging:
> Confirmed with the help of below piece of test code, that the _writer is 
> non-null, right after calling new_empty_MIOBuffer(). free_MIOBuffer() sets 
> _writer to null, so, the only way, this can be non-null right after 
> new_empty_MIOBuffer() is if the free'd buffer was incorrectly accessed. 
> {code}
> diff --git a/proxy/http/HttpSM.cc b/proxy/http/HttpSM.cc
> index 932ef97..bffc411 100644
> --- a/proxy/http/HttpSM.cc
> +++ b/proxy/http/HttpSM.cc
> @@ -6265,6 +6265,10 @@ HttpSM::setup_server_transfer()
>    MIOBuffer *buf = new_MIOBuffer(alloc_index);
>  #else
>    MIOBuffer *buf = new_empty_MIOBuffer(alloc_index);
> +  if (!buf->empty()) {
> +    Error ("buf not empty after new_empty_MIOBuffer, buf %p", buf);
> +    buf->dealloc();
> +  }
>    buf->append_block(HTTP_HEADER_BUFFER_SIZE_INDEX);
>  #endif
>    buf->water_mark = (int) t_state.txn_conf->default_buffer_water_mark;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to