[jira] [Updated] (TS-3285) Seg fault when 100 CONT handling is enabled

2015-01-23 Thread Sudheer Vinukonda (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudheer Vinukonda updated TS-3285:
--
Description: 
With 100 CONT handling enabled on our ATS 5 production hosts, we are seeing the 
seg fault below.

{code}
(gdb) bt
#0  0x00316e432925 in raise (sig=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00316e434105 in abort () at abort.c:92
#2  0x2b6869944458 in ink_die_die_die (retval=1) at ink_error.cc:43
#3  0x2b6869944525 in ink_fatal_va(int, const char *, typedef __va_list_tag 
__va_list_tag *) (return_code=1, 
message_format=0x2b68699518d8 %s:%d: failed assert `%s`, 
ap=0x2b686bb1bf00) at ink_error.cc:65
#4  0x2b68699445ee in ink_fatal (return_code=1, 
message_format=0x2b68699518d8 %s:%d: failed assert `%s`) at ink_error.cc:73
#5  0x2b6869943160 in _ink_assert (expression=0x7a984e buf_index_inout == 
NULL, file=0x7a96e3 MIME.cc, line=2676) at ink_assert.cc:37
#6  0x0068212d in mime_mem_print (src_d=0x2b686bb1c090 HTTP/1.1, 
src_l=8, buf_start=0x0, buf_length=-1811908575, 
buf_index_inout=0x2b686bb1c1bc, buf_chars_to_skip_inout=0x2b686bb1c1b8) at 
MIME.cc:2676
#7  0x00671df3 in http_version_print (version=65537, buf=0x0, 
bufsize=-1811908575, bufindex=0x2b686bb1c1bc, dumpoffset=0x2b686bb1c1b8)
at HTTP.cc:415
#8  0x006724fb in http_hdr_print (heap=0x2b6881019010, 
hdr=0x2b6881019098, buf=0x0, bufsize=-1811908575, bufindex=0x2b686bb1c1bc, 
dumpoffset=0x2b686bb1c1b8) at HTTP.cc:539
#9  0x004f259b in HTTPHdr::print (this=0x2b68ac06f058, buf=0x0, 
bufsize=-1811908575, bufindex=0x2b686bb1c1bc, dumpoffset=0x2b686bb1c1b8)
at ./hdrs/HTTP.h:897
#10 0x005da903 in HttpSM::write_header_into_buffer 
(this=0x2b68ac06e910, h=0x2b68ac06f058, b=0x2f163e0) at HttpSM.cc:5554
#11 0x005e5129 in HttpSM::write_response_header_into_buffer 
(this=0x2b68ac06e910, h=0x2b68ac06f058, b=0x2f163e0) at HttpSM.h:594
#12 0x005dcef2 in HttpSM::setup_server_transfer (this=0x2b68ac06e910) 
at HttpSM.cc:6295
#13 0x005cd336 in HttpSM::handle_api_return (this=0x2b68ac06e910) at 
HttpSM.cc:1554
#14 0x005cd040 in HttpSM::state_api_callout (this=0x2b68ac06e910, 
event=0, data=0x0) at HttpSM.cc:1446
#15 0x005d89b7 in HttpSM::do_api_callout_internal (this=0x2b68ac06e910) 
at HttpSM.cc:4858
#16 0x005dfdec in HttpSM::set_next_state (this=0x2b68ac06e910) at 
HttpSM.cc:7115
#17 0x005df0ec in HttpSM::call_transact_and_set_next_state 
(this=0x2b68ac06e910, f=0) at HttpSM.cc:6900
#18 0x005cd1e3 in HttpSM::handle_api_return (this=0x2b68ac06e910) at 
HttpSM.cc:1514
#19 0x005cd040 in HttpSM::state_api_callout (this=0x2b68ac06e910, 
event=6, data=0x0) at HttpSM.cc:1446
#20 0x005cc7d6 in HttpSM::state_api_callback (this=0x2b68ac06e910, 
event=6, data=0x0) at HttpSM.cc:1264
#21 0x00515bb5 in TSHttpTxnReenable (txnp=0x2b68ac06e910, 
event=TS_EVENT_HTTP_CONTINUE) at InkAPI.cc:5554
#22 0x2b68806f945b in transform_plugin 
(event=TS_EVENT_HTTP_READ_RESPONSE_HDR, edata=0x2b68ac06e910) at gzip.cc:693
#23 0x0050a40c in INKContInternal::handle_event (this=0x2ea2bb0, 
event=60006, edata=0x2b68ac06e910) at InkAPI.cc:1000
#24 0x004f597e in Continuation::handleEvent (this=0x2ea2bb0, 
event=60006, data=0x2b68ac06e910) at ../iocore/eventsystem/I_Continuation.h:146
#25 0x0050ac53 in APIHook::invoke (this=0x2ea3c80, event=60006, 
edata=0x2b68ac06e910) at InkAPI.cc:1219
#26 0x005ccda9 in HttpSM::state_api_callout (this=0x2b68ac06e910, 
event=0, data=0x0) at HttpSM.cc:1371
#27 0x005d89b7 in HttpSM::do_api_callout_internal (this=0x2b68ac06e910) 
at HttpSM.cc:4858
#28 0x005e54fc in HttpSM::do_api_callout (this=0x2b68ac06e910) at 
HttpSM.cc:448
#29 0x005ce277 in HttpSM::state_read_server_response_header 
(this=0x2b68ac06e910, event=100, data=0x2b68a802afc0) at HttpSM.cc:1861
#30 0x005d0582 in HttpSM::main_handler (this=0x2b68ac06e910, event=100, 
data=0x2b68a802afc0) at HttpSM.cc:2507
#31 0x004f597e in Continuation::handleEvent (this=0x2b68ac06e910, 
event=100, data=0x2b68a802afc0) at ../iocore/eventsystem/I_Continuation.h:146
#32 0x00531d7d in PluginVC::process_read_side (this=0x2b68a802aec0, 
other_side_call=true) at PluginVC.cc:671
#33 0x00531612 in PluginVC::process_write_side (this=0x2b68a802b0a8, 
other_side_call=false) at PluginVC.cc:567
#34 0x005303b4 in PluginVC::main_handler (this=0x2b68a802b0a8, event=1, 
data=0x2b68a80644f0) at PluginVC.cc:212
(gdb) f 12
#12 0x005dcef2 in HttpSM::setup_server_transfer (this=0x2b68ac06e910) 
at HttpSM.cc:6295
6295    HttpSM.cc: No such file or directory.
        in HttpSM.cc
(gdb) info local
__func__ = setup_server_transfer
hdr_size = 7902907
buf = 0x2f163e0
action = TCA_PASSTHRU_DECHUNKED_CONTENT
alloc_index = 6
nbytes = 47727483405024

[jira] [Comment Edited] (TS-3285) Seg fault when 100 CONT handling is enabled

2015-01-23 Thread Sudheer Vinukonda (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14271543#comment-14271543
 ] 

Sudheer Vinukonda edited comment on TS-3285 at 1/23/15 4:59 PM:


Further debugging showed that the 100 CONT implementation calls 
{{do_io_write()}} with a reader on the MIOBuffer {{ua_entry->write_buffer}}. 
This buffer appears to be freed on 
{{VC_EVENT_READ_COMPLETE/HTTP_TUNNEL_EVENT_PRECOMPLETE}} for the POST data being
read from the client, rather than on the WRITE_COMPLETE event for the 100 CONT's 
{{do_io_write()}} operation. As a result, the buffer can be freed prematurely, 
while the write is still incomplete. 

Note that do_io_write/write_to_net (which still holds a reference to the _writer 
of the 100 CONT's MIOBuffer) may internally allocate an iobuf via 
write_avail(). If this code runs after the 100 CONT buffer has been freed in the 
read-complete event above, it accesses the MIOBuffer after it has been freed 
(and is on the free list).

{code}
/home/y/bin/traffic_server(new_IOBufferData_internal(char const*, long,
AllocType)+0x51)[0x4f5bef]
/home/y/bin/traffic_server(IOBufferBlock::alloc(long)+0x2c)[0x4f5eec]
/home/y/bin/traffic_server(MIOBuffer::append_block(long)+0x3a)[0x51dee2]
/home/y/bin/traffic_server(MIOBuffer::add_block()+0x22)[0x51df1a]
/home/y/bin/traffic_server(MIOBuffer::check_add_block()+0x4b)[0x6cbd0b]
/home/y/bin/traffic_server(MIOBuffer::write_avail()+0x18)[0x6cbd86]
/home/y/bin/traffic_server(write_to_net_io(NetHandler*, UnixNetVConnection*,
EThread*)+0x397)[0x736175]
/home/y/bin/traffic_server(write_to_net(NetHandler*, UnixNetVConnection*,
EThread*)+0x80)[0x735dd7]
/home/y/bin/traffic_server(NetHandler::mainNetEvent(int,
Event*)+0x654)[0x72f5e2]
/home/y/bin/traffic_server(Continuation::handleEvent(int,
void*)+0x6c)[0x4f5ad8]
/home/y/bin/traffic_server(EThread::process_event(Event*, int)+0xc8)[0x756046]
/home/y/bin/traffic_server(EThread::execute()+0x3dc)[0x756550]
/home/y/bin/traffic_server[0x7555c4]
/lib64/libpthread.so.0(+0x3036c079d1)[0x2aeb5b7e89d1]
/lib64/libc.so.6(clone+0x6d)[0x30364e8b6d]
{code}
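
The lifetime problem described above can be sketched in isolation. The following is a minimal illustration only, not the Traffic Server code or the eventual fix: {{Buffer}}, {{PendingWrite}} and {{Session}} are hypothetical stand-ins, and {{std::shared_ptr}} is a swapped-in technique used here to show the idea that the write path keeps the buffer alive until its own completion event, regardless of when the read side finishes.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for an MIOBuffer; not the Traffic Server class.
struct Buffer {
  std::string data;
};

// Hypothetical stand-in for a pending do_io_write() operation.
struct PendingWrite {
  std::shared_ptr<Buffer> buf; // keeps the buffer alive until the write completes
  bool complete = false;
};

// Hypothetical owner of the 100 Continue response buffer.
struct Session {
  std::shared_ptr<Buffer> write_buffer;
  PendingWrite write;

  void start_write(const std::string &payload) {
    write_buffer = std::make_shared<Buffer>();
    write_buffer->data = payload;
    write.buf = write_buffer; // the write path holds its own reference
    write.complete = false;
  }

  // Read side finished (e.g. the POST body was consumed): drop only our reference.
  void on_read_complete() { write_buffer.reset(); }

  // Write side finished: the last reference goes away and the buffer is freed.
  void on_write_complete() {
    write.complete = true;
    write.buf.reset();
  }
};

// Returns true if the buffer is still reachable by the write path.
bool write_buffer_alive(const Session &s) { return s.write.buf != nullptr; }
```

With this ownership scheme, a read-complete event that arrives before the write-complete event cannot pull the buffer out from under the pending write; the buffer is released only once both sides have let go of it.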



 Seg fault when 100 CONT handling is enabled
 ---

 Key: TS-3285
 URL: https://issues.apache.org/jira/browse/TS-3285
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Affects Versions: 5.0.1
Reporter: Sudheer Vinukonda
Assignee: Sudheer Vinukonda
 Fix For: 5.3.0



[jira] [Commented] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely

2015-01-23 Thread Susan Hinrichs (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289600#comment-14289600
 ] 

Susan Hinrichs commented on TS-2497:


Looking at it some more, I wonder if the solution is just to eliminate the 
reset, at least in the branch without the deallocate_buffer.  Another producer 
will be added to handle the traffic from the server back to the client, but due 
to changes in TS-3190, it is safer to leave the original producer/consumer 
around because the new producer will be started up explicitly.  The original 
logic in many cases would start all producers at once.

I'm having problems getting a case to even call handle_post_failure.  Will look 
back at this later.

 Failed post results in tunnel buffers being returned to freelist prematurely
 

 Key: TS-2497
 URL: https://issues.apache.org/jira/browse/TS-2497
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Brian Geffon
Assignee: Brian Geffon
 Fix For: 4.2.0

 Attachments: TS-2497.patch, client.js, origin-server.js, repro.js


 When a POST to an origin server fails, either because the server died or 
 because the server returned a response without reading all of the POST data, 
 TS destroys the buffers too early. This normally does not result in a crash 
 because the MIOBuffers are returned to the freelist; only under sufficient 
 load does the race occur and cause a crash. Additionally, even without a 
 crash, data may be corrupted across POST requests because buffers are used 
 after being returned to the freelist.
 Thanks to Thomas Jackson for help reproducing and resolving this bug.
 An example stack trace follows; we have also seen other crashes in write_avail.
 #0  0x004eff14 in IOBufferBlock::read_avail (this=0x0) at 
 ../iocore/eventsystem/I_IOBuffer.h:362
 #1  0x0050d151 in MIOBuffer::append_block_internal 
 (this=0x2aab38001130, b=0x2aab0c037200) at 
 ../iocore/eventsystem/P_IOBuffer.h:946
 #2  0x0050d39b in MIOBuffer::append_block (this=0x2aab38001130, 
 asize_index=15) at ../iocore/eventsystem/P_IOBuffer.h:986
 #3  0x0050d49b in MIOBuffer::add_block (this=0x2aab38001130) at 
 ../iocore/eventsystem/P_IOBuffer.h:994
 #4  0x0055cee2 in MIOBuffer::check_add_block (this=0x2aab38001130) at 
 ../iocore/eventsystem/P_IOBuffer.h:1002
 #5  0x0055d115 in MIOBuffer::write_avail (this=0x2aab38001130) at 
 ../iocore/eventsystem/P_IOBuffer.h:1048
 #6  0x006c18f3 in read_from_net (nh=0x2aaafca0d208, 
 vc=0x2aab1c009140, thread=0x2aaafca0a010) at UnixNetVConnection.cc:234
 #7  0x006c37bf in UnixNetVConnection::net_read_io 
 (this=0x2aab1c009140, nh=0x2aaafca0d208, lthread=0x2aaafca0a010) at 
 UnixNetVConnection.cc:816
 #8  0x006be392 in NetHandler::mainNetEvent (this=0x2aaafca0d208, 
 event=5, e=0x271d8e0) at UnixNet.cc:380
 #9  0x004f05c4 in Continuation::handleEvent (this=0x2aaafca0d208, 
 event=5, data=0x271d8e0) at ../iocore/eventsystem/I_Continuation.h:146
 #10 0x006e361e in EThread::process_event (this=0x2aaafca0a010, 
 e=0x271d8e0, calling_code=5) at UnixEThread.cc:142
 #11 0x006e3b13 in EThread::execute (this=0x2aaafca0a010) at 
 UnixEThread.cc:264
 #12 0x006e290b in spawn_thread_internal (a=0x2716400) at Thread.cc:88
 #13 0x003372c077e1 in start_thread () from /lib64/libpthread.so.0
 #14 0x0033728e68ed in clone () from /lib64/libc.so.6
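
The data-corruption scenario in the description (a buffer used after it has gone back to the freelist) can be replayed with a toy pool. This is a sketch under stated assumptions: {{Pool}} is a hypothetical fixed-size pool with a LIFO freelist, not the Traffic Server allocator, and slot indices stand in for raw pointers so that the example has no undefined behaviour.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical fixed-size pool with a LIFO freelist; not the Traffic Server allocator.
// There is no exhaustion handling: alloc() must not be called on an empty freelist.
class Pool {
public:
  Pool() {
    for (std::size_t i = kSlots; i > 0; --i) free_.push_back(i - 1);
  }
  // Hands out a slot index; the most recently freed slot is reused first.
  std::size_t alloc() {
    std::size_t idx = free_.back();
    free_.pop_back();
    slots_[idx].clear();
    return idx;
  }
  void release(std::size_t idx) { free_.push_back(idx); }
  std::string &at(std::size_t idx) { return slots_[idx]; }

private:
  static constexpr std::size_t kSlots = 4;
  std::array<std::string, kSlots> slots_;
  std::vector<std::size_t> free_;
};

// Replays the sequence: request A releases its buffer too early, request B
// gets the same slot, then A's stale handle writes into it.
std::string replay_premature_release() {
  Pool pool;
  std::size_t a = pool.alloc();
  pool.at(a) = "POST body of request A";
  pool.release(a);              // premature: A still holds the index
  std::size_t b = pool.alloc(); // B receives the slot A just released
  pool.at(b) = "POST body of request B";
  pool.at(a) += " (late write from A)"; // stale handle: lands in B's buffer
  return pool.at(b);
}
```

Nothing crashes in this replay, which matches the observation that the bug is usually silent: request B simply ends up with bytes it never wrote.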



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely

2015-01-23 Thread Brian Geffon (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289632#comment-14289632
 ] 

Brian Geffon commented on TS-2497:
--

I'd have to do some reading, I don't really remember to much about this.



[jira] [Comment Edited] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely

2015-01-23 Thread Brian Geffon (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289632#comment-14289632
 ] 

Brian Geffon edited comment on TS-2497 at 1/23/15 6:18 PM:
---

I'd have to do some reading, I don't really remember much about this.


was (Author: briang):
I'd have to do some reading, I don't really remember to much about this.



[jira] [Commented] (TS-3294) 5.3.0 Coverity Fixes

2015-01-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289488#comment-14289488
 ] 

ASF subversion and git services commented on TS-3294:
-

Commit a2bc1245859c2add6b0468586566c83f630889a5 in trafficserver's branch 
refs/heads/master from [~sudheerv]
[ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=a2bc124 ]

[TS-3294] Add null pointer check

Coverity CID:1021867


 5.3.0 Coverity Fixes
 

 Key: TS-3294
 URL: https://issues.apache.org/jira/browse/TS-3294
 Project: Traffic Server
  Issue Type: Improvement
  Components: Cleanup, Quality
Reporter: Sudheer Vinukonda
Assignee: Sudheer Vinukonda
 Fix For: 5.3.0


 Tracker Jira for 5.3.0 Coverity Fixes (Sudheer Vinukonda)





[jira] [Commented] (TS-3294) 5.3.0 Coverity Fixes

2015-01-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289487#comment-14289487
 ] 

ASF subversion and git services commented on TS-3294:
-

Commit b62ea0c9c80f8dad41a62253bc1fd7b1d97b22e6 in trafficserver's branch 
refs/heads/master from [~sudheerv]
[ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=b62ea0c ]

[TS-3294] Add null pointer check

Coverity CID:1021868




[jira] [Commented] (TS-3294) 5.3.0 Coverity Fixes

2015-01-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289489#comment-14289489
 ] 

ASF subversion and git services commented on TS-3294:
-

Commit 1d19318b0d59436d007ccad85dfcbbfa1a722807 in trafficserver's branch 
refs/heads/master from [~sudheerv]
[ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=1d19318 ]

[TS-3294] Add null pointer check

Coverity CID:1021866




Jenkins build is back to normal : tsqa-master #41

2015-01-23 Thread jenkins
See https://ci.trafficserver.apache.org/job/tsqa-master/41/



[jira] [Commented] (TS-3287) Coverity fixes for v5.3.0 by zwoop

2015-01-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289759#comment-14289759
 ] 

ASF subversion and git services commented on TS-3287:
-

Commit fc559c126d302f6928046ec764854cadb540eedf in trafficserver's branch 
refs/heads/master from [~zwoop]
[ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=fc559c1 ]

TS-3287 Eliminate some dead code around random()

Coverity CID #1261573


 Coverity fixes for v5.3.0 by zwoop
 --

 Key: TS-3287
 URL: https://issues.apache.org/jira/browse/TS-3287
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Leif Hedstrom
Assignee: Leif Hedstrom
 Fix For: 5.3.0


 This is my JIRA for Coverity commits for v5.3.0.





[jira] [Commented] (TS-3287) Coverity fixes for v5.3.0 by zwoop

2015-01-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289761#comment-14289761
 ] 

ASF subversion and git services commented on TS-3287:
-

Commit 1ccb1ea4c06ad6f563351320607de62e76c860b6 in trafficserver's branch 
refs/heads/master from [~zwoop]
[ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=1ccb1ea ]

TS-3287 Ignore the warning on random

Coverity CID #1261572




[jira] [Commented] (TS-3287) Coverity fixes for v5.3.0 by zwoop

2015-01-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289757#comment-14289757
 ] 

ASF subversion and git services commented on TS-3287:
-

Commit f7f3055a22f175d9158f8a5ae473482519fcce43 in trafficserver's branch 
refs/heads/master from [~zwoop]
[ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=f7f3055 ]

TS-3287 Ignore this coverity error

Coverity CID #1261575




[jira] [Commented] (TS-3318) Remove mgmt/web2/WebHttpSession.{cc,h}

2015-01-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289760#comment-14289760
 ] 

ASF subversion and git services commented on TS-3318:
-

Commit 7ec9f0cc37a530a4fcfc1b0d439e153f497c6ae1 in trafficserver's branch 
refs/heads/master from [~zwoop]
[ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=7ec9f0c ]

Added TS-3318 to CHANGES


 Remove mgmt/web2/WebHttpSession.{cc,h}
 --

 Key: TS-3318
 URL: https://issues.apache.org/jira/browse/TS-3318
 Project: Traffic Server
  Issue Type: Improvement
Reporter: Leif Hedstrom
Assignee: Leif Hedstrom
 Fix For: 5.3.0


 It is unused, and causes some Coverity errors.





[jira] [Commented] (TS-3318) Remove mgmt/web2/WebHttpSession.{cc,h}

2015-01-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289758#comment-14289758
 ] 

ASF subversion and git services commented on TS-3318:
-

Commit cb7fc8f9efd67c7286616bcc94bf607035e28693 in trafficserver's branch 
refs/heads/master from [~zwoop]
[ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=cb7fc8f ]

TS-3318 Remove mgmt/Web2/WebHttpSession.{cc,h}

This also helps fixing Coverity CID #1261573




[jira] [Created] (TS-3318) Remove mgmt/web2/WebHttpSession.{cc,h}

2015-01-23 Thread Leif Hedstrom (JIRA)
Leif Hedstrom created TS-3318:
-

 Summary: Remove mgmt/web2/WebHttpSession.{cc,h}
 Key: TS-3318
 URL: https://issues.apache.org/jira/browse/TS-3318
 Project: Traffic Server
  Issue Type: Improvement
Reporter: Leif Hedstrom


It is unused, and causes some Coverity errors.





[jira] [Assigned] (TS-3318) Remove mgmt/web2/WebHttpSession.{cc,h}

2015-01-23 Thread Leif Hedstrom (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom reassigned TS-3318:
-

Assignee: Leif Hedstrom



[jira] [Resolved] (TS-3318) Remove mgmt/web2/WebHttpSession.{cc,h}

2015-01-23 Thread Leif Hedstrom (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom resolved TS-3318.
---
Resolution: Fixed



[jira] [Updated] (TS-3318) Remove mgmt/web2/WebHttpSession.{cc,h}

2015-01-23 Thread Leif Hedstrom (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom updated TS-3318:
--
Fix Version/s: 5.3.0



[jira] [Commented] (TS-3315) Assert after try lock

2015-01-23 Thread Alan M. Carroll (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289937#comment-14289937
 ] 

Alan M. Carroll commented on TS-3315:
-

That's not really the issue. If the code must have the lock, it should use 
{{MUTEX_LOCK}} not {{MUTEX_TRY_LOCK}}. If using the latter it should check and 
handle the unlocked case, not assert on failure. If we really need to check, 
why not directly check the lock to see if it is (1) locked and (2) the locking 
thread is this thread rather than trying to lock? E.g.

{code}
ink_assert(NULL == cont->mutex || cont->mutex->thread_holding == this_ethread())
{code}
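
The distinction drawn above between a mandatory lock and a try-lock whose failure is handled can be sketched with the standard library. This is an illustration only: {{std::mutex}} stands in for the Traffic Server mutex macros, and {{Outcome}}, {{run_locked}} and {{run_if_unlocked}} are hypothetical names that do not exist in the code base.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical outcome of an attempt to run work under a lock.
enum class Outcome { Ran, Retry };

// If the lock is mandatory, block until it is held (the MUTEX_LOCK style).
template <typename Fn>
Outcome run_locked(std::mutex &m, Fn &&work) {
  std::lock_guard<std::mutex> guard(m);
  work();
  return Outcome::Ran;
}

// If the lock is optional, try it and handle failure explicitly
// (the MUTEX_TRY_LOCK style) instead of asserting that it succeeded.
template <typename Fn>
Outcome run_if_unlocked(std::mutex &m, Fn &&work) {
  std::unique_lock<std::mutex> guard(m, std::try_to_lock);
  if (!guard.owns_lock()) {
    return Outcome::Retry; // caller reschedules; nothing ran
  }
  work();
  return Outcome::Ran;
}
```

A caller that receives {{Outcome::Retry}} would typically reschedule itself rather than abort, which is the behaviour an assert on a failed try-lock rules out.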

 Assert after try lock
 -

 Key: TS-3315
 URL: https://issues.apache.org/jira/browse/TS-3315
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Phil Sorber

 In iocore/cache/Cache.cc there is the following:
 {code}
   CACHE_TRY_LOCK(lock, cont->mutex, this_ethread());
   ink_assert(lock.is_locked());
 {code}
 Does it really make sense to try and assert when a try can fail?





[jira] [Comment Edited] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely

2015-01-23 Thread Feifei Cai (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289206#comment-14289206
 ] 

Feifei Cai edited comment on TS-2497 at 1/23/15 12:53 PM:
--

A memory leak has been noticed on our production hosts. It appears to be 
related to the handling of 5xx responses from the origin server.

The dump info below is from one host with ~70% POST requests. I enabled the 
memory dump ({{proxy.config.dump_mem_info_frequency}}) and memory tracking 
({{proxy.config.res_track_memory}}).

*traffic.out:*
{noformat}
      allocated |      in-use |  type size | free list name
----------------|-------------|------------|---------------------------
              0 |           0 |    2097152 | memory/ioBufAllocator[14]
              0 |           0 |    1048576 | memory/ioBufAllocator[13]
              0 |           0 |     524288 | memory/ioBufAllocator[12]
              0 |           0 |     262144 | memory/ioBufAllocator[11]
              0 |           0 |     131072 | memory/ioBufAllocator[10]
              0 |           0 |      65536 | memory/ioBufAllocator[9]
     1266679808 |  1262354432 |      32768 | memory/ioBufAllocator[8]
      600309760 |   599703552 |      16384 | memory/ioBufAllocator[7]
      395051008 |   391086080 |       8192 | memory/ioBufAllocator[6]
      229113856 |   224432128 |       4096 | memory/ioBufAllocator[5]
      342622208 |   342503424 |       2048 | memory/ioBufAllocator[4]
      245104640 |   245042176 |       1024 | memory/ioBufAllocator[3]
        2228224 |     2176512 |        512 | memory/ioBufAllocator[2]
         622592 |      607232 |        256 | memory/ioBufAllocator[1]
        2375680 |     2370176 |        128 | memory/ioBufAllocator[0]

Location                                         |  Size In-use
-------------------------------------------------+-------------
memory/IOBuffer/ProtocolProbeSessionAccept.cc:39 |     66768896
memory/IOBuffer/HttpClientSession.cc:230         |            0
memory/IOBuffer/HttpSM.cc:3314                   |            0
memory/IOBuffer/HttpSM.cc:5349                   |   3003506816
memory/IOBuffer/HttpSM.cc:5668                   |            0
memory/IOBuffer/HttpSM.cc:5874                   |            0
memory/IOBuffer/HttpSM.cc:5976                   |            0
memory/IOBuffer/HttpSM.cc:6267                   |            0
memory/IOBuffer/HttpServerSession.cc:87          |            0
memory/IOBuffer/HttpTunnel.cc:95                 |            0
memory/IOBuffer/HttpTunnel.cc:100                |            0
TOTAL                                            |   3070275712
{noformat}


I adapted [~shaunmcginnity]'s node.js scripts with some changes and reproduced 
the memory leak in my local environment.

# 
[origin-server.js|https://issues.apache.org/jira/secure/attachment/12694146/client.js]
This origin server responds with a 503 as soon as it receives more than a 
single byte, so in most cases the POST does not complete. I changed 
[~shaunmcginnity]'s code so that the origin server sends a response to ATS, 
which makes ATS hit a different code path.
# 
[client.js|https://issues.apache.org/jira/secure/attachment/12694145/origin-server.js]
We create a new client every second, and each client tries to POST 32K bytes 
of data.
# ats
*remap.config*: remap all to local port 5000
{quote}map / http://127.0.0.1:5000{quote}
*records.config*: listen on 80
{quote}CONFIG proxy.config.http.server_ports STRING 80{quote}

We then get the dump info below, and the in-use count of MIOBuffers with 
index=8 (size=32K) increases by 1 per second.

{noformat}
  allocated  |   in-use   | type size |   free list name
-------------|------------|-----------|----------------------------
     1048576 |      32768 |     32768 | memory/ioBufAllocator[8]
{noformat}

We can also change the Content-Length in client.js to a smaller size, and the 
MIOBuffer count for the corresponding index (0-7) increases as well.
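
As a rough aid for reading the dumps above, the sketch below reproduces the arithmetic they imply: each ioBufAllocator index doubles the block size starting at 128 bytes, and dividing the in-use bytes by the block size gives a block count. The function names are hypothetical, and the rule in {{index_for_size}} (smallest block that fits) is an assumption for illustration, not a statement of how ATS selects an index.

```cpp
#include <cassert>
#include <cstdint>

// Block size for an allocator index, as read off the dump: 128 bytes at
// index 0, doubling with each index up to 2097152 bytes at index 14.
inline std::int64_t block_size_for_index(int index) {
  return std::int64_t{128} << index;
}

// Assumed rule: the smallest index whose block size can hold `bytes`.
// Returns -1 if `bytes` exceeds the largest index shown in the dump (14).
inline int index_for_size(std::int64_t bytes) {
  for (int i = 0; i <= 14; ++i) {
    if (block_size_for_index(i) >= bytes) {
      return i;
    }
  }
  return -1;
}

// Converts an "in-use" byte figure from the dump into a number of blocks.
inline std::int64_t blocks_in_use(std::int64_t in_use_bytes, int index) {
  return in_use_bytes / block_size_for_index(index);
}
```

For the production dump, {{blocks_in_use(1262354432, 8)}} works out to 38524 blocks of 32K still in use.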


I added this simple patch to prevent the memory leak in the case above, just 
like the last commit, and verified it on 1 test host.

free.diff
{code}
diff --git a/proxy/http/HttpSM.cc b/proxy/http/HttpSM.cc
index 932ef97..123b97a 100644
--- a/proxy/http/HttpSM.cc
+++ b/proxy/http/HttpSM.cc
@@ -5074,6 +5074,7 @@ HttpSM::handle_post_failure()
   t_state.current.server->keep_alive = HTTP_NO_KEEPALIVE;

   if (server_buffer_reader->read_avail() > 0) {
+    tunnel.deallocate_buffers();
     tunnel.reset();
     // There's data from the server so try to read the header
 

[jira] [Updated] (TS-3319) Adapt to Openssl 1.,0.2 Certificate Callback

2015-01-23 Thread Susan Hinrichs (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susan Hinrichs updated TS-3319:
---
Issue Type: Improvement  (was: Bug)

 Adapt to Openssl 1.,0.2 Certificate Callback
 

 Key: TS-3319
 URL: https://issues.apache.org/jira/browse/TS-3319
 Project: Traffic Server
  Issue Type: Improvement
Reporter: Susan Hinrichs

 With TS-3006, we provided a patch for openssl 1.0.1 to enable the SNI 
 callback to pause.
 With openssl 1.0.2 the client certificate callback is extended to work for 
 server certificate selection.  You can return values to pause the SSL 
 processing after the client hello here as well.
 The details are at 
 https://www.openssl.org/docs/ssl/SSL_CTX_set_cert_cb.html
 ATS should be extended to use the certificate callback mechanism if openssl 
 1.0.2 is available.





[jira] [Created] (TS-3319) Adapt to Openssl 1.,0.2 Certificate Callback

2015-01-23 Thread Susan Hinrichs (JIRA)
Susan Hinrichs created TS-3319:
--

 Summary: Adapt to Openssl 1.,0.2 Certificate Callback
 Key: TS-3319
 URL: https://issues.apache.org/jira/browse/TS-3319
 Project: Traffic Server
  Issue Type: Bug
Reporter: Susan Hinrichs


With TS-3006, we provided a patch for openssl 1.0.1 to enable the SNI callback 
to pause.

With openssl 1.0.2 the client certificate callback is extended to work for 
server certificate selection.  You can return values to pause the SSL 
processing after the client hello here as well.

The details are at 
https://www.openssl.org/docs/ssl/SSL_CTX_set_cert_cb.html

ATS should be extended to use the certificate callback mechanism if openssl 
1.0.2 is available.





[jira] [Assigned] (TS-3319) Adapt to Openssl 1.,0.2 Certificate Callback

2015-01-23 Thread Susan Hinrichs (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susan Hinrichs reassigned TS-3319:
--

Assignee: Susan Hinrichs

 Adapt to Openssl 1.,0.2 Certificate Callback
 

 Key: TS-3319
 URL: https://issues.apache.org/jira/browse/TS-3319
 Project: Traffic Server
  Issue Type: Improvement
Reporter: Susan Hinrichs
Assignee: Susan Hinrichs



[jira] [Updated] (TS-3319) Adapt to Openssl 1.0.2 Certificate Callback

2015-01-23 Thread Leif Hedstrom (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom updated TS-3319:
--
Fix Version/s: 5.3.0

 Adapt to Openssl 1.0.2 Certificate Callback
 ---

 Key: TS-3319
 URL: https://issues.apache.org/jira/browse/TS-3319
 Project: Traffic Server
  Issue Type: Improvement
  Components: SSL
Reporter: Susan Hinrichs
Assignee: Susan Hinrichs
 Fix For: 5.3.0




[jira] [Updated] (TS-3319) Adapt to Openssl 1.0.2 Certificate Callback

2015-01-23 Thread Leif Hedstrom (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leif Hedstrom updated TS-3319:
--
Component/s: SSL

 Adapt to Openssl 1.0.2 Certificate Callback
 ---

 Key: TS-3319
 URL: https://issues.apache.org/jira/browse/TS-3319
 Project: Traffic Server
  Issue Type: Improvement
  Components: SSL
Reporter: Susan Hinrichs
Assignee: Susan Hinrichs
 Fix For: 5.3.0




[jira] [Updated] (TS-3319) Adapt to Openssl 1.0.2 Certificate Callback

2015-01-23 Thread Susan Hinrichs (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susan Hinrichs updated TS-3319:
---
Summary: Adapt to Openssl 1.0.2 Certificate Callback  (was: Adapt to 
Openssl 1.,0.2 Certificate Callback)

 Adapt to Openssl 1.0.2 Certificate Callback
 ---

 Key: TS-3319
 URL: https://issues.apache.org/jira/browse/TS-3319
 Project: Traffic Server
  Issue Type: Improvement
Reporter: Susan Hinrichs
Assignee: Susan Hinrichs



Build failed in Jenkins: tsqa-master #40

2015-01-23 Thread jenkins
See https://ci.trafficserver.apache.org/job/tsqa-master/40/

--
Started by timer
Building remotely on QA1 (qa) in workspace 
https://ci.trafficserver.apache.org/job/tsqa-master/ws/
  /usr/bin/git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
  /usr/bin/git config remote.origin.url git://git.apache.org/trafficserver.git 
  # timeout=10
Cleaning workspace
  /usr/bin/git rev-parse --verify HEAD # timeout=10
Resetting working tree
  /usr/bin/git reset --hard # timeout=10
ERROR: Error fetching remote repo 'origin'
ERROR: Error fetching remote repo 'origin'



[jira] [Commented] (TS-3315) Assert after try lock

2015-01-23 Thread taorui (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289020#comment-14289020
 ] 

taorui commented on TS-3315:


It would be great if you documented it.
If the caller is interested in the result of the remove (cont != NULL), then 
the current thread must already hold the lock on cont->mutex.
If not (cont == NULL), the try lock should succeed.
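
A minimal sketch of the alternative to asserting, assuming a mutex that the owning thread may re-acquire (modelled here with {{std::recursive_mutex}}, which is an assumption and not the ATS mutex type). {{remove_with_try_lock}} and {{RemoveResult}} are hypothetical names: when the try lock fails, the function reports that instead of aborting, so the caller can retry later.

```cpp
#include <cassert>
#include <mutex>

enum class RemoveResult { Done, Retry };

// Hypothetical operation guarded by a caller-supplied mutex.
RemoveResult remove_with_try_lock(std::recursive_mutex &mutex, int &removed_count) {
  std::unique_lock<std::recursive_mutex> lock(mutex, std::try_to_lock);
  if (!lock.owns_lock()) {
    // Another thread holds the mutex: report it rather than asserting.
    return RemoveResult::Retry;
  }
  ++removed_count; // stands in for the real work done under the lock
  return RemoveResult::Done;
}
```

If the calling thread already holds the mutex, which is the invariant described above for cont != NULL, the try lock succeeds and the work proceeds.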





 Assert after try lock
 -

 Key: TS-3315
 URL: https://issues.apache.org/jira/browse/TS-3315
 Project: Traffic Server
  Issue Type: Bug
  Components: Cache
Reporter: Phil Sorber



[jira] [Commented] (TS-3235) PluginVC crashed with unrecognized event

2015-01-23 Thread zouyu (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289055#comment-14289055
 ] 

zouyu commented on TS-3235:
---

[~briang], [~amc], [~portl4t]
After checking lib/atscppapi/src/InterceptPlugin.cc, I found some problems.
1. InterceptPlugin uses atscppapi::Mutex, which is separate from the 
ProxyMutex used by ATS threads.
2. InterceptPlugin uses TSMutexCreate to create a mutex to sync the 
continuation in InterceptPlugin::InterceptPlugin, but it doesn't use that mutex 
when calling 'InterceptPlugin::produce' and 
'InterceptPlugin::setOutputComplete'. Instead it calls 'getMutex()', which 
actually calls its parent class's 'TransactionPlugin::getMutex()' and uses 
TransactionPlugin::state_->mutex. So it cannot sync the customer threads with 
ATS threads.
3. When calling the 'InterceptPlugin::handleEvent' function, it locks 
'plugin_handle->mutex_', which is also 'TransactionPlugin::state_->mutex'. So 
it cannot sync the customer threads with ATS threads.

It seems that we need to enhance InterceptPlugin to use the correct mutex. I 
think we should replace all of the mutexes above with the one that is set on 
the continuation.
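
The underlying point can be shown with a small sketch: two parties only exclude each other when they take the same lock object. {{TryLock}} and {{excludes}} are hypothetical helpers for illustration, not atscppapi or ATS types.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical minimal lock: try_acquire() succeeds only if nobody holds it.
class TryLock {
public:
  bool try_acquire() { return !held_.exchange(true, std::memory_order_acquire); }
  void release() { held_.store(false, std::memory_order_release); }

private:
  std::atomic<bool> held_{false};
};

// Returns true if holding `plugin_lock` keeps a second party, which takes
// `core_lock`, out of the critical section. Precondition: both locks are free.
bool excludes(TryLock &plugin_lock, TryLock &core_lock) {
  bool first = plugin_lock.try_acquire();
  assert(first);
  (void)first;
  bool second_entered = core_lock.try_acquire();
  if (second_entered) {
    core_lock.release();
  }
  plugin_lock.release();
  return !second_entered;
}
```

With two distinct lock objects {{excludes}} returns false, which corresponds to the situation described in points 1 to 3; passing the same object for both arguments returns true.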

 PluginVC crashed with unrecognized event
 

 Key: TS-3235
 URL: https://issues.apache.org/jira/browse/TS-3235
 Project: Traffic Server
  Issue Type: Bug
  Components: CPP API, HTTP, Plugins
Reporter: kang li
Assignee: Brian Geffon
 Fix For: 5.3.0

 Attachments: pluginvc-crash.diff


 We are using atscppapi to create Intercept plugin.
  
 From the coredump, it seems the Continuation of the InterceptPlugin had 
 already been destroyed. 
 {code}
 #0  0x00375ac32925 in raise () from /lib64/libc.so.6
 #1  0x00375ac34105 in abort () from /lib64/libc.so.6
 #2  0x2b21eeae3458 in ink_die_die_die (retval=1) at ink_error.cc:43
 #3  0x2b21eeae3525 in ink_fatal_va(int, const char *, typedef 
 __va_list_tag __va_list_tag *) (return_code=1, 
 message_format=0x2b21eeaf08d8 %s:%d: failed assert `%s`, 
 ap=0x2b21f4913ad0) at ink_error.cc:65
 #4  0x2b21eeae35ee in ink_fatal (return_code=1, 
 message_format=0x2b21eeaf08d8 %s:%d: failed assert `%s`) at ink_error.cc:73
 #5  0x2b21eeae2160 in _ink_assert (expression=0x76ddb8 call_event == 
 core_lock_retry_event, file=0x76dd04 PluginVC.cc, line=203)
 at ink_assert.cc:37
 #6  0x00530217 in PluginVC::main_handler (this=0x2b24ef007cb8, 
 event=1, data=0xe0f5b80) at PluginVC.cc:203
 #7  0x004f5854 in Continuation::handleEvent (this=0x2b24ef007cb8, 
 event=1, data=0xe0f5b80) at ../iocore/eventsystem/I_Continuation.h:146
 #8  0x00755d26 in EThread::process_event (this=0x309b250, 
 e=0xe0f5b80, calling_code=1) at UnixEThread.cc:145
 #9  0x0075610a in EThread::execute (this=0x309b250) at 
 UnixEThread.cc:239
 #10 0x00755284 in spawn_thread_internal (a=0x2849330) at Thread.cc:88
 #11 0x2b21ef05f9d1 in start_thread () from /lib64/libpthread.so.0
 #12 0x00375ace8b7d in clone () from /lib64/libc.so.6
 (gdb) p sm_lock_retry_event
 $13 = (Event *) 0x2b2496146e90
 (gdb) p core_lock_retry_event
 $14 = (Event *) 0x0
 (gdb) p active_event
 $15 = (Event *) 0x0
 (gdb) p inactive_event
 $16 = (Event *) 0x0
 (gdb) p *(INKContInternal*)this->core_obj->connect_to
 Cannot access memory at address 0x2b269cd46c10
 {code}





[jira] [Commented] (TS-3243) Warnings from loading legitimate TLS certificates

2015-01-23 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289300#comment-14289300
 ] 

ASF subversion and git services commented on TS-3243:
-

Commit 4f043934c5d8e56e2ea6fa8f88badb7345e37d1c in trafficserver's branch 
refs/heads/master from shinrich
[ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=4f04393 ]

TS-3243: Remove warnings while loading certificates with duplicate names.


 Warnings from loading legitimate TLS certificates
 -

 Key: TS-3243
 URL: https://issues.apache.org/jira/browse/TS-3243
 Project: Traffic Server
  Issue Type: Bug
  Components: SSL
Reporter: Leif Hedstrom
Assignee: Susan Hinrichs
 Fix For: 5.3.0


 When loading a legitimate certificate (from Go Daddy), which has a domain 
 name of trafficserver.apache.org as well as some SNs which includes 
 trafficserver.apache.org as well, we get these warnings:
 {code}
 [Dec 17 16:01:19.540] Server {0x2b58fdcadf40} NOTE: loading SSL certificate 
 configuration from /usr/local/etc/trafficserver/ssl_multicert.config
 [Dec 17 16:01:19.545] Server {0x2b58fdcadf40} WARNING: previously indexed 
 'trafficserver.apache.org' with SSL_CTX 0x1, cannot index it with SSL_CTX #2 
 now
 {code}
 I've looked at a couple certs from GD, and this practice seems normal. I 
 don't think we should warn on this case, if the domain name for the cert is 
 duplicated in the SN, just ignore the latter right ?





[jira] [Commented] (TS-3243) Warnings from loading legitimate TLS certificates

2015-01-23 Thread Susan Hinrichs (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289301#comment-14289301
 ] 

Susan Hinrichs commented on TS-3243:


This fixes the GD case.  If the subject name is repeated in the SAN, we do not 
warn.  Not sure if this addresses the case Dave is seeing.  Let me know if this 
does not address the wildcard repeat.

 Warnings from loading legitimate TLS certificates
 -

 Key: TS-3243
 URL: https://issues.apache.org/jira/browse/TS-3243
 Project: Traffic Server
  Issue Type: Bug
  Components: SSL
Reporter: Leif Hedstrom
Assignee: Susan Hinrichs
 Fix For: 5.3.0




[jira] [Resolved] (TS-3243) Warnings from loading legitimate TLS certificates

2015-01-23 Thread Susan Hinrichs (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Susan Hinrichs resolved TS-3243.

Resolution: Fixed

 Warnings from loading legitimate TLS certificates
 -

 Key: TS-3243
 URL: https://issues.apache.org/jira/browse/TS-3243
 Project: Traffic Server
  Issue Type: Bug
  Components: SSL
Reporter: Leif Hedstrom
Assignee: Susan Hinrichs
 Fix For: 5.3.0




[jira] [Commented] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely

2015-01-23 Thread Sudheer Vinukonda (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289413#comment-14289413
 ] 

Sudheer Vinukonda commented on TS-2497:
---

[~ffcai]: 

Does the below patch fix the leak as well as address the concerns about not 
breaking TS-2497?


diff --git a/proxy/http/HttpTunnel.cc b/proxy/http/HttpTunnel.cc
index d75b5a1..75df1a5 100644
--- a/proxy/http/HttpTunnel.cc
+++ b/proxy/http/HttpTunnel.cc
@@ -621,6 +621,11 @@ HttpTunnel::add_producer(VConnection * vc,
   if ((p = alloc_producer()) != NULL) {
     p->vc = vc;
     p->nbytes = nbytes_arg;
+    if (p->read_buffer) {
+      free_MIOBuffer(p->read_buffer);
+      p->read_buffer = NULL;
+      p->buffer_start = NULL;
+    }
     p->buffer_start = reader_start;
     p->read_buffer = reader_start->mbuf;
     p->vc_handler = sm_handler;


 Failed post results in tunnel buffers being returned to freelist prematurely
 

 Key: TS-2497
 URL: https://issues.apache.org/jira/browse/TS-2497
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Brian Geffon
Assignee: Brian Geffon
 Fix For: 4.2.0

 Attachments: TS-2497.patch, client.js, origin-server.js, repro.js


 When a post fails to an origin server either the server died or the server 
 returned a response without reading all of the post data, in either case, TS 
 will destroy buffers too early. This normally does not result in a crash 
 because the MIOBuffers are returned to the freelist and only with sufficient 
 load will the race happen causing a crash. Additionally, even if a crash 
 doesn't happen you might have data corruption across post requests from the 
 buffers being used after being returned to the freelist.
 Thanks to Thomas Jackson for help reproducing and resolving this bug.
 An example stack trace, while we've seen other crashes in write_avail too.
 #0  0x004eff14 in IOBufferBlock::read_avail (this=0x0) at 
 ../iocore/eventsystem/I_IOBuffer.h:362
 #1  0x0050d151 in MIOBuffer::append_block_internal 
 (this=0x2aab38001130, b=0x2aab0c037200) at 
 ../iocore/eventsystem/P_IOBuffer.h:946
 #2  0x0050d39b in MIOBuffer::append_block (this=0x2aab38001130, 
 asize_index=15) at ../iocore/eventsystem/P_IOBuffer.h:986
 #3  0x0050d49b in MIOBuffer::add_block (this=0x2aab38001130) at 
 ../iocore/eventsystem/P_IOBuffer.h:994
 #4  0x0055cee2 in MIOBuffer::check_add_block (this=0x2aab38001130) at 
 ../iocore/eventsystem/P_IOBuffer.h:1002
 #5  0x0055d115 in MIOBuffer::write_avail (this=0x2aab38001130) at 
 ../iocore/eventsystem/P_IOBuffer.h:1048
 #6  0x006c18f3 in read_from_net (nh=0x2aaafca0d208, 
 vc=0x2aab1c009140, thread=0x2aaafca0a010) at UnixNetVConnection.cc:234
 #7  0x006c37bf in UnixNetVConnection::net_read_io 
 (this=0x2aab1c009140, nh=0x2aaafca0d208, lthread=0x2aaafca0a010) at 
 UnixNetVConnection.cc:816
 #8  0x006be392 in NetHandler::mainNetEvent (this=0x2aaafca0d208, 
 event=5, e=0x271d8e0) at UnixNet.cc:380
 #9  0x004f05c4 in Continuation::handleEvent (this=0x2aaafca0d208, 
 event=5, data=0x271d8e0) at ../iocore/eventsystem/I_Continuation.h:146
 #10 0x006e361e in EThread::process_event (this=0x2aaafca0a010, 
 e=0x271d8e0, calling_code=5) at UnixEThread.cc:142
 #11 0x006e3b13 in EThread::execute (this=0x2aaafca0a010) at 
 UnixEThread.cc:264
 #12 0x006e290b in spawn_thread_internal (a=0x2716400) at Thread.cc:88
 #13 0x003372c077e1 in start_thread () from /lib64/libpthread.so.0
 #14 0x0033728e68ed in clone () from /lib64/libc.so.6





[jira] [Comment Edited] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely

2015-01-23 Thread Sudheer Vinukonda (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289413#comment-14289413
 ] 

Sudheer Vinukonda edited comment on TS-2497 at 1/23/15 3:41 PM:


[~ffcai]: 

Does the below patch fix the leak as well as address the concerns about not 
breaking TS-2497?

{code}
diff --git a/proxy/http/HttpTunnel.cc b/proxy/http/HttpTunnel.cc
index d75b5a1..75df1a5 100644
--- a/proxy/http/HttpTunnel.cc
+++ b/proxy/http/HttpTunnel.cc
@@ -621,6 +621,11 @@ HttpTunnel::add_producer(VConnection * vc,
   if ((p = alloc_producer()) != NULL) {
     p->vc = vc;
     p->nbytes = nbytes_arg;
+    if (p->read_buffer) {
+      free_MIOBuffer(p->read_buffer);
+      p->read_buffer = NULL;
+      p->buffer_start = NULL;
+    }
     p->buffer_start = reader_start;
     p->read_buffer = reader_start->mbuf;
     p->vc_handler = sm_handler;
{code}
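
The idea in the patch, releasing a slot's previous buffer before overwriting the pointer, can be sketched in isolation as follows. {{Buffer}}, {{Producer}}, {{attach_buffer}} and the {{live_buffers}} counter are hypothetical stand-ins, not the HttpTunnel types, and the sketch says nothing about whether another reader still references the old buffer.

```cpp
#include <cassert>

// Counts live buffers so a leak shows up as a non-zero remainder.
static int live_buffers = 0;

struct Buffer {
  Buffer() { ++live_buffers; }
  ~Buffer() { --live_buffers; }
};

// Hypothetical producer slot that owns at most one read buffer.
struct Producer {
  Buffer *read_buffer = nullptr;
};

// Frees whatever the slot still holds before taking the new buffer.
// Re-attaching the same buffer is a no-op so it is never freed and reused.
void attach_buffer(Producer &p, Buffer *incoming) {
  if (p.read_buffer != nullptr && p.read_buffer != incoming) {
    delete p.read_buffer;
    p.read_buffer = nullptr;
  }
  p.read_buffer = incoming;
}
```

Without the check, a second {{attach_buffer}} call would drop the only pointer to the first buffer, which matches the once-per-request growth seen in the dump.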



was (Author: sudheerv):
[~ffcai]: 

Does the below patch fix the leak as well as address the concerns about not 
breaking TS-2497?


diff --git a/proxy/http/HttpTunnel.cc b/proxy/http/HttpTunnel.cc
index d75b5a1..75df1a5 100644
--- a/proxy/http/HttpTunnel.cc
+++ b/proxy/http/HttpTunnel.cc
@@ -621,6 +621,11 @@ HttpTunnel::add_producer(VConnection * vc,
   if ((p = alloc_producer()) != NULL) {
     p->vc = vc;
     p->nbytes = nbytes_arg;
+    if (p->read_buffer) {
+      free_MIOBuffer(p->read_buffer);
+      p->read_buffer = NULL;
+      p->buffer_start = NULL;
+    }
     p->buffer_start = reader_start;
     p->read_buffer = reader_start->mbuf;
     p->vc_handler = sm_handler;


 Failed post results in tunnel buffers being returned to freelist prematurely
 

 Key: TS-2497
 URL: https://issues.apache.org/jira/browse/TS-2497
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Brian Geffon
Assignee: Brian Geffon
 Fix For: 4.2.0

 Attachments: TS-2497.patch, client.js, origin-server.js, repro.js




[jira] [Commented] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely

2015-01-23 Thread Susan Hinrichs (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289436#comment-14289436
 ] 

Susan Hinrichs commented on TS-2497:


I'm a bit unclear on the original problem that was being solved.  Looking at 
the two commits, it appears that the tunnel.deallocate_buffers(); was moved 
from always being called in HttpSM::handle_post_failure to only being called if 
(server_buffer_reader->read_avail() <= 0).

But tunnel.reset is called in all cases (regardless of the value of 
server_buffer_reader->read_avail()), so [~ffcai] is seeing a leak in the case 
where server_buffer_reader->read_avail() > 0.  But if we add 
tunnel.deallocate_buffers(); then we are in the original case as far as I can 
tell.

Judging from the original stack trace, it looks like there was a lingering read 
or write on the tunnel buffer.  TS-1425 fixed that for the user agent side by 
canceling the read on the ua_session.  Perhaps the real solution here is to 
cancel the read on the server_session?  And then deallocate_buffers for the 
tunnel in all cases.

[~jacksontj] and [~briang] do you still have your notes on reproducing the 
original crash?  Then we could experiment with adding back the 
deallocate_buffer with a read cancel and see if we can safely solve the memory 
leak.
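
The ordering proposed above, cancel the pending read first and only then free the buffer, can be modelled with a toy example. {{TunnelBuffer}}, {{ServerSession}} and {{fail_post}} are hypothetical names for illustration and do not correspond to the real HttpSM or HttpTunnel interfaces.

```cpp
#include <cassert>
#include <memory>

struct TunnelBuffer {
  int bytes = 0;
};

// Hypothetical session that writes into a buffer while a read is registered.
struct ServerSession {
  TunnelBuffer *pending_read_target = nullptr;

  void start_read(TunnelBuffer *b) { pending_read_target = b; }
  void cancel_read() { pending_read_target = nullptr; }

  // Simulates a late network event. Returns false if no read is registered,
  // so nothing is written into a buffer that may already be gone.
  bool on_data(int n) {
    if (pending_read_target == nullptr) {
      return false;
    }
    pending_read_target->bytes += n;
    return true;
  }
};

// Cancel the read before releasing the buffer, in every failure case.
void fail_post(ServerSession &session, std::unique_ptr<TunnelBuffer> &buffer) {
  session.cancel_read();
  buffer.reset();
}
```

In this model a late {{on_data}} call after {{fail_post}} is rejected instead of touching freed memory, and the buffer is released on every path.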



 Failed post results in tunnel buffers being returned to freelist prematurely
 

 Key: TS-2497
 URL: https://issues.apache.org/jira/browse/TS-2497
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Brian Geffon
Assignee: Brian Geffon
 Fix For: 4.2.0

 Attachments: TS-2497.patch, client.js, origin-server.js, repro.js




[jira] [Updated] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely

2015-01-23 Thread Feifei Cai (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feifei Cai updated TS-2497:
---
Attachment: client.js
origin-server.js

 Failed post results in tunnel buffers being returned to freelist prematurely
 

 Key: TS-2497
 URL: https://issues.apache.org/jira/browse/TS-2497
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Brian Geffon
Assignee: Brian Geffon
 Fix For: 4.2.0

 Attachments: TS-2497.patch, client.js, origin-server.js, repro.js


 When a post fails to an origin server either the server died or the server 
 returned a response without reading all of the post data, in either case, TS 
 will destroy buffers too early. This normally does not result in a crash 
 because the MIOBuffers are returned to the freelist and only with sufficient 
 load will the race happen causing a crash. Additionally, even if a crash 
 doesn't happen you might have data corruption across post requests from the 
 buffers being used after being returned to the freelist.
 Thanks to Thomas Jackson for help reproducing and resolving this bug.
 An example stack trace, while we've seen other crashes in write_avail too.
 #0  0x004eff14 in IOBufferBlock::read_avail (this=0x0) at 
 ../iocore/eventsystem/I_IOBuffer.h:362
 #1  0x0050d151 in MIOBuffer::append_block_internal 
 (this=0x2aab38001130, b=0x2aab0c037200) at 
 ../iocore/eventsystem/P_IOBuffer.h:946
 #2  0x0050d39b in MIOBuffer::append_block (this=0x2aab38001130, 
 asize_index=15) at ../iocore/eventsystem/P_IOBuffer.h:986
 #3  0x0050d49b in MIOBuffer::add_block (this=0x2aab38001130) at 
 ../iocore/eventsystem/P_IOBuffer.h:994
 #4  0x0055cee2 in MIOBuffer::check_add_block (this=0x2aab38001130) at 
 ../iocore/eventsystem/P_IOBuffer.h:1002
 #5  0x0055d115 in MIOBuffer::write_avail (this=0x2aab38001130) at 
 ../iocore/eventsystem/P_IOBuffer.h:1048
 #6  0x006c18f3 in read_from_net (nh=0x2aaafca0d208, 
 vc=0x2aab1c009140, thread=0x2aaafca0a010) at UnixNetVConnection.cc:234
 #7  0x006c37bf in UnixNetVConnection::net_read_io 
 (this=0x2aab1c009140, nh=0x2aaafca0d208, lthread=0x2aaafca0a010) at 
 UnixNetVConnection.cc:816
 #8  0x006be392 in NetHandler::mainNetEvent (this=0x2aaafca0d208, 
 event=5, e=0x271d8e0) at UnixNet.cc:380
 #9  0x004f05c4 in Continuation::handleEvent (this=0x2aaafca0d208, 
 event=5, data=0x271d8e0) at ../iocore/eventsystem/I_Continuation.h:146
 #10 0x006e361e in EThread::process_event (this=0x2aaafca0a010, 
 e=0x271d8e0, calling_code=5) at UnixEThread.cc:142
 #11 0x006e3b13 in EThread::execute (this=0x2aaafca0a010) at 
 UnixEThread.cc:264
 #12 0x006e290b in spawn_thread_internal (a=0x2716400) at Thread.cc:88
 #13 0x003372c077e1 in start_thread () from /lib64/libpthread.so.0
 #14 0x0033728e68ed in clone () from /lib64/libc.so.6





[jira] [Commented] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely

2015-01-23 Thread Feifei Cai (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289206#comment-14289206
 ] 

Feifei Cai commented on TS-2497:


A memory leak has been noticed on our production hosts. It appears to be 
related to handling 5xx responses from the origin server.

The dump info below is from one host with ~70% POST requests. I enabled the 
memory dump ({{proxy.config.dump_mem_info_frequency}}) and memory tracking 
({{proxy.config.res_track_memory}}).

*traffic.out:*
{noformat}
   allocated |      in-use |  type size | free list name
-------------|-------------|------------|---------------------------
           0 |           0 |    2097152 | memory/ioBufAllocator[14]
           0 |           0 |    1048576 | memory/ioBufAllocator[13]
           0 |           0 |     524288 | memory/ioBufAllocator[12]
           0 |           0 |     262144 | memory/ioBufAllocator[11]
           0 |           0 |     131072 | memory/ioBufAllocator[10]
           0 |           0 |      65536 | memory/ioBufAllocator[9]
  1266679808 |  1262354432 |      32768 | memory/ioBufAllocator[8]
   600309760 |   599703552 |      16384 | memory/ioBufAllocator[7]
   395051008 |   391086080 |       8192 | memory/ioBufAllocator[6]
   229113856 |   224432128 |       4096 | memory/ioBufAllocator[5]
   342622208 |   342503424 |       2048 | memory/ioBufAllocator[4]
   245104640 |   245042176 |       1024 | memory/ioBufAllocator[3]
     2228224 |     2176512 |        512 | memory/ioBufAllocator[2]
      622592 |      607232 |        256 | memory/ioBufAllocator[1]
     2375680 |     2370176 |        128 | memory/ioBufAllocator[0]
                                          Location |  Size In-use
---------------------------------------------------+-------------
  memory/IOBuffer/ProtocolProbeSessionAccept.cc:39 |     66768896
          memory/IOBuffer/HttpClientSession.cc:230 |            0
                    memory/IOBuffer/HttpSM.cc:3314 |            0
                    memory/IOBuffer/HttpSM.cc:5349 |   3003506816
                    memory/IOBuffer/HttpSM.cc:5668 |            0
                    memory/IOBuffer/HttpSM.cc:5874 |            0
                    memory/IOBuffer/HttpSM.cc:5976 |            0
                    memory/IOBuffer/HttpSM.cc:6267 |            0
           memory/IOBuffer/HttpServerSession.cc:87 |            0
                  memory/IOBuffer/HttpTunnel.cc:95 |            0
                 memory/IOBuffer/HttpTunnel.cc:100 |            0
                                             TOTAL |   3070275712
{noformat}


I adapted [~shaunmcginnity]'s node.js scripts with some changes and reproduced 
the memory leak in my local environment.

# origin-server.js
This origin server responds with a 503 as soon as it receives more than a 
single byte, so in most cases the POST does not complete. I changed 
[~shaunmcginnity]'s code so that the origin server sends a response to ATS, 
which makes ATS hit a different code path.
# client.js
We create a new client every second, and each client tries to POST 32K bytes 
of data.
# ATS
*remap.config*: remap everything to local port 5000
{quote}map / http://127.0.0.1:5000{quote}
*records.config*: listen on port 80
{quote}CONFIG proxy.config.http.server_ports STRING 80{quote}

With this setup the dump info looks as follows, and the in-use count of 
MIOBuffers with index=8 (size=32K) increases by 1 per second.

{noformat}
   allocated |      in-use |  type size | free list name
-------------|-------------|------------|---------------------------
     1048576 |       32768 |      32768 | memory/ioBufAllocator[8]
{noformat}

We can also change the Content-Length in client.js to a smaller size; the 
in-use count of MIOBuffers with the corresponding index (0-7) then increases 
as well.
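
The relationship between allocator index, buffer size, and the in-use figures in the dumps above can be sketched as follows. This is only an illustration: the rule "type size = 128 << index" is inferred from the dump table, the helper names are hypothetical and not part of the Traffic Server API, and whether ATS picks the index for a given Content-Length exactly this way is an assumption.

```cpp
#include <cstdint>

// Assumption inferred from the dump: type size doubles per index, from 128.
constexpr std::int64_t kBaseSize = 128;
constexpr int kMaxIndex = 14;

// Size in bytes of one buffer in ioBufAllocator[index] (hypothetical helper).
constexpr std::int64_t size_for_index(int index) {
  return kBaseSize << index;
}

// Smallest index whose buffer size can hold `bytes`, or -1 if none can.
inline int index_for_size(std::int64_t bytes) {
  for (int i = 0; i <= kMaxIndex; ++i) {
    if (size_for_index(i) >= bytes) {
      return i;
    }
  }
  return -1;
}

// Number of live buffers implied by an in-use byte count for one index.
inline std::int64_t buffers_in_use(std::int64_t in_use_bytes, int index) {
  return in_use_bytes / size_for_index(index);
}
```

Under these assumptions, `index_for_size(32768)` is 8, and `buffers_in_use(1262354432, 8)` is 38524, so the production dump corresponds to roughly 38.5 thousand 32K buffers still marked in use.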


I added this simple patch to prevent the memory leak in the case above, 
similar to the last commit, and verified it on one test host.

free.diff
{code}
diff --git a/proxy/http/HttpSM.cc b/proxy/http/HttpSM.cc
index 932ef97..123b97a 100644
--- a/proxy/http/HttpSM.cc
+++ b/proxy/http/HttpSM.cc
@@ -5074,6 +5074,7 @@ HttpSM::handle_post_failure()
   t_state.current.server->keep_alive = HTTP_NO_KEEPALIVE;
 
   if (server_buffer_reader->read_avail() > 0) {
+    tunnel.deallocate_buffers();
     tunnel.reset();
     // There's data from the server so try to read the header
     setup_server_read_response_header();
{code}
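
To illustrate why the ordering in the patch matters, here is a toy model. It is not Traffic Server code: it assumes that resetting a tunnel forgets its buffer pointers without returning them to the allocator, which the patch suggests but this thread does not spell out. `ToyAllocator`, `ToyTunnel`, and `leaked_after` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical allocator that only counts outstanding buffers.
struct ToyAllocator {
  long in_use = 0;
  char *alloc(std::size_t n) { ++in_use; return new char[n]; }
  void release(char *p) { --in_use; delete[] p; }
};

// Hypothetical tunnel that owns raw buffers obtained from the allocator.
struct ToyTunnel {
  ToyAllocator &allocator;
  std::vector<char *> buffers;

  explicit ToyTunnel(ToyAllocator &a) : allocator(a) {}

  void add_buffer(std::size_t n) { buffers.push_back(allocator.alloc(n)); }

  // Returns every buffer to the allocator.
  void deallocate_buffers() {
    for (char *b : buffers) {
      allocator.release(b);
    }
    buffers.clear();
  }

  // Assumed behaviour: forgets the buffers without freeing them.
  void reset() { buffers.clear(); }
};

// Simulates `posts` failed POSTs and returns the buffers still in use.
long leaked_after(int posts, bool deallocate_first) {
  ToyAllocator allocator;
  ToyTunnel tunnel(allocator);
  for (int i = 0; i < posts; ++i) {
    tunnel.add_buffer(32768);
    if (deallocate_first) {
      tunnel.deallocate_buffers();
    }
    tunnel.reset();
  }
  return allocator.in_use;
}
```

In this model, calling `reset()` alone leaves one buffer in use per failed POST, which matches the one-per-second growth seen in the reproduction, while calling `deallocate_buffers()` first leaves none.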

*traffic.out*
{noformat}
 allocated  |in-use  | type size  |   free list name

[jira] [Comment Edited] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely

2015-01-23 Thread Feifei Cai (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289206#comment-14289206
 ] 

Feifei Cai edited comment on TS-2497 at 1/23/15 12:54 PM:
--

A memory leak has been noticed on our production hosts. It appears to be 
related to handling 5xx responses from the origin server.

The dump info below is from one host with ~70% POST requests. I enabled the 
memory dump ({{proxy.config.dump_mem_info_frequency}}) and memory tracking 
({{proxy.config.res_track_memory}}).

*traffic.out:*
{noformat}
   allocated |      in-use |  type size | free list name
-------------|-------------|------------|---------------------------
           0 |           0 |    2097152 | memory/ioBufAllocator[14]
           0 |           0 |    1048576 | memory/ioBufAllocator[13]
           0 |           0 |     524288 | memory/ioBufAllocator[12]
           0 |           0 |     262144 | memory/ioBufAllocator[11]
           0 |           0 |     131072 | memory/ioBufAllocator[10]
           0 |           0 |      65536 | memory/ioBufAllocator[9]
  1266679808 |  1262354432 |      32768 | memory/ioBufAllocator[8]
   600309760 |   599703552 |      16384 | memory/ioBufAllocator[7]
   395051008 |   391086080 |       8192 | memory/ioBufAllocator[6]
   229113856 |   224432128 |       4096 | memory/ioBufAllocator[5]
   342622208 |   342503424 |       2048 | memory/ioBufAllocator[4]
   245104640 |   245042176 |       1024 | memory/ioBufAllocator[3]
     2228224 |     2176512 |        512 | memory/ioBufAllocator[2]
      622592 |      607232 |        256 | memory/ioBufAllocator[1]
     2375680 |     2370176 |        128 | memory/ioBufAllocator[0]
                                          Location |  Size In-use
---------------------------------------------------+-------------
  memory/IOBuffer/ProtocolProbeSessionAccept.cc:39 |     66768896
          memory/IOBuffer/HttpClientSession.cc:230 |            0
                    memory/IOBuffer/HttpSM.cc:3314 |            0
                    memory/IOBuffer/HttpSM.cc:5349 |   3003506816
                    memory/IOBuffer/HttpSM.cc:5668 |            0
                    memory/IOBuffer/HttpSM.cc:5874 |            0
                    memory/IOBuffer/HttpSM.cc:5976 |            0
                    memory/IOBuffer/HttpSM.cc:6267 |            0
           memory/IOBuffer/HttpServerSession.cc:87 |            0
                  memory/IOBuffer/HttpTunnel.cc:95 |            0
                 memory/IOBuffer/HttpTunnel.cc:100 |            0
                                             TOTAL |   3070275712
{noformat}


I adapted [~shaunmcginnity]'s node.js scripts with some changes and reproduced 
the memory leak in my local environment.

# [origin-server.js|https://issues.apache.org/jira/secure/attachment/12694145/origin-server.js]
This origin server responds with a 503 as soon as it receives more than a 
single byte, so in most cases the POST does not complete. I changed 
[~shaunmcginnity]'s code so that the origin server sends a response to ATS, 
which makes ATS hit a different code path.
# [client.js|https://issues.apache.org/jira/secure/attachment/12694146/client.js]
We create a new client every second, and each client tries to POST 32K bytes 
of data.
# ATS
*remap.config*: remap everything to local port 5000
{quote}map / http://127.0.0.1:5000{quote}
*records.config*: listen on port 80
{quote}CONFIG proxy.config.http.server_ports STRING 80{quote}

With this setup the dump info looks as follows, and the in-use count of 
MIOBuffers with index=8 (size=32K) increases by 1 per second.

{noformat}
   allocated |      in-use |  type size | free list name
-------------|-------------|------------|---------------------------
     1048576 |       32768 |      32768 | memory/ioBufAllocator[8]
{noformat}

We can also change the Content-Length in client.js to a smaller size; the 
in-use count of MIOBuffers with the corresponding index (0-7) then increases 
as well.


I added this simple patch to prevent the memory leak in the case above, 
similar to the last commit, and verified it on one test host.

free.diff
{code}
diff --git a/proxy/http/HttpSM.cc b/proxy/http/HttpSM.cc
index 932ef97..123b97a 100644
--- a/proxy/http/HttpSM.cc
+++ b/proxy/http/HttpSM.cc
@@ -5074,6 +5074,7 @@ HttpSM::handle_post_failure()
   t_state.current.server->keep_alive = HTTP_NO_KEEPALIVE;
 
   if (server_buffer_reader->read_avail() > 0) {
+    tunnel.deallocate_buffers();
     tunnel.reset();
     // There's data from the server so try to read the header
 

[jira] [Commented] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely

2015-01-23 Thread Sudheer Vinukonda (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289453#comment-14289453
 ] 

Sudheer Vinukonda commented on TS-2497:
---

I wonder if the issue that was originally fixed by [~briang] is similar to the 
issue resolved in TS-3285 (freeing the MIOBuffer while there's a write/read in 
progress, which could eventually corrupt the buffer on the free list).



[jira] [Comment Edited] (TS-2497) Failed post results in tunnel buffers being returned to freelist prematurely

2015-01-23 Thread Sudheer Vinukonda (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289453#comment-14289453
 ] 

Sudheer Vinukonda edited comment on TS-2497 at 1/23/15 4:17 PM:


I wonder if the issue that was originally fixed by [~briang] is similar to the 
issue resolved in TS-3285 (freeing the MIOBuffer while there's a write/read in 
progress, which could eventually corrupt the buffer on the free list). Also 
refer to TS-3286 for some proposed improvements to detect buffer corruption 
sooner and more easily.


was (Author: sudheerv):
I wonder if the issue that was originally fixed by [~briang] is similar to the 
issue resolved in TS-3285 (freeing the MIOBuffer while there's a write/read in 
progress, which could eventually corrupt the buffer on the free list).
