[
https://issues.apache.org/jira/browse/TS-2776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014662#comment-14014662
]
Sudheer Vinukonda edited comment on TS-2776 at 5/31/14 2:01 PM:
----------------------------------------------------------------
Found another core this morning, so reverting TS-2271 doesn't seem to resolve
the crash entirely, although the frequency seems to be lower (note that the
core frequency was also lower with just TS-2815 reverted).
May 29 14:52:21 l10 kernel: [ET_SSL 19][1775]: segfault at 3 ip
000000331622b2e7 sp 00002b698e523ab0 error 6 in
libssl.so.1.0.1e[3316200000+61000]
May 30 11:09:14 l10 kernel: gdb[16350]: segfault at 7f7948f027c4 ip
000000000054f75c sp 00007fff19f51790 error 4 in gdb[400000+436000]
May 31 07:48:23 l10 kernel: [ET_SSL 10][20272]: segfault at 3 ip
000000331622b2e7 sp 00002b8edc402aa0 error 6 in
libssl.so.1.0.1e[3316200000+61000]
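The kernel lines above can be decoded a little further. The following is a minimal sketch, assuming the usual x86-64 Linux "segfault at" log format and the standard x86 page-fault error bits (bit 0: page present, bit 1: write, bit 2: user mode). The function name decode_segfault is hypothetical. It computes the faulting instruction's offset inside the mapped object (ip minus the mapping base), which is the value one would typically look up in the library's disassembly.

```python
import re

# Pattern for the kernel "segfault at ..." lines quoted above.
# All numeric fields, including the error code, are printed in hex.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return the decoded fields of one segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    err = int(m.group("err"), 16)
    return {
        "object": m.group("obj"),
        "fault_addr": int(m.group("addr"), 16),
        "offset": ip - base,             # offset of the faulting instruction in the object
        "page_present": bool(err & 1),   # bit 0: fault on a present page
        "write": bool(err & 2),          # bit 1: write access (otherwise read)
        "user_mode": bool(err & 4),      # bit 2: fault happened in user mode
    }


line = ("May 31 07:48:23 l10 kernel: [ET_SSL 10][20272]: segfault at 3 ip "
        "000000331622b2e7 sp 00002b8edc402aa0 error 6 in "
        "libssl.so.1.0.1e[3316200000+61000]")
info = decode_segfault(line)
print(info["object"], hex(info["offset"]), info)
```

For the two ET_SSL lines this gives the same offset into libssl and a user-mode write to address 3 on a non-present page, which is consistent with a write through a near-NULL pointer.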
From the gdb output, the IOBufferBlock pointer seems to be out of bounds (gdb
cannot access the memory it points to).
{code}
#0 0x000000331622b2e7 in ?? () from /usr/lib64/libssl.so.10
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.107.el6.x86_64
gmp-4.3.1-7.el6_2.2.x86_64 hwloc-1.5-1.el6.x86_64
keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.10.3-10.el6_4.2.x86_64
libattr-2.4.44-7.el6.x86_64 libcap-2.16-5.5.el6.x86_64
libcom_err-1.41.12-14.el6.x86_64 libgcc-4.4.7-3.el6.x86_64
libselinux-2.0.94-5.3.el6_4.1.x86_64 libstdc++-4.4.7-3.el6.x86_64
libxml2-2.7.6-12.el6_4.1.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64
numactl-2.0.7-6.el6.x86_64 openssl-1.0.1e-16.el6_5.7.x86_64
pciutils-libs-3.1.10-2.el6.x86_64 pcre-7.8-6.el6.x86_64 tcl-8.5.7-6.el6.x86_64
xz-libs-4.999.9-0.3.beta.20091007git.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0 0x000000331622b2e7 in ?? () from /usr/lib64/libssl.so.10
#1 0x000000331622b905 in ssl3_write_bytes () from /usr/lib64/libssl.so.10
#2 0x00000000006fba6f in do_SSL_write (this=0x2b8fff268640, towrite=255944,
wattempted=@0x2b8edc402c28, total_wrote=@0x2b8edc402c30, buf=<value optimized
out>,
needs=@0x2b8edc402c38) at SSLNetVConnection.cc:90
#3 SSLNetVConnection::load_buffer_and_write (this=0x2b8fff268640,
towrite=255944, wattempted=@0x2b8edc402c28, total_wrote=@0x2b8edc402c30,
buf=<value optimized out>,
needs=@0x2b8edc402c38) at SSLNetVConnection.cc:399
#4 0x000000000070ee48 in write_to_net_io (nh=0x2b8ed6a0dbf0,
vc=0x2b8fff268640, thread=0x2b8ed6a0a010) at UnixNetVConnection.cc:444
#5 0x0000000000704b33 in NetHandler::mainNetEvent (this=0x2b8ed6a0dbf0,
event=<value optimized out>, e=<value optimized out>) at UnixNet.cc:415
#6 0x000000000073141f in handleEvent (this=0x2b8ed6a0a010, e=0x18fc370,
calling_code=5) at I_Continuation.h:146
#7 EThread::process_event (this=0x2b8ed6a0a010, e=0x18fc370, calling_code=5)
at UnixEThread.cc:145
#8 0x0000000000731dc3 in EThread::execute (this=0x2b8ed6a0a010) at
UnixEThread.cc:269
#9 0x00000000007307ca in spawn_thread_internal (a=0x1bcc3c0) at Thread.cc:88
#10 0x00002b8ec1df1851 in start_thread () from /lib64/libpthread.so.0
#11 0x0000003296ee890d in clone () from /lib64/libc.so.6
(gdb) up
#1 0x000000331622b905 in ssl3_write_bytes () from /usr/lib64/libssl.so.10
(gdb) up
#2 0x00000000006fba6f in do_SSL_write (this=0x2b8fff268640, towrite=255944,
wattempted=@0x2b8edc402c28, total_wrote=@0x2b8edc402c30, buf=<value optimized
out>,
needs=@0x2b8edc402c38) at SSLNetVConnection.cc:90
90 SSLNetVConnection.cc: No such file or directory.
in SSLNetVConnection.cc
(gdb) up
#3 SSLNetVConnection::load_buffer_and_write (this=0x2b8fff268640,
towrite=255944, wattempted=@0x2b8edc402c28, total_wrote=@0x2b8edc402c30,
buf=<value optimized out>,
needs=@0x2b8edc402c38) at SSLNetVConnection.cc:399
399 in SSLNetVConnection.cc
(gdb) down
#2 0x00000000006fba6f in do_SSL_write (this=0x2b8fff268640, towrite=255944,
wattempted=@0x2b8edc402c28, total_wrote=@0x2b8edc402c30, buf=<value optimized
out>,
needs=@0x2b8edc402c38) at SSLNetVConnection.cc:90
90 in SSLNetVConnection.cc
(gdb) print ssl
$1 = (SSL *) 0x2b8fb3b8be70
(gdb)
$2 = (SSL *) 0x2b8fb3b8be70
(gdb) print *ssl
$3 = {version = 771, type = 8192, method = 0x3316464600, rbio = 0x2b8fb28ebf00,
wbio = 0x2b8fb28ebf00, bbio = 0x0, rwstate = 1, in_handshake = 0,
handshake_func = 0x33162217c0 <ssl3_accept>, server = 1, new_session = 0,
quiet_shutdown = 1, shutdown = 0, state = 3, rstate = 240, init_buf = 0x0,
init_msg = 0x2b8fb2263b84, init_num = 0, init_off = 0, packet =
0x2b8fae830a63 "\027\003\003\t0", packet_length = 0, s2 = 0x0, s3 =
0x2b8fb075b4a0, d1 = 0x0,
read_ahead = 0, msg_callback = 0, msg_callback_arg = 0x0, hit = 0, param =
0x2b8fb1c08d50, cipher_list = 0x0, cipher_list_by_id = 0x0, mac_flags = 0,
enc_read_ctx = 0x2b8fb0b043c0, read_hash = 0x2b92452a86c0, expand = 0x0,
enc_write_ctx = 0x2b9245357a90, write_hash = 0x2b8fb113d3b0, compress = 0x0,
cert = 0x2b9244ccf6f0, sid_ctx_length = 0, sid_ctx = '\000' <repeats 31
times>, session = 0x2b8fb381ac20, generate_session_id = 0, verify_mode = 0,
verify_callback = 0,
info_callback = 0, error = 0, error_code = 0, kssl_ctx = 0x2b8fb2ddce10,
psk_client_callback = 0, psk_server_callback = 0, ctx = 0x1badc00, debug = 0,
verify_result = 0,
ex_data = {sk = 0x2b8fb202fdc0, dummy = 0}, client_CA = 0x0, references = 1,
options = 2170227703, mode = 16, max_cert_list = 102400, first_packet = 0,
client_version = 771, max_send_fragment = 16384, tlsext_debug_cb = 0,
tlsext_debug_arg = 0x0, tlsext_hostname = 0x0, servername_done = 1,
tlsext_status_type = -1,
tlsext_status_expected = 0, tlsext_ocsp_ids = 0x0, tlsext_ocsp_exts = 0x0,
tlsext_ocsp_resp = 0x0, tlsext_ocsp_resplen = -1, tlsext_ticket_expected = 1,
tlsext_ecpointformatlist_length = 3, tlsext_ecpointformatlist =
0x2b92455898b0 <Address 0x2b92455898b0 out of bounds>,
tlsext_ellipticcurvelist_length = 0,
tlsext_ellipticcurvelist = 0x0, tlsext_opaque_prf_input = 0x0,
tlsext_opaque_prf_input_len = 0, tlsext_session_ticket = 0x0,
tls_session_ticket_ext_cb = 0,
tls_session_ticket_ext_cb_arg = 0x0, tls_session_secret_cb = 0,
tls_session_secret_cb_arg = 0x0, initial_ctx = 0x1badc00,
next_proto_negotiated = 0x2b8fb2ff04e0 "spdy/3.1\260\323\023\261\217+",
next_proto_negotiated_len = 8 '\b', srtp_profiles = 0x0, srtp_profile = 0x0,
tlsext_heartbeat = 0, tlsext_hb_pending = 0, tlsext_hb_seq = 0, renegotiate =
0}
(gdb) print b
$4 = 0
(gdb) up
#3 SSLNetVConnection::load_buffer_and_write (this=0x2b8fff268640,
towrite=255944, wattempted=@0x2b8edc402c28, total_wrote=@0x2b8edc402c30,
buf=<value optimized out>,
needs=@0x2b8edc402c38) at SSLNetVConnection.cc:399
399 in SSLNetVConnection.cc
(gdb) print b
$5 = (IOBufferBlock *) 0x2b91e4424e40
(gdb) print ssl
$6 = (SSL *) 0x2b8fb3b8be70
(gdb) print *b
Cannot access memory at address 0x2b91e4424e40
(gdb) print b->start()
Cannot evaluate function -- may be inlined
(gdb) print l
$7 = 4096
(gdb) print (IOBufferBlock *)b
$8 = (IOBufferBlock *) 0x2b91e4424e40
(gdb) print *(IOBufferBlock *)b
Cannot access memory at address 0x2b91e4424e40
(gdb) print offset
$9 = 0
{code}
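A few of the raw integers in the `print *ssl` output above are easier to read once decoded. The sketch below is illustrative only: the values are copied from the gdb output, the record header layout is the standard 5-byte TLS record header, and the SSL_MODE_* bit values are an assumption taken from the OpenSSL 1.0.1 ssl.h as I recall it, so they should be checked against the installed header.

```python
import struct

# Values copied from the gdb output above.
version = 771
packet = b"\027\003\003\t0"   # gdb shows these bytes with octal escapes
mode = 16

# Protocol versions as (major << 8) | minor.
TLS_VERSIONS = {0x0300: "SSL 3.0", 0x0301: "TLS 1.0",
                0x0302: "TLS 1.1", 0x0303: "TLS 1.2"}

# TLS record content types.
CONTENT_TYPES = {20: "change_cipher_spec", 21: "alert",
                 22: "handshake", 23: "application_data"}

# ASSUMPTION: bit values as defined in OpenSSL 1.0.1 ssl.h; verify locally.
SSL_MODES = {
    0x01: "SSL_MODE_ENABLE_PARTIAL_WRITE",
    0x02: "SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER",
    0x04: "SSL_MODE_AUTO_RETRY",
    0x08: "SSL_MODE_NO_AUTO_CHAIN",
    0x10: "SSL_MODE_RELEASE_BUFFERS",
}

# Record header: 1-byte content type, 2-byte version, 2-byte length (big-endian).
content_type, record_version, record_length = struct.unpack("!BHH", packet)
mode_names = [name for bit, name in sorted(SSL_MODES.items()) if mode & bit]

print("negotiated version:", TLS_VERSIONS.get(version, hex(version)))
print("record type:", CONTENT_TYPES.get(content_type, content_type))
print("record version:", TLS_VERSIONS.get(record_version, hex(record_version)))
print("record length:", record_length)
print("mode bits:", mode_names)
```

Under these assumptions the connection is TLS 1.2, the buffered record is application data, and the only mode bit set is the one at 0x10.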
> Core dump inside openssl library
> --------------------------------
>
> Key: TS-2776
> URL: https://issues.apache.org/jira/browse/TS-2776
> Project: Traffic Server
> Issue Type: Bug
> Components: SPDY, SSL
> Reporter: Sudheer Vinukonda
> Assignee: Bryan Call
> Priority: Blocker
> Labels: spdy, yahoo
> Fix For: 5.0.0
>
>
> During production testing of SPDY (with ATS compiled from the master repo), we
> noticed the core below inside the OpenSSL library. This core showed up after
> fixing a few other core dumps and memory leaks (see TS-2742, TS-2743, TS-2750,
> TS-2765, TS-2767, etc.) and has happened once so far, after running stably for
> 30+ hours on a production host.
> {code}
> [example_prep.sh] Checking/Moving old cores...
> [TrafficServer] using root directory '/home/y'
> [TrafficManager] ==> Cleaning up and reissuing signal #15
> [TrafficManager] ==> signal #15
> [E. Mgmt] log ==> [TrafficManager] using root directory '/home/y'
> [example_prep.sh] Checking/Moving old cores...
> [TrafficServer] using root directory '/home/y'
> NOTE: Traffic Server received Sig 11: Segmentation fault
> /home/y/bin/traffic_server - STACK TRACE:
> /lib64/libpthread.so.0(+0x329720f500)[0x2b7dff1f6500]
> /usr/lib64/libssl.so.10[0x331622b2e7]
> /usr/lib64/libssl.so.10(ssl3_write_bytes+0x75)[0x331622b905]
> /home/y/bin/traffic_server(_ZN17SSLNetVConnection21load_buffer_and_writeElRlS0_R17MIOBufferAccessorRi+0xdf)[0x6db57f]
> /home/y/bin/traffic_server(_Z15write_to_net_ioP10NetHandlerP18UnixNetVConnectionP7EThread+0x418)[0x6ef488]
> /home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x283)[0x6e3ee3]
> /home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0x8f)[0x7109ff]
> /home/y/bin/traffic_server(_ZN7EThread7executeEv+0x43b)[0x71122b]
> /home/y/bin/traffic_server[0x70fdaa]
> /lib64/libpthread.so.0(+0x3297207851)[0x2b7dff1ee851]
> /lib64/libc.so.6(clone+0x6d)[0x3296ee890d]
> [E. Mgmt] log ==> [TrafficManager] using root directory '/home/y'
> [example_prep.sh] Checking/Moving old cores...
> [TrafficServer] using root directory '/home/y'
> {code}
> gdb output below:
> {code}
> (gdb) bt
> #0 0x000000331622b2e7 in ?? () from /usr/lib64/libssl.so.10
> #1 0x000000331622b905 in ssl3_write_bytes () from /usr/lib64/libssl.so.10
> #2 0x00000000006db57f in do_SSL_write (this=0x2b7f42c7ca70, towrite=207693,
> wattempted=@0x2b7e111aec28, total_wrote=@0x2b7e111aec30, buf=<value optimized
> out>, needs=@0x2b7e111aec38)
> at SSLNetVConnection.cc:90
> #3 SSLNetVConnection::load_buffer_and_write (this=0x2b7f42c7ca70,
> towrite=207693, wattempted=@0x2b7e111aec28, total_wrote=@0x2b7e111aec30,
> buf=<value optimized out>, needs=@0x2b7e111aec38)
> at SSLNetVConnection.cc:399
> #4 0x00000000006ef488 in write_to_net_io (nh=0x2b7e0f89ac10,
> vc=0x2b7f42c7ca70, thread=0x2b7e0f897010) at UnixNetVConnection.cc:527
> #5 0x00000000006e3ee3 in NetHandler::mainNetEvent (this=0x2b7e0f89ac10,
> event=<value optimized out>, e=<value optimized out>) at UnixNet.cc:400
> #6 0x00000000007109ff in handleEvent (this=0x2b7e0f897010, e=0x31ae0d0,
> calling_code=5) at I_Continuation.h:146
> #7 EThread::process_event (this=0x2b7e0f897010, e=0x31ae0d0, calling_code=5)
> at UnixEThread.cc:145
> #8 0x000000000071122b in EThread::execute (this=0x2b7e0f897010) at
> UnixEThread.cc:269
> #9 0x000000000070fdaa in spawn_thread_internal (a=0x2f8fbe0) at Thread.cc:88
> #10 0x00002b7dff1ee851 in start_thread () from /lib64/libpthread.so.0
> #11 0x0000003296ee890d in clone () from /lib64/libc.so.6
> (gdb) up
> #1 0x000000331622b905 in ssl3_write_bytes () from /usr/lib64/libssl.so.10
> (gdb) up
> #2 0x00000000006db57f in do_SSL_write (this=0x2b7f42c7ca70, towrite=207693,
> wattempted=@0x2b7e111aec28, total_wrote=@0x2b7e111aec30, buf=<value optimized
> out>, needs=@0x2b7e111aec38)
> at SSLNetVConnection.cc:90
> 90 SSLNetVConnection.cc: No such file or directory.
> in SSLNetVConnection.cc
> (gdb)
> {code}