Greg Mann created MESOS-5986:
--------------------------------
Summary: SSL Socket CHECK can fail when socket is destroyed
Key: MESOS-5986
URL: https://issues.apache.org/jira/browse/MESOS-5986
Project: Mesos
Issue Type: Bug
Reporter: Greg Mann
Priority: Blocker
Fix For: 1.0.1
While writing a test for MESOS-3753, I encountered a bug where [this
check|https://github.com/apache/mesos/blob/853821cafcca3550b9c7bdaba5262d73869e2ee1/3rdparty/libprocess/src/libevent_ssl_socket.cpp#L708]
fails at the very end of the test body, while objects in the stack frame are
being destroyed. After adding some debug logging output, I produced the
following:
{code}
I0804 08:32:33.263211 273793024 libevent_ssl_socket.cpp:681] *** in send()17
I0804 08:32:33.263209 273256448 process.cpp:2970] Cleaning up
__limiter__(3)@127.0.0.1:55688
I0804 08:32:33.263263 275939328 libevent_ssl_socket.cpp:152] *** in
initialize(): 14
I0804 08:32:33.263206 272719872 process.cpp:2865] Resuming (61)@127.0.0.1:55688
at 2016-08-04 15:32:33.263261952+00:00
I0804 08:32:33.263327 275939328 libevent_ssl_socket.cpp:584] *** in recv()14
I0804 08:32:33.263337 272719872 hierarchical.cpp:571] Agent
e2a49340-34ec-403f-a5a4-15e29c4a2434-S0 deactivated
I0804 08:32:33.263322 275402752 process.cpp:2865] Resuming [email protected]:55688
at 2016-08-04 15:32:33.263343104+00:00
I0804 08:32:33.263510 275939328 libevent_ssl_socket.cpp:322] *** in
event_callback(bev)
I0804 08:32:33.263536 275939328 libevent_ssl_socket.cpp:353] *** in
event_callback check for EOF/CONNECTED/ERROR: 19
I0804 08:32:33.263592 275939328 libevent_ssl_socket.cpp:159] *** in shutdown():
19
I0804 08:32:33.263622 1985901312 process.cpp:3170] Donating thread to
(87)@127.0.0.1:55688 while waiting
I0804 08:32:33.263639 274329600 process.cpp:2865] Resuming
__http__(12)@127.0.0.1:55688 at 2016-08-04 15:32:33.263653888+00:00
I0804 08:32:33.263659 1985901312 process.cpp:2865] Resuming
(87)@127.0.0.1:55688 at 2016-08-04 15:32:33.263671040+00:00
I0804 08:32:33.263730 1985901312 process.cpp:2970] Cleaning up
(87)@127.0.0.1:55688
I0804 08:32:33.263741 275939328 libevent_ssl_socket.cpp:322] *** in
event_callback(bev)
I0804 08:32:33.263736 274329600 process.cpp:2970] Cleaning up
__http__(12)@127.0.0.1:55688
I0804 08:32:33.263778 275939328 libevent_ssl_socket.cpp:353] *** in
event_callback check for EOF/CONNECTED/ERROR: 17
I0804 08:32:33.263818 275939328 libevent_ssl_socket.cpp:159] *** in shutdown():
17
I0804 08:32:33.263839 272183296 process.cpp:2865] Resuming [email protected]:55688
at 2016-08-04 15:32:33.263857920+00:00
I0804 08:32:33.263933 273793024 process.cpp:2865] Resuming
[email protected]:55688 at 2016-08-04 15:32:33.263951104+00:00
I0804 08:32:33.264034 275939328 libevent_ssl_socket.cpp:681] *** in send()17
I0804 08:32:33.264020 272719872 process.cpp:2865] Resuming
__http__(11)@127.0.0.1:55688 at 2016-08-04 15:32:33.264041984+00:00
I0804 08:32:33.264036 274329600 process.cpp:2865] Resuming
status-update-manager(3)@127.0.0.1:55688 at 2016-08-04 15:32:33.264056064+00:00
I0804 08:32:33.264071 272719872 process.cpp:2970] Cleaning up
__http__(11)@127.0.0.1:55688
I0804 08:32:33.264088 274329600 process.cpp:2970] Cleaning up
status-update-manager(3)@127.0.0.1:55688
I0804 08:32:33.264086 275939328 libevent_ssl_socket.cpp:721] *** sending on
socket: 17, data: 0
I0804 08:32:33.264112 272183296 process.cpp:2865] Resuming (89)@127.0.0.1:55688
at 2016-08-04 15:32:33.264126976+00:00
I0804 08:32:33.264118 275402752 process.cpp:2865] Resuming [email protected]:55688
at 2016-08-04 15:32:33.264144896+00:00
I0804 08:32:33.264149 272183296 process.cpp:2970] Cleaning up
(89)@127.0.0.1:55688
I0804 08:32:33.264202 275939328 libevent_ssl_socket.cpp:281] *** in
send_callback(bev)
I0804 08:32:33.264400 273793024 process.cpp:3170] Donating thread to
(86)@127.0.0.1:55688 while waiting
I0804 08:32:33.264413 273256448 process.cpp:2865] Resuming (76)@127.0.0.1:55688
at 2016-08-04 15:32:33.264428032+00:00
I0804 08:32:33.296268 275939328 libevent_ssl_socket.cpp:300] *** in
send_callback(): 17
I0804 08:32:33.296419 273256448 process.cpp:2970] Cleaning up
(76)@127.0.0.1:55688
I0804 08:32:33.296357 273793024 process.cpp:2865] Resuming (86)@127.0.0.1:55688
at 2016-08-04 15:32:33.296414976+00:00
I0804 08:32:33.296464 273793024 process.cpp:2970] Cleaning up
(86)@127.0.0.1:55688
I0804 08:32:33.296497 275939328 libevent_ssl_socket.cpp:104] *** releasing SSL
socket
I0804 08:32:33.296517 275939328 libevent_ssl_socket.cpp:106] *** released SSL
socket: 19
I0804 08:32:33.296515 274329600 process.cpp:2865] Resuming [email protected]:55688
at 2016-08-04 15:32:33.296532992+00:00
I0804 08:32:33.296550 275939328 libevent_ssl_socket.cpp:721] *** sending on
socket: 17, data: 0
I0804 08:32:33.296583 273793024 process.cpp:2865] Resuming (77)@127.0.0.1:55688
at 2016-08-04 15:32:33.296616960+00:00
F0804 08:32:33.296623 275939328 libevent_ssl_socket.cpp:723] Check failed:
'self->send_request.get()' Must be non NULL
*** Check failure stack trace: ***
{code}
The {{in send()17}} line indicates the beginning of {{send()}} for the SSL
socket using FD 17. {{in shutdown(): 17}} indicates the beginning of
{{shutdown()}} for the same socket, while {{sending on socket: 17}} indicates
the execution of the lambda from {{send()}} on the event loop. Since
{{shutdown()}} was called in between the call to {{send()}} and the execution
of its lambda, it looks like the {{Socket}} was destroyed before the lambda
could run. It's unclear why this would happen, since {{send()}}'s lambda
captures a shared copy of the socket's {{this}} pointer in order to keep it
alive.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)