Hey guys,

Over the past couple of weeks I've been struggling with an old bug in Apache Mesos related to our implementation of SSL sockets, which is based on libevent. You can see the whole implementation in [1] and [2]. It affects newer systems such as Ubuntu 17.10 and OS X; a full description of the packages is in MESOS-8271 [3].

The problem can be triggered simply by updating libevent from 2.0.22 to 2.1.8. On Linux we see the issue when we call `accept()`. When we reach the event callback we keep getting `events & BEV_EVENT_ERROR && events & BEV_EVENT_READING` [4]; however, `EVUTIL_SOCKET_ERROR()` keeps returning 0, and `bufferevent_get_openssl_error(bev)` also shows no error. On the client we find that the connection was closed, and the symptoms are similar: we get `BEV_EVENT_ERROR` but no other diagnostics.

I include a full stack trace of a run where the problem appears (line numbers may be somewhat off because of debug output lines I added). I am happy to contribute any more info that would help.

```
process::network::internal::LibeventSSLSocketImpl::accept_SSL_callback(process::network::internal::LibeventSSLSocketImpl::AcceptRequest*)::$_10::operator()(bufferevent*, short, void*) const (this=0x160eae0, bev=0x7fffe4001a10, events=33, arg=0x7fffc4000c10) at ../../../3rdparty/libprocess/src/libevent_ssl_socket.cpp:1152
process::network::internal::LibeventSSLSocketImpl::accept_SSL_callback(process::network::internal::LibeventSSLSocketImpl::AcceptRequest*)::$_10::__invoke(bufferevent*, short, void*) (bev=0x7fffe4001a10, events=33, arg=0x7fffc4000c10) at ../../../3rdparty/libprocess/src/libevent_ssl_socket.cpp:1143
?? () from /usr/lib/x86_64-linux-gnu/libevent_openssl-2.1.so.6
?? () from /usr/lib/x86_64-linux-gnu/libevent_openssl-2.1.so.6
?? () from /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6
event_base_loop () from /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6
process::EventLoop::run () at ../../../3rdparty/libprocess/src/libevent.cpp:98
std::__invoke_impl<void, void (*)()> ( __f=@0x15cfc28: 0xd53680 <process::EventLoop::run()>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/bits/invoke.h:60
std::__invoke<void (*)()> ( __fn=@0x15cfc28: 0xd53680 <process::EventLoop::run()>) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/bits/invoke.h:95
std::thread::_Invoker<std::tuple<void (*)()> >::_M_invoke<0ul> (this=0x15cfc28) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/thread:234
std::thread::_Invoker<std::tuple<void (*)()> >::operator() (this=0x15cfc28) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/thread:243
std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)()> > >::_M_run (this=0x15cfc20) at /usr/bin/../lib/gcc/x86_64-linux-gnu/7.2.0/../../../../include/c++/7.2.0/thread:186
?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
start_thread (arg=0x7fffe9e6b700) at pthread_create.c:465
clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
```

[1] https://github.com/apache/mesos/blob/32b85a2b06f676b68a16deaa8359ae64a1e8ead9/3rdparty/libprocess/src/libevent_ssl_socket.cpp
[2] https://github.com/apache/mesos/blob/32b85a2b06f676b68a16deaa8359ae64a1e8ead9/3rdparty/libprocess/src/libevent_ssl_socket.hpp
[3] https://issues.apache.org/jira/browse/MESOS-8271
[4] https://github.com/apache/mesos/blob/32b85a2b06f676b68a16deaa8359ae64a1e8ead9/3rdparty/libprocess/src/libevent_ssl_socket.cpp#L1176

--
Alexander Rojas
[email protected]
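
P.S. For anyone reading the trace, the `events=33` value passed to the callback can be decoded with a small standalone sketch like the one below. The flag values are copied from libevent's `event2/bufferevent.h` as far as I know them and are redefined locally so the snippet compiles without libevent; the `kBevEvent*` names and the `describe_events` helper are hypothetical and are not part of Mesos or libevent.

```cpp
#include <string>

// Local copies of the bufferevent event flags (assumption: these match the
// values in libevent's <event2/bufferevent.h> for the 2.0.x and 2.1.x series).
constexpr short kBevEventReading   = 0x01;
constexpr short kBevEventWriting   = 0x02;
constexpr short kBevEventEof       = 0x10;
constexpr short kBevEventError     = 0x20;
constexpr short kBevEventTimeout   = 0x40;
constexpr short kBevEventConnected = 0x80;

// Hypothetical helper: render an event bitmask as flag names joined by '|'.
// Returns "0" when none of the known flags are set.
std::string describe_events(short events) {
  struct Flag {
    short bit;
    const char* name;
  };
  static const Flag flags[] = {
      {kBevEventReading, "BEV_EVENT_READING"},
      {kBevEventWriting, "BEV_EVENT_WRITING"},
      {kBevEventEof, "BEV_EVENT_EOF"},
      {kBevEventError, "BEV_EVENT_ERROR"},
      {kBevEventTimeout, "BEV_EVENT_TIMEOUT"},
      {kBevEventConnected, "BEV_EVENT_CONNECTED"},
  };

  std::string out;
  for (const Flag& f : flags) {
    if (events & f.bit) {
      if (!out.empty()) out += "|";
      out += f.name;
    }
  }
  return out.empty() ? "0" : out;
}
```

Under those assumptions, `describe_events(33)` yields `BEV_EVENT_READING|BEV_EVENT_ERROR`, i.e. an error reported while reading, with neither `BEV_EVENT_EOF` nor `BEV_EVENT_TIMEOUT` set.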
