Vinod Kone created MESOS-300:
--------------------------------
Summary: Libprocess throws exception in SocketManager::next()
Key: MESOS-300
URL: https://issues.apache.org/jira/browse/MESOS-300
Project: Mesos
Issue Type: Bug
Reporter: Vinod Kone
Assignee: Benjamin Hindman
Came across this while I was debugging an issue at Twitter.
{code}
I1025 18:34:52.799145 56374 dominant_share_allocator.cpp:417] Performed allocation for 1004 slaves in 337.449 milliseconds
F1025 18:34:53.633313 56380 process.cpp:1827] Check failed: outgoing.count(s) > 0
*** Check failure stack trace: ***
@ 0x7f68b604f03d google::LogMessage::Fail()
@ 0x7f68b6054ca7 google::LogMessage::SendToLog()
@ 0x7f68b60508ec google::LogMessage::Flush()
@ 0x7f68b6050b56 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f68b5f3679c process::SocketManager::next()
@ 0x7f68b5f37704 process::send_data()
@ 0x7f68b60940e3 ev_invoke_pending
@ 0x7f68b6099518 ev_loop
@ 0x7f68b5f3332a process::serve()
@ 0x7f68b531e73d start_thread
@ 0x7f68b4908f6d clone
Bottle server starting up (using WSGIRefServer())...
Listening on http://0.0.0.0:8080/
Use Ctrl-C to quit.
{code}
Grokking the code, there is a huge comment right above where this check
happens, stating that we cannot/shouldn't be doing this check.
{code}
Encoder* SocketManager::next(int s)
{
  HttpProxy* proxy = NULL; // Non-null if needs to be terminated.

  synchronized (this) {
    // We cannot assume 'sockets.count(s) > 0' here because it's
    // possible that 's' has been removed with a call to
    // SocketManager::close. For example, it could be the case that a
    // socket has gone to CLOSE_WAIT and the call to 'recv' in
    // recv_data returned 0 causing SocketManager::close to get
    // invoked. Later a call to 'send' or 'sendfile' (e.g., in
    // send_data or send_file) can "succeed" (because the socket is
    // not "closed" yet because there are still some Socket
    // references, namely the reference being used in send_data or
    // send_file!). However, when SocketManager::next is actually
    // invoked we find out that there is no more data and thus stop
    // sending.
    // TODO(benh): Should we actually finish sending the data!?
    if (sockets.count(s) > 0) {
      CHECK(outgoing.count(s) > 0);
{code}
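To make the failing invariant concrete, here is a minimal standalone sketch. It is not libprocess code: `ToySocketManager`, the `Encoder` struct, and `dropOutgoing` are hypothetical names invented for illustration, and the sketch assumes nothing about how the real SocketManager ends up with a socket that has no outgoing queue. It only models the state the CHECK trips on (an entry in 'sockets' with no matching entry in 'outgoing') and shows a lookup that returns instead of aborting. Whether returning is the right behavior for libprocess is not established here.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>

// Hypothetical stand-in for libprocess's Encoder; it only carries a payload.
struct Encoder {
  std::string data;
};

// Toy model of the two containers named in the report. Not libprocess code.
class ToySocketManager {
public:
  // Start tracking socket 's'.
  void open(int s) { sockets.insert(s); }

  // Queue an encoder for 's' (non-owning pointer; the caller keeps ownership).
  void enqueue(int s, Encoder* e) { outgoing[s].push(e); }

  // Hypothetical helper: forces the state seen in the crash, where 's' is
  // still in 'sockets' but has no entry in 'outgoing'.
  void dropOutgoing(int s) { outgoing.erase(s); }

  // Stop tracking 's' and discard anything still queued for it.
  void close(int s) {
    sockets.erase(s);
    outgoing.erase(s);
  }

  // Returns the next queued encoder for 's', or nullptr if there is none.
  Encoder* next(int s) {
    if (sockets.count(s) == 0) {
      return nullptr; // 's' was already closed.
    }

    std::map<int, std::queue<Encoder*>>::iterator it = outgoing.find(s);
    if (it == outgoing.end()) {
      // This is the state in which 'CHECK(outgoing.count(s) > 0)' would
      // abort; the sketch reports "nothing to send" instead.
      return nullptr;
    }

    if (it->second.empty()) {
      return nullptr; // Queue exists but holds no more encoders.
    }

    Encoder* encoder = it->second.front();
    it->second.pop();
    return encoder;
  }

private:
  std::set<int> sockets;
  std::map<int, std::queue<Encoder*>> outgoing;
};
```

In this toy model, calling next(s) after dropOutgoing(s) yields a null pointer rather than a fatal check failure, which is the difference between the guarded lookup and the unconditional CHECK quoted above.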