Benjamin Mahler created MESOS-1376:
--------------------------------------

             Summary: CHECK failure in the Registrar
                 Key: MESOS-1376
                 URL: https://issues.apache.org/jira/browse/MESOS-1376
             Project: Mesos
          Issue Type: Bug
          Components: master
    Affects Versions: 0.19.0
            Reporter: Benjamin Mahler
            Priority: Blocker
             Fix For: 0.19.0


{noformat}
I0515 05:44:37.049137  7179 master.cpp:2301] Ignoring re-register slave message from slave 20140416-015639-1890854154-5050-1354-24152 at slave(1)@10.34.119.132:5051 (smf1-aep-35-sr1.prod.twitter.com) as readmission is already in progress
E0515 05:44:37.271734  7168 registrar.cpp:500] Registrar aborting: Failed to update 'registry': Failed to perform store within 5secs
F0515 05:44:37.271728  7170 master.cpp:2341] Failed to readmit slave 20140416-015639-1890854154-5050-1354-24133 at slave(1)@10.34.119.131:5051 (smf1-aep-31-sr4.prod.twitter.com): Failed to update 'registry': Failed to perform store within 5secs
*** Check failure stack trace: ***
F0515 05:44:37.272384 7168 owned.hpp:103] Check failed: data->t != NULL This owned pointer has already been shared
*** Check failure stack trace: ***
    @     0x7f687d06e2ad  google::LogMessage::Fail()
    @     0x7f687d06e2ad  google::LogMessage::Fail()
    @     0x7f687d0700f4  google::LogMessage::SendToLog()
    @     0x7f687d0700f4  google::LogMessage::SendToLog()
    @     0x7f687d06de9c  google::LogMessage::Flush()
    @     0x7f687d06de9c  google::LogMessage::Flush()
    @     0x7f687d0709e9  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f687d0709e9  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f687cc46182  process::Owned<>::get()
    @     0x7f687cbdaa41  mesos::internal::master::Master::_reregisterSlave()
    @     0x7f687cc46209  process::Owned<>::operator->()
    @     0x7f687cbe987a  _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal6master6MasterERKNS5_9SlaveInfoERKNS0_4UPIDERKSt6vectorINS5_12ExecutorInfoESaISG_EERKSF_INS6_4TaskESaISL_EERKSF_INS6_17Archive_FrameworkESaISQ_EERKNS0_6FutureIbEES9_SC_SI_SN_SS_SW_EEvRKNS0_3PIDIT_EEMS10_FvT0_T1_T2_T3_T4_T5_ET6_T7_T8_T9_T10_T11_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
    @     0x7f687cc39e05  mesos::internal::master::fail()
    @     0x7f687cfa3c72  process::ProcessManager::resume()
    @     0x7f687cc39f97  mesos::internal::master::RegistrarProcess::abort()
    @     0x7f687cc3d77f  mesos::internal::master::RegistrarProcess::_update()
    @     0x7f687cfa3f6c  process::schedule()
    @     0x7f687c47883d  start_thread
    @     0x7f687cc47b27  _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal6master16RegistrarProcessERKNS0_6FutureI6OptionINS6_5state8protobuf8VariableINS6_8RegistryEEEEEESt5dequeINS0_5OwnedINS7_9OperationEEESaISN_EESH_SP_EEvRKNS0_3PIDIT_EEMSR_FvT0_T1_ET2_T3_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
    @     0x7f687b1e026d  clone
{noformat}

[~jieyu] pointed out the following problematic code:

{code}
// Helper for failing a deque of operations.
void fail(deque<Owned<Operation> >* operations, const string& message)
{
  while (!operations->empty()) {
    const Owned<Operation>& operation = operations->front(); // This reference becomes invalid!
    operations->pop_front();

    operation->fail(message);
  }
}
{code}
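
The reference goes stale because pop_front() destroys the front element of the deque, so the later operation->fail(message) call reads through a dangling reference. One possible way to avoid this is to copy the owning pointer before popping. The following is a minimal sketch of that idea, not the committed fix: std::shared_ptr stands in for process::Owned so the example compiles without libprocess, and the Operation struct is a hypothetical stand-in that only records the message it was failed with.

```cpp
#include <deque>
#include <memory>
#include <string>

// Hypothetical stand-in for the registrar's Operation type; it only
// records the failure message it receives.
struct Operation
{
  void fail(const std::string& message) { failure = message; }
  std::string failure;
};

// Assumption: std::shared_ptr is used in place of process::Owned so the
// sketch builds without libprocess.
using OperationPtr = std::shared_ptr<Operation>;

// Helper for failing a deque of operations.
void fail(std::deque<OperationPtr>* operations, const std::string& message)
{
  while (!operations->empty()) {
    // Copy the pointer instead of binding a reference: pop_front()
    // destroys the front element, which would leave a reference dangling.
    OperationPtr operation = operations->front();
    operations->pop_front();

    operation->fail(message);
  }
}
```

Calling operation->fail(message) before operations->pop_front() would also keep the reference valid, at the cost of the operation still being in the deque while it is failed.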



--
This message was sent by Atlassian JIRA
(v6.2#6252)
