We recently upgraded our Mesos cluster from version 1.3 to 1.5, and since
then the masters have been crashing periodically with this error:

Feb  5 15:53:57 ip-10-0-16-140 mesos-master[8414]: F0205 15:53:57.385118 8434 hierarchical.cpp:2630] Check failed: reservationScalarQuantities.contains(role)

Full stack trace is at the end of this email. When the master fails, we
automatically restart it and it rejoins the cluster just fine. I did some
initial searching and was unable to find any existing bug reports or other
people experiencing this issue. We run a cluster of 3 masters, and see
crashes on all 3 instances.

Hope to get some guidance on what is going on and/or where to start looking
for more information.

Feb  5 15:53:57 ip-10-0-16-140 mesos-master[8414]:     @ 0x7f87e9170a7d  google::LogMessage::Fail()
Feb  5 15:53:57 ip-10-0-16-140 mesos-master[8414]:     @ 0x7f87e9172830  google::LogMessage::SendToLog()
Feb  5 15:53:57 ip-10-0-16-140 mesos-master[8414]:     @ 0x7f87e9170663  google::LogMessage::Flush()
Feb  5 15:53:57 ip-10-0-16-140 mesos-master[8414]:     @ 0x7f87e9173259  google::LogMessageFatal::~LogMessageFatal()
Feb  5 15:53:57 ip-10-0-16-140 mesos-master[8414]:     @ 0x7f87e8443cbd  mesos::internal::master::allocator::internal::HierarchicalAllocatorProcess::untrackReservations()
Feb  5 15:53:57 ip-10-0-16-140 mesos-master[8414]:     @ 0x7f87e8448fcd  mesos::internal::master::allocator::internal::HierarchicalAllocatorProcess::removeSlave()
Feb  5 15:53:57 ip-10-0-16-140 mesos-master[8414]:     @ 0x7f87e90c4f11  process::ProcessBase::consume()
Feb  5 15:53:57 ip-10-0-16-140 mesos-master[8414]:     @ 0x7f87e90dea4a  process::ProcessManager::resume()
Feb  5 15:53:57 ip-10-0-16-140 mesos-master[8414]:     @ 0x7f87e90e25d6  _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUlvE_vEEE6_M_runEv
Feb  5 15:53:57 ip-10-0-16-140 mesos-master[8414]:     @ 0x7f87e6700c80  (unknown)
Feb  5 15:53:57 ip-10-0-16-140 mesos-master[8414]:     @ 0x7f87e5f136ba  start_thread
Feb  5 15:53:57 ip-10-0-16-140 mesos-master[8414]:     @ 0x7f87e5c4941d  (unknown)
