[jira] [Commented] (MESOS-3397) sorter.cpp: Check failed: total.resources.contains(slaveId)
[ https://issues.apache.org/jira/browse/MESOS-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15042402#comment-15042402 ]

Joris Van Remoortere commented on MESOS-3397:
-

As per a discussion on IRC and related to MESOS-4071, this is very likely caused by resource math deltas triggering different logical branches in the code.

> sorter.cpp: Check failed: total.resources.contains(slaveId)
> ---
>
> Key: MESOS-3397
> URL: https://issues.apache.org/jira/browse/MESOS-3397
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.24.0
> Reporter: Yan Xu
>
> Observed in production.
> {noformat:title=}
> F0908 23:21:10.635751 6884 sorter.cpp:213] Check failed: total.resources.contains(slaveId)
> *** Check failure stack trace: ***
> @ 0x7f772cdb10bd google::LogMessage::Fail()
> @ 0x7f772cdb2f04 google::LogMessage::SendToLog()
> @ 0x7f772cdb0cac google::LogMessage::Flush()
> @ 0x7f772cdb37f9 google::LogMessageFatal::~LogMessageFatal()
> @ 0x7f772c8162d0 mesos::internal::master::allocator::DRFSorter::remove()
> @ 0x7f772c6f61bc mesos::internal::master::allocator::HierarchicalAllocatorProcess<>::removeFramework()
> @ 0x7f772cd61f09 process::ProcessManager::resume()
> @ 0x7f772cd6220f process::internal::schedule()
> @ 0x7f772ce73610 execute_native_thread_routine
> @ 0x7f772bcb883d start_thread
> @ 0x7f772b4aafdd clone
> {noformat}
> This is following a framework removal:
> {noformat:title=}
> I0908 23:21:10.619640 6884 master.cpp:4261] Framework failover timeout, removing framework 20150813-182946-1685138442-5050-58479-0425 (Some Scheduler) at scheduler-3c50e28c-a0f4-4619-8ea0-b786744e6e54@x.y.z.a:33952
> {noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
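One way to read the "resource math deltas" remark in Joris Van Remoortere's comment is as floating-point rounding: two components track the same scalar resource, arrive at slightly different values, and then take different branches. That reading is an assumption, not something the thread confirms. The sketch below is a toy model only; `ToySorter`, `allocatorSideResidue` and `bookkeepingDisagrees` are hypothetical names and none of this is Mesos code. It shows how one bookkeeper can erase its entry for a slaveId at exactly zero while another still sees a tiny non-zero residue and therefore expects the entry to exist.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical toy model, not Mesos code: per-agent totals keyed by slaveId.
struct ToySorter {
  std::map<std::string, double> total;

  void add(const std::string& slaveId, double cpus) {
    assert(cpus >= 0.0);
    total[slaveId] += cpus;
  }

  // Erases the entry once the tracked amount reaches exactly zero.
  void subtract(const std::string& slaveId, double cpus) {
    auto it = total.find(slaveId);
    if (it == total.end()) return;
    it->second -= cpus;
    if (it->second == 0.0) total.erase(it);
  }

  bool contains(const std::string& slaveId) const {
    return total.count(slaveId) > 0;
  }
};

// A second bookkeeper adds the same amount in two pieces (0.1 + 0.2) and
// then subtracts 0.3. With IEEE 754 doubles the result is a small positive
// residue rather than zero.
double allocatorSideResidue() {
  double allocated = 0.1;
  allocated += 0.2;
  allocated -= 0.3;
  return allocated;
}

// True when the second bookkeeper believes resources are still held on "S1"
// while the sorter has already dropped its entry for "S1".
bool bookkeepingDisagrees() {
  ToySorter sorter;
  sorter.add("S1", 0.3);
  sorter.subtract("S1", 0.3);  // exactly zero, so the entry is erased

  bool allocatorStillHolds = allocatorSideResidue() != 0.0;
  return allocatorStillHolds && !sorter.contains("S1");
}
```

In this toy model, a later removal driven by the second bookkeeper would look up "S1" in the sorter and not find it, which is the shape of the failed `contains(slaveId)` check in the report. Whether the production crash followed this exact path is not established here.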
[jira] [Commented] (MESOS-3397) sorter.cpp: Check failed: total.resources.contains(slaveId)
[ https://issues.apache.org/jira/browse/MESOS-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15008156#comment-15008156 ]

Neil Conway commented on MESOS-3397:

This is the same assertion failure as MESOS-3744, which also seems related to MESOS-3719.
[jira] [Commented] (MESOS-3397) sorter.cpp: Check failed: total.resources.contains(slaveId)
[ https://issues.apache.org/jira/browse/MESOS-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14735890#comment-14735890 ]

Qian Zhang commented on MESOS-3397:
---

[~xujyan], are there any detailed steps to reproduce this bug?