[jira] [Commented] (MESOS-3397) sorter.cpp: Check failed: total.resources.contains(slaveId)

2015-12-04 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15042402#comment-15042402
 ] 

Joris Van Remoortere commented on MESOS-3397:
-

As per a discussion on IRC and related to MESOS-4071, this is very likely 
caused by resource math deltas triggering different logical branches in the 
code.
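
To make the hypothesis above concrete, here is a minimal C++ sketch. It is not Mesos code: `Totals`, `subtractExact`, `subtractTolerant`, and `pathsDisagree` are hypothetical names introduced only for illustration, and the sketch assumes IEEE 754 `double` arithmetic. It shows how two bookkeeping paths that apply the same logical additions and subtractions in a different order can disagree about whether a per-slave entry still exists, which is the kind of divergence a `contains(slaveId)` check would trip over.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>

// Hypothetical stand-in for per-slave resource totals (NOT the Mesos sorter).
struct Totals {
  std::map<std::string, double> cpus;  // keyed by slave ID

  void add(const std::string& slaveId, double amount) {
    assert(amount >= 0.0);
    cpus[slaveId] += amount;
  }

  // Naive variant: drops the entry only when the remainder is exactly zero.
  void subtractExact(const std::string& slaveId, double amount) {
    auto it = cpus.find(slaveId);
    if (it == cpus.end()) return;
    it->second -= amount;
    if (it->second == 0.0) cpus.erase(it);
  }

  // Tolerant variant: treats a tiny residue as zero (epsilon is an assumption).
  void subtractTolerant(const std::string& slaveId, double amount,
                        double epsilon = 1e-9) {
    auto it = cpus.find(slaveId);
    if (it == cpus.end()) return;
    it->second -= amount;
    if (std::fabs(it->second) < epsilon) cpus.erase(it);
  }

  bool contains(const std::string& slaveId) const {
    return cpus.count(slaveId) > 0;
  }
};

// Applies the same logical operations in two different orders and reports
// whether the two paths disagree about the presence of the entry.
bool pathsDisagree() {
  Totals a;  // (0.1 + 0.2) - 0.2 - 0.1 leaves a tiny non-zero residue
  a.add("S1", 0.1);
  a.add("S1", 0.2);
  a.subtractExact("S1", 0.2);
  a.subtractExact("S1", 0.1);

  Totals b;  // (0.1 - 0.1), then (0.2 - 0.2): both cancel exactly
  b.add("S1", 0.1);
  b.subtractExact("S1", 0.1);
  b.add("S1", 0.2);
  b.subtractExact("S1", 0.2);

  return a.contains("S1") != b.contains("S1");
}
```

With exact comparison, path `a` keeps the entry while path `b` drops it, so `pathsDisagree()` returns true. Switching path `a` to `subtractTolerant` makes both paths drop the entry. Whether a tolerance, fixed-point math, or something else is the right fix for this issue is not settled by this sketch.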

> sorter.cpp: Check failed: total.resources.contains(slaveId)
> ---
>
> Key: MESOS-3397
> URL: https://issues.apache.org/jira/browse/MESOS-3397
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.24.0
> Reporter: Yan Xu
>
> Observed in production.
> {noformat:title=}
> F0908 23:21:10.635751  6884 sorter.cpp:213] Check failed: total.resources.contains(slaveId)
> *** Check failure stack trace: ***
> @ 0x7f772cdb10bd  google::LogMessage::Fail()
> @ 0x7f772cdb2f04  google::LogMessage::SendToLog()
> @ 0x7f772cdb0cac  google::LogMessage::Flush()
> @ 0x7f772cdb37f9  google::LogMessageFatal::~LogMessageFatal()
> @ 0x7f772c8162d0  mesos::internal::master::allocator::DRFSorter::remove()
> @ 0x7f772c6f61bc  mesos::internal::master::allocator::HierarchicalAllocatorProcess<>::removeFramework()
> @ 0x7f772cd61f09  process::ProcessManager::resume()
> @ 0x7f772cd6220f  process::internal::schedule()
> @ 0x7f772ce73610  execute_native_thread_routine
> @ 0x7f772bcb883d  start_thread
> @ 0x7f772b4aafdd  clone
> {noformat}
> This is following a framework removal:
> {noformat:title=}
> I0908 23:21:10.619640  6884 master.cpp:4261] Framework failover timeout, removing framework 20150813-182946-1685138442-5050-58479-0425 (Some Scheduler) at scheduler-3c50e28c-a0f4-4619-8ea0-b786744e6e54@x.y.z.a:33952
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3397) sorter.cpp: Check failed: total.resources.contains(slaveId)

2015-11-16 Thread Neil Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15008156#comment-15008156
 ] 

Neil Conway commented on MESOS-3397:


This is the same assertion failure as MESOS-3744, which also seems related to 
MESOS-3719.



[jira] [Commented] (MESOS-3397) sorter.cpp: Check failed: total.resources.contains(slaveId)

2015-09-08 Thread Qian Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14735890#comment-14735890
 ] 

Qian Zhang commented on MESOS-3397:
---

[~xujyan], are there detailed steps to reproduce this bug?
