[
https://issues.apache.org/jira/browse/MESOS-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16424306#comment-16424306
]
ASF GitHub Bot commented on MESOS-8750:
---------------------------------------
Github user m9a commented on the issue:
https://github.com/apache/mesos/pull/279
The JIRA for this PR: https://issues.apache.org/jira/browse/MESOS-8750
Since @xujyan is shepherding it, I intended to set him as the reviewer, but
it doesn't look like I can change those fields on the PR.
> Check failed: !slaves.registered.contains(task->slave_id)
> ---------------------------------------------------------
>
> Key: MESOS-8750
> URL: https://issues.apache.org/jira/browse/MESOS-8750
> Project: Mesos
> Issue Type: Task
> Components: master
> Reporter: Megha Sharma
> Assignee: Megha Sharma
> Priority: Major
>
> It appears that in certain circumstances an unreachable task is not cleaned
> up from framework.unreachableTasks when its agent re-registers, which leads
> to this check failure later, when the framework is being removed. When an
> agent goes unreachable, the master adds that agent's tasks to
> framework.unreachableTasks. When the agent re-registers, the master removes
> from this data structure only the tasks that the agent specifies during
> re-registration. There can be tasks that the agent doesn't know about, e.g.
> if the runTask message for them got dropped, and such tasks are never
> removed from unreachableTasks.
> F0112 21:50:39.272985 44038 master.cpp:9617] Check failed: !slaves.registered.contains(task->slave_id())
> Check failure stack trace: ***
> @ 0x7fb7260692bd (unknown)
> @ 0x7fb72606b04d (unknown)
> @ 0x7fb726068e42 (unknown)
> @ 0x7fb72606ba29 (unknown)
> @ 0x7fb7251f5226 (unknown)
> @ 0x7fb725120081 (unknown)
> @ 0x7fb72519ca37 (unknown)
> @ 0x7fb725fbb2fe (unknown)
> @ 0x7fb724f75de9 (unknown)
> @ 0x7fb725fb4fc2 (unknown)
> @ 0x7fb725fc4a17 (unknown)
> @ 0x7fb725fca276 (unknown)
> @ 0x7fb72352d470 (unknown)
> @ 0x7fb723784aa1 start_thread
> @ 0x7fb722f47bcd clone
> @ (nil) (unknown)
> Aborted
>
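The bookkeeping described in the quoted report can be illustrated with a
small, self-contained C++ sketch. Everything in it is hypothetical: ToyMaster,
markUnreachable, reregisterReportedOnly, reregisterDropAll and
removalCheckHolds are illustrative names, not the real Mesos master code, and
the "drop all" variant is only one possible remedy, not necessarily what the
PR does. The sketch assumes one flat map from task ID to agent ID in place of
framework.unreachableTasks.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for the master's bookkeeping.
// None of these names are the real Mesos definitions.
struct ToyMaster {
  // IDs of agents that are currently registered.
  std::set<std::string> registeredAgents;

  // Task ID -> ID of the agent the task was launched on.
  std::map<std::string, std::string> unreachableTasks;

  // The agent goes unreachable: every task the master knows about on that
  // agent is recorded as unreachable.
  void markUnreachable(const std::string& agent,
                       const std::vector<std::string>& tasksOnAgent) {
    registeredAgents.erase(agent);
    for (const std::string& task : tasksOnAgent) {
      unreachableTasks[task] = agent;
    }
  }

  // Re-registration as described in the report: only the tasks the agent
  // itself lists are removed from unreachableTasks.
  void reregisterReportedOnly(const std::string& agent,
                              const std::vector<std::string>& reportedTasks) {
    assert(registeredAgents.count(agent) == 0);
    registeredAgents.insert(agent);
    for (const std::string& task : reportedTasks) {
      unreachableTasks.erase(task);
    }
  }

  // One possible remedy (an assumption): on re-registration, drop every
  // unreachable entry that points at this agent, whether reported or not.
  void reregisterDropAll(const std::string& agent) {
    registeredAgents.insert(agent);
    for (auto it = unreachableTasks.begin(); it != unreachableTasks.end();) {
      if (it->second == agent) {
        it = unreachableTasks.erase(it);
      } else {
        ++it;
      }
    }
  }

  // Mirrors the failed check: no task still listed as unreachable may belong
  // to an agent that is registered.
  bool removalCheckHolds() const {
    for (const auto& entry : unreachableTasks) {
      if (registeredAgents.count(entry.second) > 0) {
        return false;
      }
    }
    return true;
  }
};
```

In this toy model, marking "agent-1" unreachable with tasks "task-a" and
"task-b" and then calling reregisterReportedOnly("agent-1", {"task-a"}) leaves
"task-b" behind, so removalCheckHolds() returns false. That corresponds to the
condition that aborts the master when the framework is later removed.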