[ 
https://issues.apache.org/jira/browse/MESOS-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085645#comment-14085645
 ] 

Bernd Mathiske commented on MESOS-1630:
---------------------------------------

[[email protected]] Let's first make sure I understand what is asked here. My 
reading of this issue description is as follows. 

1. A framework runs to completion or is otherwise removed and is therefore 
regarded as "completed". 
2. None of its tasks are still running.
3. A completely unrelated framework starts up and is assigned the exact same 
framework ID.
4. It starts new tasks. 
5. When a slave re-registers, Master::reconcile asks the slave to shut those 
tasks down instead of letting them run.

The fix is to "forget" that a framework with the reused ID is "among the 
completed".
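
To make the scenario above concrete, here is a minimal sketch of the intended 
behavior. It is not the actual Master code: ToyMaster, removeFramework, 
registerFramework and wouldShutdown are hypothetical stand-ins, and framework 
IDs are assumed to be plain strings.

```cpp
#include <algorithm>
#include <deque>
#include <string>

// Hypothetical stand-in for the master's bookkeeping (not the real Mesos types).
struct ToyMaster {
  // IDs of frameworks regarded as "completed".
  std::deque<std::string> completedFrameworks;

  // Steps 1-2: the framework is removed and remembered as completed.
  void removeFramework(const std::string& id) {
    completedFrameworks.push_back(id);
  }

  // Proposed fix: when a framework (re-)registers with a reused ID,
  // forget any completed entry that carries the same ID.
  void registerFramework(const std::string& id) {
    completedFrameworks.erase(
        std::remove(completedFrameworks.begin(), completedFrameworks.end(), id),
        completedFrameworks.end());
  }

  // Mirrors the check described for reconciliation: returns true if the
  // slave would be told to shut down the framework with this ID.
  bool wouldShutdown(const std::string& id) const {
    return std::find(completedFrameworks.begin(),
                     completedFrameworks.end(),
                     id) != completedFrameworks.end();
  }
};
```

In this sketch, a framework whose ID was recorded as completed is no longer 
flagged for shutdown once registerFramework has been called with that ID.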

If all of that is correct and this is the right fix, then I should be able to 
implement it very quickly. Writing the test may take longer, though. Let me 
get back to you tomorrow on that.

> Remove framework from completedFrameworks if framework re-registers.
> --------------------------------------------------------------------
>
>                 Key: MESOS-1630
>                 URL: https://issues.apache.org/jira/browse/MESOS-1630
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.14.0, 0.14.1, 0.14.2, 0.17.0, 0.16.0, 0.15.0, 0.18.0, 
> 0.18.1, 0.18.2, 0.19.0, 0.19.1
>            Reporter: Benjamin Hindman
>            Assignee: Bernd Mathiske
>            Priority: Critical
>
> If a framework gets removed, for example, because it unregisters with the 
> master (i.e., due to MESOS-1550), and the same framework ID is then reused 
> when a framework re-registers (which we currently allow), we should remove 
> the framework from Master::completedFrameworks. Otherwise, when a slave 
> re-registers, Master::reconcile will notice that the slave is running tasks 
> from a completed framework and tell the slave to shut down that framework, 
> thus shutting down all of its tasks.
> This should be easily fixed by removing the framework from 
> completedFrameworks when a framework re-registers with the same ID as a 
> completed framework. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
