[ 
https://issues.apache.org/jira/browse/MESOS-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091299#comment-14091299
 ] 

Vinod Kone commented on MESOS-1630:
-----------------------------------

You bring up good points [~adam-mesos], especially scenario 1. Keeping the 
tasks around on slaves that were partitioned when the framework unregistered 
seems counterintuitive, although we would probably be in a similar situation 
if the master crashed in the middle of shutting down tasks/executors on slaves: 
a restarted/failed-over master has no idea about the framework removal.

Regarding the implementation, it's probably simplest and most intuitive to just 
forget any information in completedFrameworks when the framework re-registers. 
Mainly because, from the framework's point of view, it likely wants to register 
as a new framework (true if the framework unregisters and then registers; 
arguably true if it registers after a failover timeout) but reuse the ID.
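
To make the idea concrete, here is a rough sketch of "forget the completed 
entry on re-registration". It is not the actual Master code: MasterSketch, 
removeFramework, registerFramework and reconcileWouldShutdown are hypothetical 
names invented for illustration, and the state is reduced to plain framework 
ID strings.

```cpp
#include <algorithm>
#include <deque>
#include <set>
#include <string>

// Hypothetical, heavily simplified stand-in for the master's bookkeeping.
// The real master keeps much richer state; this only models framework IDs.
struct MasterSketch {
  std::set<std::string> registered;   // IDs of currently registered frameworks.
  std::deque<std::string> completed;  // IDs of removed ("completed") frameworks.

  // A framework unregisters (or is otherwise removed): remember it as completed.
  void removeFramework(const std::string& id) {
    if (registered.erase(id) > 0) {
      completed.push_back(id);
    }
  }

  // A framework registers, possibly reusing the ID of a completed framework.
  // Sketch of the proposed fix: drop any completed entry with the same ID.
  // `id` is taken by value so passing an element of `completed` stays safe.
  void registerFramework(std::string id) {
    completed.erase(
        std::remove(completed.begin(), completed.end(), id),
        completed.end());
    registered.insert(id);
  }

  // Models the check made when a re-registering slave reports tasks of
  // `frameworkId`: returns true if the slave would be told to shut the
  // framework down because the master considers it completed.
  bool reconcileWouldShutdown(const std::string& frameworkId) const {
    return std::find(completed.begin(), completed.end(), frameworkId) !=
           completed.end();
  }
};
```

With this sketch, registering "fw-1", removing it, and registering it again 
leaves nothing in completed, so a slave that re-registers afterwards would not 
be told to shut down the tasks of "fw-1".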

> Remove framework from completedFrameworks if framework re-registers.
> --------------------------------------------------------------------
>
>                 Key: MESOS-1630
>                 URL: https://issues.apache.org/jira/browse/MESOS-1630
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.14.0, 0.14.1, 0.14.2, 0.17.0, 0.16.0, 0.15.0, 0.18.0, 
> 0.18.1, 0.18.2, 0.19.0, 0.19.1
>            Reporter: Benjamin Hindman
>            Assignee: Bernd Mathiske
>            Priority: Critical
>
> If a framework gets removed, for example, because it unregisters with the 
> master (i.e., due to MESOS-1550), but then the same framework ID is reused 
> when a framework re-registers (which we currently allow), then we should 
> remove the framework from Master::frameworks.completed. Otherwise, when a 
> slave re-registers, Master::reconcile will notice that the slave is running 
> tasks from a "completed" framework and tell the slave to shut down that 
> framework, thus shutting down all of its tasks.
> This should be easily fixed by removing the framework from 
> frameworks.completed when a framework re-registers with the same ID as a 
> completed framework. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
