[ 
https://issues.apache.org/jira/browse/MESOS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644376#comment-14644376
 ] 

Klaus Ma commented on MESOS-3070:
---------------------------------

There're some options in my lists:
    1. add framework/tasks info into registry; but it'll increase the size of 
registery
    2. kill/cancel one of the two tasks

Currently, i'm following #2 to kill/cancel one of the two tasks. We can not 
just replace CHECK with error log; because the old task's running in slave, 
which will update status.

completedTasks in Master has similar issue, we need to handle together in this 
ticket.

If any comments or suggestion, please let me know.

> Master CHECK failure if a framework uses duplicated task id.
> ------------------------------------------------------------
>
>                 Key: MESOS-3070
>                 URL: https://issues.apache.org/jira/browse/MESOS-3070
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.22.1
>            Reporter: Jie Yu
>            Assignee: Klaus Ma
>
> We observed this in one of our testing cluster.
> One framework (under development) keeps launching tasks using the same 
> task_id. We don't expect the master to crash even if the framework is not 
> doing what it's supposed to do. However, under a series of events, this could 
> happen and keeps crashing the master.
> 1) frameworkA launches task 'task_id_1' on slaveA
> 2) master fails over
> 3) slaveA has not re-registered yet
> 4) frameworkA re-registered and launches task 'task_id_1' on slaveB
> 5) slaveA re-registering and add task "task_id_1' to frameworkA
> 6) CHECK failure in addTask
> {noformat}
> I0716 21:52:50.759305 28805 master.hpp:159] Adding task 'task_id_1' with 
> resources cpus(*):4; mem(*):32768 on slave 
> 20150417-232509-1735470090-5050-48870-S25 (hostname)
> ...
> ...
> F0716 21:52:50.760136 28805 master.hpp:362] Check failed: 
> !tasks.contains(task->task_id()) Duplicate task 'task_id_1' of framework 
> <framework_id>
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to