[ https://issues.apache.org/jira/browse/MESOS-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15749508#comment-15749508 ]

Neil Conway commented on MESOS-6785:
------------------------------------

Notes:

* We can't prevent task IDs from being reused unless we do something drastic, 
like constraining how frameworks are allowed to pick task IDs or having the 
master assign task IDs.
* Hence, we must either tolerate multiple tasks with the same ID (on different 
agents), or terminate one of them. (Note that there might be many duplicates of 
a given task ID on different partitioned agents -- we'd want the master to 
eventually terminate all-but-one of them, assuming they all eventually 
re-register).
* Allowing multiple tasks with the same ID on different agents seems like a 
breaking semantic change -- frameworks probably use task IDs as unique 
identifiers, for good reason.
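For concreteness, the detection step would amount to something like the 
following sketch (illustrative only, not the actual master code -- the 
{{DuplicateCheck}} type and its members are made up here): when an agent 
re-registers, each task ID it reports is looked up in the master's per-framework 
task map, and a collision with a *different* agent marks a duplicate.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch: per-framework bookkeeping the master could use
// to spot a task ID that is already running on a different agent.
struct DuplicateCheck {
  // taskId -> agentId for tasks the master already knows about.
  std::map<std::string, std::string> knownTasks;

  // True iff `taskId` reported by `agentId` collides with a copy of
  // the same task on some other agent.
  bool isDuplicate(const std::string& taskId,
                   const std::string& agentId) const {
    auto it = knownTasks.find(taskId);
    return it != knownTasks.end() && it->second != agentId;
  }
};
```

Note the same-agent case is *not* a duplicate: the re-registering agent 
reporting a task the master already attributes to it is the normal path.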

Assuming we want to terminate all-but-one of the copies of the task:

* When an agent re-registers and we discover that it is running a task whose ID 
is being used by another task on a different agent, we want to kill one of the 
tasks (hard to guarantee we always kill the "oldest" or "newest" copy of the 
task, since the agents might re-register in arbitrary order). It is unclear how 
to signal this situation to the framework: if we report "task X has been 
killed", the framework won't be able to tell which instance of the task "X" 
refers to.
* The task we want to kill may have generated one or more status updates at the 
agent while it was partitioned. We don't want to propagate those status updates 
to the framework (to avoid confusing it).
* To deal with the status update problem, we could:
 ## Send a special "kill" signal to the agent (likely as part of the 
{{SlaveReregisteredMessage}}); this would notify the agent to terminate the 
task without generating any status updates for it, and to drop any pending 
status updates without waiting for ACKs.
*** In this scheme, the master would never add the duplicate task on the 
re-registering agent to its in-memory state; this avoids the {{CHECK}} failure.
*** Because the kill signal would be delivered as part of the re-registration 
message, I think we could be sure that the master wouldn't receive any status 
updates for the task in the meantime (but if it did, we could arrange for the 
master to drop them).
 ## Or, we could have the master ACK and drop the resulting status updates from 
the agent, without passing them along to the framework.
*** This might be challenging, because the master might "forget" (due to master 
failover) that the copy of the task on the agent is "bad" and should be 
terminated in this special manner. So it might be possible to have a situation 
in which _some_ of the status updates for a copy of the task are dropped; then 
the master fails over and after re-registration, a _different_ version of the 
task is picked to be killed, so we'd effectively have silently dropped some of 
the status updates from the "legitimate" copy of the task.
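The agent-side handling of the first option could look roughly like this 
sketch (field and function names are illustrative, not the actual 
{{SlaveReregisteredMessage}} schema): the master's reply carries a set of task 
IDs to terminate silently, and the agent partitions its running tasks 
accordingly, dropping pending un-ACKed updates for the killed ones.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch of option 1: the re-registration reply names the
// task IDs the agent must terminate *without* generating a terminal
// status update (and whose pending un-ACKed updates are dropped).
struct ReregisteredReply {
  std::set<std::string> tasksToKillSilently;
};

// Agent-side handling: split running tasks into those that keep
// running and those that are killed silently.
void handleReregistered(const ReregisteredReply& reply,
                        const std::vector<std::string>& runningTasks,
                        std::vector<std::string>* kept,
                        std::vector<std::string>* silentlyKilled) {
  for (const std::string& task : runningTasks) {
    if (reply.tasksToKillSilently.count(task) > 0) {
      silentlyKilled->push_back(task);  // terminate; drop pending updates
    } else {
      kept->push_back(task);
    }
  }
}
```

Because the decision rides on the re-registration reply itself, the master 
never has to track "bad" copies across failovers, which is what makes option 1 
more robust than option 2 above.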

Note that even if we fix the master crash, this situation is likely to be 
problematic for frameworks. For example, suppose the framework launches task X 
on agent A1, then task X on agent A2, then the framework itself fails. When it 
reconnects, it finds a single copy of X running -- it could be _either_ the X 
on A1 or A2. Without having the framework also remember the agent ID where it 
launched the task, the framework can't determine which "X" is currently 
running. (And if we require frameworks to identify tasks via the pair <task ID, 
agent ID>, we might as well just declare that task IDs are no longer globally 
unique and be done with it.)
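To make the framework-side ambiguity concrete, here is a toy sketch (the 
bookkeeping shapes are made up for illustration): keying framework state by 
task ID alone silently overwrites the first launch, while keying by the pair 
<task ID, agent ID> preserves both -- which is precisely conceding that task 
IDs are no longer globally unique.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

using TaskKey = std::pair<std::string, std::string>;  // <taskId, agentId>

// Returns {#entries keyed by taskId alone, #entries keyed by the pair}
// after launching "X" on A1 and then "X" on A2.
std::pair<std::size_t, std::size_t> trackTwoLaunches() {
  std::map<std::string, std::string> byTaskId;  // taskId -> agentId
  byTaskId["X"] = "A1";
  byTaskId["X"] = "A2";  // overwrites: the A1 launch is forgotten

  std::map<TaskKey, bool> byPair;  // <taskId, agentId> -> running
  byPair[{"X", "A1"}] = true;
  byPair[{"X", "A2"}] = true;      // both copies are tracked

  return {byTaskId.size(), byPair.size()};
}
```

After failover, a framework holding only the first map cannot say which "X" 
survived; the second map can, at the cost of giving up task-ID uniqueness.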

> CHECK failure on duplicate task IDs
> -----------------------------------
>
>                 Key: MESOS-6785
>                 URL: https://issues.apache.org/jira/browse/MESOS-6785
>             Project: Mesos
>          Issue Type: Bug
>          Components: master
>            Reporter: Neil Conway
>            Assignee: Neil Conway
>              Labels: mesosphere
>
> The master crashes with a CHECK failure in the following scenario:
> # Framework launches task X on agent A1. The framework may or may not be 
> partition-aware; let's assume it is not partition-aware.
> # A1 becomes partitioned from the master.
> # Framework launches task X on agent A2.
> # Master fails over.
> # Agents A1 and A2 both re-register with the master. Because the master has 
> failed over, the task on A1 is _not_ terminated ("non-strict registry 
> semantics").
> This results in two running tasks with the same ID, which causes a master 
> {{CHECK}} failure among other badness:
> {noformat}
> master.hpp:2299] Check failed: !tasks.contains(task->task_id()) Duplicate 
> task b88153a2-571a-41e7-9e9b-c297fef4f3cd of framework 
> eaef1879-8cc9-412f-928d-86c9925a7abb-0000
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)