asekretenko commented on a change in pull request #408:
URL: https://github.com/apache/mesos/pull/408#discussion_r706588131



##########
File path: src/slave/slave.cpp
##########
@@ -3191,7 +3191,17 @@ void Slave::__run(
     if (taskGroup.isNone() && task->has_command()) {
       // We are dealing with command task; a new command executor will be
       // launched.
-      CHECK(executor == nullptr);
+      // It is possible for an executor with this ID to already exist, if the
+      // TaskID was re-used - see MESOS-9657. If this happens, we have no
+      // choice but to drop the task.
+      if (executor != nullptr) {
+        sendTaskDroppedUpdate(
+            TaskStatus::REASON_TASK_INVALID,
+            "Master wants to launch executor, but one already exists "

Review comment:
       It looks like in some (or many) cases it is the framework that is 
responsible for creating this situation. Perhaps a message that does not 
attribute the error explicitly to the master would be better, e.g. "Cannot 
reuse an already existing executor for a command task"?
   
   Or do we have a similar check in the master (an unreliable one, as the 
master is not the source of truth about executors), so that this only happens 
when the master is not aware that an executor with this ID already exists?
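
   To illustrate the wording question, here is a minimal standalone sketch 
of the "drop instead of CHECK" behaviour with a message that blames neither 
the master nor the framework. `ExecutorIds` and `commandTaskDropMessage` are 
hypothetical stand-ins for this example, not actual Mesos code:

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for the agent's bookkeeping of known executor IDs.
using ExecutorIds = std::set<std::string>;

// Returns an empty string if a new command executor may be launched.
// Otherwise returns the message for the dropped-task status update.
std::string commandTaskDropMessage(
    const ExecutorIds& existing,
    const std::string& executorId)
{
  if (existing.count(executorId) > 0) {
    // Neutral wording: the ID may have been re-used by the framework,
    // so the message does not attribute the error to the master.
    return "Cannot reuse an already existing executor for a command task";
  }

  return "";
}
```

   In the real agent the message would presumably be passed to 
`sendTaskDroppedUpdate` together with `TaskStatus::REASON_TASK_INVALID`, as 
in the hunk above.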




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

