[ https://issues.apache.org/jira/browse/MESOS-152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-152:
----------------------------------

    Fix Version/s:     (was: 0.10.0)
                   0.12.0
    
> Slave should forward status updates for unknown tasks
> -----------------------------------------------------
>
>                 Key: MESOS-152
>                 URL: https://issues.apache.org/jira/browse/MESOS-152
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Bill Farner
>            Assignee: Vinod Kone
>             Fix For: 0.12.0
>
>
> The slave swallows status updates for tasks that it does not recognize. Due
> to the way we handle tasks and history in the Twitter framework, it would be
> ideal if these messages were passed along.
> Relevant code in slave.cpp:
>     Executor* executor = framework->getExecutor(status.task_id());
>     if (executor != NULL) {
>       executor->updateTaskState(status.task_id(), status.state());
>       // Handle the task appropriately if it's terminated.
>       if (status.state() == TASK_FINISHED ||
>           status.state() == TASK_FAILED ||
>           status.state() == TASK_KILLED ||
>           status.state() == TASK_LOST) {
>         executor->removeTask(status.task_id());
>         dispatch(isolationModule,
>                  &IsolationModule::resourcesChanged,
>                  framework->id, executor->id, executor->resources);
>       }
>       // Send message and record the status for possible resending.
>       StatusUpdateMessage message;
>       message.mutable_update()->MergeFrom(update);
>       message.set_pid(self());
>       send(master, message);
>       UUID uuid = UUID::fromBytes(update.uuid());
>       // Send us a message to try and resend after some delay.
>       delay(STATUS_UPDATE_RETRY_INTERVAL_SECONDS,
>             self(), &Slave::statusUpdateTimeout,
>             framework->id, uuid);
>       framework->updates[uuid] = update;
>       stats.tasks[status.state()]++;
>       stats.validStatusUpdates++;
>     } else {
>       LOG(WARNING) << "Status update error: couldn't lookup "
>                    << "executor for framework " << update.framework_id();
>       stats.invalidStatusUpdates++;
>     }
> Ideally, this code would behave more like:
>   Look up executor
>   if executor exists:
>     Update executor state
>   else:
>     Log warning
>   send message
> Of course, this is still in a scope where the framework is known.
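
The proposed control flow above can be sketched in C++ roughly as follows. This is only an illustration under simplifying assumptions, not the actual slave.cpp change: Slave, Executor, StatusUpdate, isTerminal and the forwarded vector are hypothetical stand-ins (forwarded takes the place of send(master, message)), and the retry timer and stats bookkeeping from the original snippet are left out.

```cpp
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the real Mesos types.
enum TaskState { TASK_RUNNING, TASK_FINISHED, TASK_FAILED, TASK_KILLED, TASK_LOST };

struct StatusUpdate {
  std::string taskId;
  TaskState state;
};

struct Executor {
  std::map<std::string, TaskState> tasks;  // task id -> last known state
};

// Terminal states are the four checked in the original snippet.
inline bool isTerminal(TaskState state) {
  return state == TASK_FINISHED || state == TASK_FAILED ||
         state == TASK_KILLED || state == TASK_LOST;
}

struct Slave {
  std::vector<Executor> executors;
  std::vector<StatusUpdate> forwarded;  // assumption: stands in for send(master, message)

  // Returns the executor that owns the task, or nullptr if no executor knows it.
  Executor* getExecutor(const std::string& taskId) {
    for (Executor& executor : executors) {
      if (executor.tasks.count(taskId) > 0) {
        return &executor;
      }
    }
    return nullptr;
  }

  void statusUpdate(const StatusUpdate& update) {
    Executor* executor = getExecutor(update.taskId);
    if (executor != nullptr) {
      // Known task: record the new state, and drop the task if it terminated.
      executor->tasks[update.taskId] = update.state;
      if (isTerminal(update.state)) {
        executor->tasks.erase(update.taskId);
      }
    } else {
      // Unknown task: warn, but do not swallow the update.
      std::cerr << "Status update for unknown task " << update.taskId << std::endl;
    }
    // Forward the update in both cases.
    forwarded.push_back(update);
  }
};
```

With this shape, a TASK_LOST update for a task that no executor recognizes still ends up in forwarded, which is the behavior the issue asks for.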
