Bernd Mathiske created MESOS-2068:
-------------------------------------
Summary: Add comments that explain framework, executor ID, and
task life cycle in slave
Key: MESOS-2068
URL: https://issues.apache.org/jira/browse/MESOS-2068
Project: Mesos
Issue Type: Improvement
Components: slave
Reporter: Bernd Mathiske
Assignee: Bernd Mathiske
Priority: Minor
Fixing MESOS-947 was relatively difficult because the source code is practically
the only source of information about the life cycle of frameworks,
executors, and tasks in the slave. In particular, this leads to confusion about
whether a TASK_LOST state could occur at the beginning of _runTask() when
the framework is NULL. This shall be explained to the best of the assignee's
knowledge.
For context see https://reviews.apache.org/r/27567
with these comments:
On Nov. 5, 2014, 7:50 p.m., Ben Mahler wrote:
src/slave/slave.cpp, lines 1195-1200
<https://reviews.apache.org/r/27567/diff/1/?file=748326#file748326line1195>
A comment here as to why we don't need to send TASK_LOST would be much
appreciated! It's not obvious so someone might come along and add a TASK_LOST
to make sure we're not dropping the task on the floor, so context here would be
great!
Bernd Mathiske wrote:
Hah, thanks for sharing - I am not alone! :-) None of this was obvious to me
either, because there is no comment explaining the general life cycle of
anything. Once you understand the intended life cycle, there is no way there
can be a TASK_LOST situation here, though. Therefore I propose adding comments
describing the overall picture regarding frameworks, executor IDs and task
creation in the appropriate places, instead. I'll file a ticket if you agree.
Once you understand the intended life cycle, there is no way there can be a
TASK_LOST situation here, though.
Phew! :)
Could you distill your learnings into a comment here, and maybe make the log
message more informative? Even with an overall description as you mentioned,
dummies like me would still get confused here given the lack of _local_
context. ;)
- Ben