The last comment (the one you cite) comes from a person whose questions
and answers you'd better verify :). Having said that, let me try to answer
your initial question.

The master does not always know the SlaveID for each task. Imagine a master
failover. The registry contains just a list of the slaves that were
connected before the previous master crashed. The mapping TaskID -> SlaveID
is restored
during slave re-registration. If a framework specifies the SlaveID of the
task it wants to reconcile or kill, the master can check if the
corresponding slave has already reregistered and if so, execute the request
immediately. If the SlaveID is unknown, the master cannot really execute
the request until all slaves reregister.
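
To make this concrete, here is a toy Python model of the situation. It is
only a sketch of the reasoning above, not Mesos code: the class ToyMaster
and its methods are made up for illustration, and the shortcut for tasks
whose slave has already re-registered is my own assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ToyMaster:
    """Hypothetical model of a master right after failover."""

    # Slaves listed in the registry; this is all the new master knows at start.
    recovered_slaves: Set[str]
    # Slaves that have re-registered with the new master so far.
    reregistered_slaves: Set[str] = field(default_factory=set)
    # TaskID -> SlaveID, rebuilt only as slaves re-register.
    task_to_slave: Dict[str, str] = field(default_factory=dict)

    def reregister(self, slave_id: str, task_ids: List[str]) -> None:
        """A slave comes back and reports the tasks it is running."""
        self.reregistered_slaves.add(slave_id)
        for task_id in task_ids:
            self.task_to_slave[task_id] = slave_id

    def can_execute_now(self, task_id: str, slave_id: Optional[str] = None) -> bool:
        """Can a kill/reconcile request for task_id be handled immediately?"""
        if slave_id is not None:
            # The framework told us where the task lives, so it is enough
            # to check that this one slave has re-registered.
            return slave_id in self.reregistered_slaves
        if task_id in self.task_to_slave:
            # Assumption: the mapping for this task is already restored.
            return True
        # No SlaveID and no mapping yet: the task could live on any slave
        # that has not come back, so wait until all of them have.
        return self.recovered_slaves <= self.reregistered_slaves


master = ToyMaster(recovered_slaves={"s1", "s2"})
master.reregister("s1", ["t1"])
print(master.can_execute_now("t1", "s1"))  # slave s1 is back
print(master.can_execute_now("t2"))        # s2 is still missing
```

With the SlaveID the master only has to look at one slave; without it, a
request for a not-yet-seen task stays pending until s2 re-registers too.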

Regarding the Reconcile message: check commit
f95fa119044c9a11c8473ab088e948e7e1c1334d. It looks like we should
update the reconciliation doc [1]. Jan Schlicht, is this something you have
cycles for?

[1] https://mesos.apache.org/documentation/latest/reconciliation/

On Tue, Sep 15, 2015 at 4:35 PM, Qian AZ Zhang <[email protected]> wrote:

> Thanks Alex. I checked the comments in MESOS-1127, and based on the
> last comment (see below), it seems the question is still open ...
> > For me it looks like we can deduce SlaveID from TaskID modulo we have
> > to wait for transitionary slaves. If this is the case, providing just
> > TaskID in Kill and Reconcile requests simplifies framework design and
> > allows us to get rid of validating requests in master against {{SlaveID}}
> > mismatch. Does this make sense?
>
>
> BTW, I checked the "Reconcile" message (see below) in scheduler.proto and
> found that its comment mentions a "statuses" field. However, I do not
> see such a field in the "Reconcile" message, so I think the comment is
> incorrect and should refer to the "tasks" field instead?
>   // Allows the scheduler to query the status for non-terminal tasks.
>   // This causes the master to send back the latest task status for
>   // each task in 'tasks', if possible. Tasks that are no longer known
>   // will result in a TASK_LOST update. *If 'statuses' is empty*, then
>   // the master will send the latest status for each task currently
>   // known.
>   message Reconcile {
>     // TODO(vinod): Support arbitrary queries than just state of tasks.
>     message Task {
>       required TaskID task_id = 1;
>       optional AgentID agent_id = 2;
>     }
>
>     repeated Task tasks = 1;
>   }
>
>
> Regards,
> Qian Zhang
>
>
> From: Alex Rukletsov <[email protected]>
> To: dev <[email protected]>
> Date: 09/15/2015 20:52
> Subject: Re: Why do we need slave_id in Kill message
> ------------------------------
>
>
>
> I asked the same question some time ago and got a good explanation from Ben
> Mahler. Take a look at last comments in MESOS-1127
> <https://issues.apache.org/jira/browse/MESOS-1127> and maybe even comments
> in review requests.
>
> Since the same question has come up (at least) for the second time, maybe
> it makes sense to persist the answer somewhere (a comment in the protobuf).
>
> On Tue, Sep 15, 2015 at 11:55 AM, Klaus Ma <[email protected]> wrote:
>
> > I think this slave_id is used for status sync up/double check. The master
> > checks whether the specified slave_id is equal to the task's slave id; if
> > not equal, the master logs a message and ignores the kill request.
> >
> >
> > On 2015-09-15 17:46, Qian AZ Zhang wrote:
> >
> >> Hi,
> >>
> >> In Kill message (scheduler.proto), I found there is a slave_id field:
> >>    message Kill {
> >>      required TaskID task_id = 1;
> >>      optional SlaveID slave_id = 2;
> >>    }
> >>
> >> I am just wondering in which cases a framework needs to specify this
> >> field when it kills a task. I think the master should know the slave id
> >> of each task, so can we just use the info in the master?
> >>
> >>
> >> Regards,
> >> Qian Zhang
> >>
> >
> > --
> > Klaus Ma (马达), PMP® | http://www.cguru.net
> >
> >
>
>
