[jira] [Comment Edited] (MESOS-7874) Convert slaveRunTaskLabelDecorator and masterRunTaskLabelDecorator to non-blocking API

Zhitao Li (JIRA) Tue, 15 Aug 2017 13:28:44 -0700

    [ 
https://issues.apache.org/jira/browse/MESOS-7874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16127771#comment-16127771
 ]


Zhitao Li edited comment on MESOS-7874 at 8/15/17 8:27 PM:
-----------------------------------------------------------

About implementation:

The changes to hook.hpp and hook/manager.hpp(cpp) should be relatively 
straightforward.

For changes to {{Master}} class, I took a quick look and there seemed to be two 
different paths:

* Perform the non-blocking hook before `Master::_accept`
    **  Pro:
        *** Can be done in parallel with authorization (the other non-blocking 
step involved);
        *** Simpler handling for sending messages to slave: because everything 
will be ready by the time we reach {{Master::_accept}}, we can still send the 
corresponding messages to slave (`CheckpointMessage` for RESERVE/UNRESERVE/..., 
`RunTask` or `RunTaskGroup` for running task/taskgroup, etc).
    ** Con:
        *** Task validation and authorization have not been performed yet, so 
hooks could see tasks which never get launched
            **** technically this can already happen if the agent 
disconnects/goes down, or the `send(slave->pid, message);` gets dropped. 
Frameworks are reliably told the task status, but hooks are not notified of it.
    ** More thoughts:
        *** Maybe we should consider creating a private helper struct on the 
Master class to mutate `OfferOperation` (adding task labels is only one such 
mutation), to facilitate further changes?
* Perform the hook inside `Master::_accept`
    ** Pro:
        *** We already know there is a pending task launch, so this part needs 
less code;
    ** Con:
        *** To preserve the ordering of messages, we would need to change 
`void Master::_apply(...)` so that it returns a `Future<any message>`, cache 
that future, and only send out all messages once everything is ready.
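
To make the ordering concern in the second path concrete, here is a minimal 
sketch of "cache the result per operation and only send once everything is 
ready". It is not Mesos code: `OrderedOutbox`, `reserve`, `complete`, `flush` 
and `demo` are hypothetical names, plain slots stand in for libprocess futures 
so the example stays self-contained, and the message names are just placeholder 
strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical buffer: one slot per accepted operation, in acceptance order.
class OrderedOutbox
{
public:
  // Reserves a slot for an operation and returns its index.
  std::size_t reserve()
  {
    slots.push_back(Slot{false, ""});
    return slots.size() - 1;
  }

  // Called when the (possibly asynchronous) work for a slot has finished.
  void complete(std::size_t index, const std::string& message)
  {
    slots.at(index).ready = true;
    slots.at(index).message = message;
  }

  // Sends nothing while any slot is still pending. Once every slot is
  // ready, returns all messages in reservation order and clears the buffer.
  std::vector<std::string> flush()
  {
    std::vector<std::string> sent;
    for (const Slot& slot : slots) {
      if (!slot.ready) {
        return sent; // Still waiting on at least one operation.
      }
    }
    for (const Slot& slot : slots) {
      sent.push_back(slot.message);
    }
    slots.clear();
    return sent;
  }

private:
  struct Slot
  {
    bool ready;
    std::string message;
  };

  std::vector<Slot> slots;
};

// The first operation finishes last, yet its message is still sent first.
std::vector<std::string> demo()
{
  OrderedOutbox outbox;
  std::size_t launch = outbox.reserve();      // e.g. a task launch
  std::size_t checkpoint = outbox.reserve();  // e.g. a RESERVE operation

  outbox.complete(checkpoint, "CheckpointMessage");
  assert(outbox.flush().empty()); // The launch is pending, so nothing is sent.

  outbox.complete(launch, "RunTask");
  return outbox.flush();
}
```

In this sketch one slow hook delays every message for the same accept call, 
which is the cost of keeping the original order.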

I'm inclined to go with the first path, but discussion with people more 
familiar with the large master code base is definitely welcome.

Thanks!


> Convert slaveRunTaskLabelDecorator and masterRunTaskLabelDecorator to 
> non-blocking API
> --------------------------------------------------------------------------------------
>
>                 Key: MESOS-7874
>                 URL: https://issues.apache.org/jira/browse/MESOS-7874
>             Project: Mesos
>          Issue Type: Improvement
>          Components: modules
>            Reporter: Zhitao Li
>            Assignee: Zhitao Li
>              Labels: hooks, module
>
> Our use case: we need a non-blocking way to notify our secret management 
> system during the task launch sequence on the agent. This mechanism needs to 
> work for both {{DockerContainerizer}} and {{MesosContainerizer}}, and both 
> {{custom executor}} and {{command executor}}, with proper access to labels on 
> {{TaskInfo}}.
> As of 1.3.0, the hooks in [hook.hpp | 
> https://github.com/apache/mesos/blob/1.3.0/include/mesos/hook.hpp] are pretty 
> inconsistent across these combinations.
> The closest option is {{slavePreLaunchDockerTaskExecutorDecorator}}, 
> however it has a couple of problems:
> 1. For DockerContainerizer + custom executor, it strips away TaskInfo and 
> sends a `None()` instead;
> 2. This hook is not called on {{MesosContainerizer}} at all. I guess it's 
> because people can implement an {{isolator}}? However, it creates extra work 
> for module authors and operators.
> The other option is {{slaveRunTaskLabelDecorator}}, but it has its own 
> problems:
> 1. Errors are silently swallowed, so a module cannot stop the task launch 
> sequence;
> 2. It's a blocking API, which means we cannot wait for the result of another 
> subprocess or RPC.
> I'm inclined to fix the two problems on 
> {{slavePreLaunchDockerTaskExecutorDecorator}}, but open to other suggestions.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
