Thanks for the clear explanation, Swapnil. This sounds awesome. A much cleaner design, and it seems like exactly the kind of thing YARN AuxServices were created for.
On Thu, Jul 16, 2015 at 5:24 PM, Swapnil Daingade <[email protected]> wrote:
> Hi All,
>
> Currently with Fine Grained Scheduling (FGS), the workflow for reporting
> status and relinquishing the resources used by a YARN container is as
> follows:
>
> 1. The NodeManager reports the status/completion of the container to the
> ResourceManager as part of the container statuses included in the NM-to-RM
> heartbeat.
>
> 2. This container status is intercepted by the Myriad Scheduler. The
> Scheduler sends a frameworkMessage to the MyriadExecutor running on the
> NodeManager's node. See NMHeartBeatHandler.handleStatusUpdate here:
> https://github.com/mesos/myriad/blob/issue_14/myriad-scheduler/src/main/java/com/ebay/myriad/scheduler/NMHeartBeatHandler.java#L112
>
> 3. This frameworkMessage instructs the MyriadExecutor to report the task
> state corresponding to the YARN container status back to Mesos.
> See MyriadExecutor.frameworkMessage here:
> https://github.com/mesos/myriad/blob/issue_14/myriad-executor/src/main/java/com/ebay/myriad/executor/MyriadExecutor.java#L252
>
> There are some disadvantages to this approach:
>
> 1. In step 2 we use the SchedulerDriver.sendFrameworkMessage() API.
> According to the API documentation, this message is best-effort:
> /**
>  * Sends a message from the framework to one of its executors. These
>  * messages are best effort; do not expect a framework message to be
>  * retransmitted in any reliable fashion.
>
> 2. This requires the Scheduler/RM to be up for YARN containers/Mesos tasks
> to be able to report statuses to the Mesos Master. If the Scheduler/RM goes
> down, we will not be able to send task statuses to Mesos until the
> Scheduler/RM is back up. This can lead to resource leaks.
>
> 3. There is the additional overhead of sending messages from the
> Scheduler/RM back to the Executors for each container on each heartbeat:
> (number of YARN containers per node * number of nodes) additional messages.
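The per-heartbeat relay described above could be sketched roughly as follows. This is a simplified, self-contained stand-in, not the actual Myriad code: the `FrameworkMessenger` interface and `relayFinishedContainers` method are invented names standing in for `SchedulerDriver.sendFrameworkMessage()` and the logic in `NMHeartBeatHandler.handleStatusUpdate`. It illustrates why one scheduler-to-executor message is generated per finished container per heartbeat.

```java
import java.util.ArrayList;
import java.util.List;

public class HeartbeatRelaySketch {

    // Stand-in for SchedulerDriver.sendFrameworkMessage(): best effort,
    // so delivery of the payload is not guaranteed.
    interface FrameworkMessenger {
        void sendFrameworkMessage(String executorHost, String payload);
    }

    // On each NM->RM heartbeat, the scheduler relays one message per
    // finished container back to the executor on that node.
    static int relayFinishedContainers(String host,
                                       List<String> finishedContainerIds,
                                       FrameworkMessenger messenger) {
        for (String containerId : finishedContainerIds) {
            messenger.sendFrameworkMessage(host, "TASK_FINISHED:" + containerId);
        }
        return finishedContainerIds.size();
    }

    public static void main(String[] args) {
        List<String> sent = new ArrayList<>();
        FrameworkMessenger messenger =
            (host, payload) -> sent.add(host + "/" + payload);

        // Two nodes, each reporting two finished containers in one heartbeat
        // round: 2 containers/node * 2 nodes = 4 extra messages.
        int total = 0;
        total += relayFinishedContainers("node1", List.of("c1", "c2"), messenger);
        total += relayFinishedContainers("node2", List.of("c3", "c4"), messenger);
        System.out.println(total);       // 4
        System.out.println(sent.get(0)); // node1/TASK_FINISHED:c1
    }
}
```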
>
> In order to avoid the above-mentioned issues, we are proposing merging the
> MyriadExecutor and the NodeManager. The MyriadExecutor will run as an NM
> auxiliary service (in the same process as the NM). It will be able to
> intercept YARN container completion locally and inform the Mesos master
> irrespective of whether the Scheduler is running. We will no longer have to
> use the sendFrameworkMessage method, and there will be less message traffic
> from the Scheduler to the Executors.
>
> I have posted my proposed changes as part of the pull request here:
> https://github.com/mesos/myriad/pull/118
>
> Please take a look and let me know your feedback.
>
> Regards,
> Swapnil
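The proposed design could be sketched as below. This is a minimal, self-contained illustration under stated assumptions, not the code from the pull request: `MyriadAuxService` and `StatusSender` are simplified stand-ins for YARN's `AuxiliaryService` (whose `stopContainer` hook the NM invokes in-process when a container terminates) and for the Mesos executor driver's status-update call. The point it shows is that completion is observed locally and reported straight to Mesos, with no Scheduler/RM round trip.

```java
import java.util.ArrayList;
import java.util.List;

public class AuxServiceSketch {

    // Stand-in for the Mesos executor driver's status-update call.
    interface StatusSender {
        void sendStatusUpdate(String taskId, String state);
    }

    // Stand-in for an NM auxiliary service running in the NM process.
    static class MyriadAuxService {
        private final StatusSender sender;

        MyriadAuxService(StatusSender sender) {
            this.sender = sender;
        }

        // Invoked locally by the NM on container completion; it does not
        // depend on the Scheduler/RM being up.
        void stopContainer(String containerId, int exitCode) {
            String state = (exitCode == 0) ? "TASK_FINISHED" : "TASK_FAILED";
            sender.sendStatusUpdate(containerId, state);
        }
    }

    public static void main(String[] args) {
        List<String> updates = new ArrayList<>();
        MyriadAuxService svc =
            new MyriadAuxService((taskId, state) -> updates.add(taskId + "=" + state));

        // Container completion is reported directly, in-process.
        svc.stopContainer("container_01", 0);
        svc.stopContainer("container_02", 137);
        System.out.println(updates);
        // [container_01=TASK_FINISHED, container_02=TASK_FAILED]
    }
}
```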
