Should work with 2.5 as before (I tested on a single-node MapR cluster with hadoop-common 2.5.1).
The only change I made was to copy the Myriad executor jars from
$PROJECT_HOME/myriad-executor/build/libs/ to
$YARN_HOME/share/hadoop/yarn/lib, as the MyriadExecutor now runs as part
of the NM. Let me also spin up an NM-only node and make sure all the
dependencies for the MyriadExecutor are satisfied (I tested on a node
that had both the RM and NM).

Regards
Swapnil

On Fri, Jul 17, 2015 at 5:27 AM, Darin Johnson <[email protected]> wrote:

> Awesome, took a quick look; will test and go over the code soon. Any
> dependency issues between 2.5, 2.6 and 2.7 to be aware of?
>
> Hi All,
>
> Currently with Fine Grained Scheduling (FGS), the workflow for
> reporting the status of, and relinquishing the resources used by, a
> YARN container is as follows:
>
> 1. The NodeManager reports the status/completion of the container to
> the ResourceManager as part of the container statuses included in the
> NM-to-RM heartbeat.
>
> 2. This container status is intercepted by the Myriad Scheduler. The
> Scheduler sends a frameworkMessage to the MyriadExecutor running on
> the NodeManager's node. See NMHeartBeatHandler.handleStatusUpdate here:
>
> https://github.com/mesos/myriad/blob/issue_14/myriad-scheduler/src/main/java/com/ebay/myriad/scheduler/NMHeartBeatHandler.java#L112
>
> 3. This frameworkMessage instructs the MyriadExecutor to report the
> task state corresponding to the YARN container status back to Mesos.
> See MyriadExecutor.frameworkMessage here:
>
> https://github.com/mesos/myriad/blob/issue_14/myriad-executor/src/main/java/com/ebay/myriad/executor/MyriadExecutor.java#L252
>
> There are some disadvantages to this approach:
>
> 1. In step 2 we use the SchedulerDriver.sendFrameworkMessage() API.
> According to the API documentation, this message is best effort:
>
> /**
>  * Sends a message from the framework to one of its executors. These
>  * messages are best effort; do not expect a framework message to be
>  * retransmitted in any reliable fashion.
>
> 2. This requires the Scheduler/RM to be up for YARN containers/Mesos
> tasks to be able to report statuses to the Mesos master. If the
> Scheduler/RM goes down, we will not be able to send task statuses to
> Mesos until the Scheduler/RM is back up. This can lead to resource
> leaks.
>
> 3. There is the additional overhead of sending messages from the
> Scheduler/RM back to the executors for each container on each
> heartbeat: (number of YARN containers per node * number of nodes)
> additional messages.
>
> In order to avoid the above-mentioned issues, we are proposing merging
> the MyriadExecutor and the NodeManager. The MyriadExecutor will run as
> an NM auxiliary service (in the same process as the NM). It will be
> able to intercept YARN container completion locally and inform the
> Mesos master irrespective of whether the scheduler is running. We will
> no longer have to use the sendFrameworkMessage method, and there will
> be less message traffic from the scheduler to the executors.
>
> I have posted my proposed changes as part of the pull request here:
> https://github.com/mesos/myriad/pull/118
>
> Please take a look and let me know your feedback.
>
> Regards
> Swapnil
>
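
For reference, the relay described in steps 2 and 3 of the quoted
proposal boils down to a single call on the best-effort channel. A
minimal sketch, assuming a hypothetical ContainerStatusRelay wrapper;
only SchedulerDriver.sendFrameworkMessage() is the real Mesos API here,
everything else is illustrative:

import org.apache.mesos.Protos.ExecutorID;
import org.apache.mesos.Protos.SlaveID;
import org.apache.mesos.SchedulerDriver;

/** Hypothetical helper showing the scheduler-side relay (step 2). */
public class ContainerStatusRelay {
    private final SchedulerDriver driver;

    public ContainerStatusRelay(SchedulerDriver driver) {
        this.driver = driver;
    }

    /** Forward a completed container's id to the executor on its node. */
    public void relayCompletedContainer(ExecutorID executorId,
                                        SlaveID slaveId,
                                        String containerId) {
        // Best effort: Mesos may silently drop this message. A lost
        // message means the executor never reports the task's terminal
        // state to the master, which is the resource-leak window noted
        // in disadvantages 1 and 2 above.
        driver.sendFrameworkMessage(executorId, slaveId,
                containerId.getBytes());
    }
}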

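And a minimal sketch of the proposed direction, with the MyriadExecutor
hosted inside the NodeManager as a YARN auxiliary service. AuxiliaryService
and ExecutorDriver are the real YARN/Mesos APIs; the class name, service
name, task-id prefix and driver wiring below are illustrative assumptions,
not the exact code in the pull request:

import java.nio.ByteBuffer;

import org.apache.hadoop.yarn.server.api.ApplicationInitializationContext;
import org.apache.hadoop.yarn.server.api.ApplicationTerminationContext;
import org.apache.hadoop.yarn.server.api.AuxiliaryService;
import org.apache.hadoop.yarn.server.api.ContainerTerminationContext;
import org.apache.mesos.ExecutorDriver;
import org.apache.mesos.Protos.TaskID;
import org.apache.mesos.Protos.TaskState;
import org.apache.mesos.Protos.TaskStatus;

public class MyriadExecutorAuxService extends AuxiliaryService {
    // Assumed to be set once the embedded Mesos executor registers with
    // the local agent (e.g. via a MesosExecutorDriver started by this
    // service).
    private volatile ExecutorDriver driver;

    public MyriadExecutorAuxService() {
        super("myriad_executor"); // illustrative service name
    }

    public void setDriver(ExecutorDriver driver) {
        this.driver = driver;
    }

    @Override
    public void initializeApplication(ApplicationInitializationContext ctx) {
    }

    @Override
    public void stopApplication(ApplicationTerminationContext ctx) {
    }

    @Override
    public ByteBuffer getMetaData() {
        return ByteBuffer.allocate(0);
    }

    // The NM invokes this locally when a container finishes, so the task
    // state reaches the Mesos master without any RM/scheduler round trip
    // and without sendFrameworkMessage().
    @Override
    public void stopContainer(ContainerTerminationContext ctx) {
        if (driver == null) {
            return; // executor has not registered yet
        }
        TaskStatus status = TaskStatus.newBuilder()
            .setTaskId(TaskID.newBuilder()
                .setValue("yarn_" + ctx.getContainerId()) // assumed mapping
                .build())
            .setState(TaskState.TASK_FINISHED)
            .build();
        driver.sendStatusUpdate(status);
    }
}

Such a service is registered with the NM through the
yarn.nodemanager.aux-services and yarn.nodemanager.aux-services.<service>.class
properties in yarn-site.xml (the same mechanism the MapReduce shuffle
handler uses), which is also why the executor jars now need to be on the
NM classpath under $YARN_HOME/share/hadoop/yarn/lib, as noted at the top
of this message.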