[jira] [Commented] (MAPREDUCE-3315) Master-Worker Application on YARN

Sharad Agarwal (JIRA) Mon, 21 May 2012 06:46:47 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280147#comment-13280147
 ]


Sharad Agarwal commented on MAPREDUCE-3315:
-------------------------------------------

Thanks Nikhil. Overall, a good start. It needs some changes, primarily to make 
the framework scalable and fault tolerant. 

Specific comments as follows:
- MWProtocol: It would be better to have a heartbeat-style interface between 
worker and master: the worker sends status reports and the results of 
completed workunits in a HeartbeatRequest, and receives more WorkUnits and 
instructions in the HeartbeatResponse. Using MWMessage to pass both workunits 
and results is confusing; workunits and results could simply be Writable 
objects. Special instructions such as kill should be given via separate Action 
commands in the HeartbeatResponse.
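
A minimal sketch of what such a heartbeat exchange might look like. All type 
names here (HeartbeatSketch, MasterProtocol, InMemoryMaster) are hypothetical 
and not taken from the patch; payloads are plain Strings so the example runs 
without Hadoop, whereas the real ones would be Writable. Sending KILL once no 
work is left is only an illustrative policy.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of a heartbeat-style master/worker protocol.
final class HeartbeatSketch {
    // Instructions travel as explicit actions, separate from the work payload.
    enum Action { NONE, KILL }

    // Worker -> master: identity plus results of completed workunits.
    static final class HeartbeatRequest {
        final String workerId;
        final List<String> completedResults;

        HeartbeatRequest(String workerId, List<String> completedResults) {
            this.workerId = workerId;
            this.completedResults = completedResults;
        }
    }

    // Master -> worker: an action plus any newly assigned workunits.
    static final class HeartbeatResponse {
        final Action action;
        final List<String> newWorkUnits;

        HeartbeatResponse(Action action, List<String> newWorkUnits) {
            this.action = action;
            this.newWorkUnits = newWorkUnits;
        }
    }

    interface MasterProtocol {
        HeartbeatResponse heartbeat(HeartbeatRequest request);
    }

    // In-memory master used only to exercise the protocol shape.
    static final class InMemoryMaster implements MasterProtocol {
        private final Deque<String> pending = new ArrayDeque<>();
        private final List<String> results = new ArrayList<>();
        private final int batchSize;

        InMemoryMaster(List<String> workUnits, int batchSize) {
            this.pending.addAll(workUnits);
            this.batchSize = batchSize;
        }

        @Override
        public synchronized HeartbeatResponse heartbeat(HeartbeatRequest request) {
            // Record whatever the worker finished since its last heartbeat.
            results.addAll(request.completedResults);
            if (pending.isEmpty()) {
                // Assumed policy: nothing left to hand out, so ask the worker to exit.
                return new HeartbeatResponse(Action.KILL, Collections.<String>emptyList());
            }
            List<String> batch = new ArrayList<>();
            while (batch.size() < batchSize && !pending.isEmpty()) {
                batch.add(pending.poll());
            }
            return new HeartbeatResponse(Action.NONE, batch);
        }

        synchronized List<String> results() {
            return new ArrayList<>(results);
        }
    }
}
```

With this shape, a worker loop is just: send results, read the action, process 
the returned workunits, repeat.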

- MWWorkerRunner:
If the worker is unable to contact the master for some time, it should 
terminate itself.
- MWMasterRunner:
If the master doesn't receive a heartbeat from a worker for a certain time 
period, it should mark the worker as killed and launch a new worker. (The 
assumption is that doWork is idempotent.)
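
A rough sketch of the timeout bookkeeping this implies, assuming a hypothetical 
LivenessMonitor class (not part of the patch). The current time is passed in 
explicitly so the logic can be tested without sleeping; a real implementation 
would call expire() periodically from a monitor thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical helper: tracks the last heartbeat time per peer and reports
// peers that have been silent for longer than the timeout.
final class LivenessMonitor {
    private final long timeoutMillis;
    private final Map<String, Long> lastSeen = new HashMap<>();

    LivenessMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Called whenever a heartbeat from (or a successful contact with) a peer happens.
    synchronized void recordHeartbeat(String peerId, long nowMillis) {
        lastSeen.put(peerId, nowMillis);
    }

    // Returns the peers whose last heartbeat is older than the timeout and
    // stops tracking them, so each lost peer is reported exactly once.
    synchronized List<String> expire(long nowMillis) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }
}
```

On the master, each expired id would be marked as killed and a replacement 
worker requested. The same bookkeeping could serve the worker side: record each 
successful contact with the master, and exit if the master shows up as expired.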

- AMRMProtocolWraper:
requestContainer() is currently blocking. This will have a high startup cost if 
the number of workers is high.

- MWApplicationMaster: 
addWorker is blocking: the container request and the container launch happen 
sequentially, so worker startup cost will be high. Launches should be done in 
parallel via a thread pool; see ContainerLauncher in the MR application master.
Also, a new ContainerManagerWrapper thread is created for each worker launch. 
This is not scalable.
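
A minimal sketch of launching workers through a bounded thread pool instead of 
one thread per launch. ParallelLauncher and WorkerLauncher are hypothetical 
names used for illustration; launch() stands in for whatever blocking 
request-and-start call the application master makes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: run blocking worker launches in parallel on a fixed pool.
final class ParallelLauncher {
    // Stand-in for the blocking "request container and start worker" step.
    interface WorkerLauncher {
        String launch(int workerIndex) throws Exception;
    }

    // Launches workerCount workers using at most poolSize threads and returns
    // the launched ids in submission order.
    static List<String> launchAll(int workerCount, int poolSize, final WorkerLauncher launcher)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < workerCount; i++) {
                final int index = i;
                Callable<String> task = () -> launcher.launch(index);
                futures.add(pool.submit(task));
            }
            List<String> ids = new ArrayList<>();
            for (Future<String> future : futures) {
                // get() rethrows a launch failure wrapped in ExecutionException.
                ids.add(future.get());
            }
            return ids;
        } finally {
            pool.shutdown();
        }
    }
}
```

The pool size bounds the number of threads regardless of how many workers are 
requested, which addresses the thread-per-launch concern as well.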

- Extend org.apache.hadoop.yarn.service.AbstractService or CompositeService for 
all moving components

- Add code comments, and Javadocs for public and protected methods.

- Add unit tests.

- Dynamic worker pool: minWorker/maxWorkers. This needs a client protocol, say 
MWClientProtocol, to see the status of the overall application and potentially 
to submit new workunits, kill workers, add workers, etc.



                
> Master-Worker Application on YARN
> ---------------------------------
>
>                 Key: MAPREDUCE-3315
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3315
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Sharad Agarwal
>            Assignee: Sharad Agarwal
>             Fix For: 0.24.0
>
>         Attachments: MAPREDUCE-3315-1.patch, MAPREDUCE-3315-2.patch, 
> MAPREDUCE-3315-3.patch, MAPREDUCE-3315.patch
>
>
> Currently, master-worker scenarios are force-fit into Map-Reduce. Now, with 
> YARN, these can be first class, which would benefit real-time and near-real-time 
> workloads and use the cluster resources more effectively.

