[ https://issues.apache.org/jira/browse/MAPREDUCE-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234229#comment-13234229 ]
Nikhil S. Ketkar commented on MAPREDUCE-3315: --------------------------------------------- I have started some work on this issue and I wanted to describe the overall approach I am taking and get some feedback. Let me start by describing how an application will be written using Master-Worker. The pseudocode below illustrates the usage. {code} class WordCount { // Word count implementation using Master-Worker paradigm. // Master sends portions of the input file to workers // Worker builds a word count hash map (word -> count) and sends it as a result unit // Master uses the results to update master dictionary class WorkUnit { // A list of words } class ResultUnit { // Map of word -> count } class Master extends MasterRunner<WorkUnit, ResultUnit> { public manageWorkers() { // Spawn a bunch of workers using spawnNewWorker(); // For each spawned worker spawnNewWorker() will return a WorkerReference, keep it around for bookkeeping // Read a file and create WorkUnits with a fixed number of lines to each workers // Assign WorkUnits to Workers using assignWork(), use the previously obtained WorkerReference // Wait for ResultUnits using waitForResult(); // Update master dictionary based on a received result unit // After work is done, kill workers using terminateWorker() } } class Worker extends WorkerRunner<WorkUnit, ResultUnit> { public ResultUnit doWork(WorkUnit wu) { // Build a word count hash map (word -> count) and sends it as a result unit } } } // User writes code to setup and launch job public static void main(String[] args) throws Exception { MWJobConf conf = new MWJobConf(WordCount.class); conf.setJobName("WordCount"); conf.setMasterClass(Master.class); conf.setWorkerClass(Worker.class); conf.setWorkUnitClass(WorkUnit.class); conf.setResultUnitClass(ResultUnit.class); MWJobClient.runJob(conf); } } {code} Key functionality for the Master Worker will be implemented in the following classes. {code} class MasterRunner<W, R> { // Spawn new Worker protected WorkerReference spawnNewWorker(); // Spawn new Workers protected ArrayList<WorkerReference> spawnNewWorkers(); // Assign work to any Worker, returns WorkerReference to whom it was assigned protected WorkerReference assignWork(WorkUnit wu); // Assign work to a specific Worker protected void assignWork(WorkerReference wf, WorkUnit wu); // Wait for result from any Worker. Blocking Call protected ResultUnit waitForResult(); // Wait for result from a specific worker. Blocking Call protected ResultUnit waitForResult(WorkerReference wf); // Is this specific worker alive? Blocking Call protected boolean isWorkerAlive(WorkerReference wf); // Get alive workers? Blocking Call protected ArrayList<WorkerReference> isWorkerAlive(WorkerReference wf); // Terminate a specific Worker protected void terminateWorker(WorkerReference wf); // To be implemented by user public manageWorkers() = 0; } class WorkerRunner<W, R> { // To be implemented by user public ResultUnit doWork(WorkUnit wu) = 0; } class WorkUnitContainer<W> { // The framework passes around WorkUnitContainers which contain the user defined // WorkUnit and some additional bookkeeping information } class ResultUnitContainer<R> { // The framework passes around ResultUnitContainers which contain the user defined // ResultUnit and some additional bookkeeping information } class WorkerReference { // Uniquely identifies the Workers } {code} A few questions I have been thinking about are: # Should I use Hadoop IPC or RMI or something else? # Should the Master be "in" the ApplicationManager or be run as a Container? > Master-Worker Application on YARN > --------------------------------- > > Key: MAPREDUCE-3315 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-3315 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Reporter: Sharad Agarwal > Assignee: Sharad Agarwal > Fix For: 0.24.0 > > > Currently master worker scenarios are forced fit into Map-Reduce. Now with > YARN, these can be first class and would benefit real/near realtime workloads > and be more effective in using the cluster resources. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira