[ https://issues.apache.org/jira/browse/HADOOP-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12642062#action_12642062 ]
Amar Kamat commented on HADOOP-4472:
------------------------------------

+1

> Should we move out the creation of setup/cleanup tasks from
> JobInProgress.initTasks()?
> ---------------------------------------------------------------------------------------
>
> Key: HADOOP-4472
> URL: https://issues.apache.org/jira/browse/HADOOP-4472
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Vivek Ratan
>
> JobInProgress.initTasks() creates TIPs for map and reduce tasks, and also the
> newly-introduced setup and cleanup tasks. initTasks() is called by the
> schedulers, as for reasons of memory optimizations, schedulers may choose to
> initialize M/R tasks at various moments (the Capacity Scheduler, for example,
> calls initTasks() just when it considers a job for running). One can say that
> Schedulers 'own' the initialization of M/R tasks in a job. Furthermore the JT
> 'owns' the setup and cleanup tasks (it schedules them, and Schedulers are
> unaware of these tasks). This causes a problematic dependency between the JT
> and a Scheduler. For example, the Capacity Scheduler calls initTasks() and
> immediately calls JobInProgress.obtainNewMapTask for a map task. This is a
> problem today, because we cannot run any map or reduce tasks before the setup
> task is run, which the Capacity Scheduler is not aware of.
> Either all Schedulers are explicitly aware of setup/cleanup tasks and their
> dependencies with M/R tasks (in which case, Schedulers 'own' the creation and
> scheduling of all these tasks correctly), or the JT 'owns' the setup/cleanup
> tasks and Schedulers are completely unaware of them (in which case, the
> creation of setup/cleanup tasks must be moved out of initTasks into a
> separate method which is called by the JT).
> I think the latter is the right way to go (unless we implement HADOOP-4421,
> in which case the former option may be viable as well).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
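
For readers following the thread, the second option in the quoted description (the JT owns setup/cleanup and schedulers only initialize M/R tasks) could look roughly like the sketch below. This is an illustration only and not the Hadoop code: `SetupCleanupSketch`, `Job`, `initSetupCleanupTasks()`, `obtainSetupTask()` and `setupCompleted()` are hypothetical names, TIPs are reduced to plain strings, reduce tasks are omitted, and the guard inside `obtainNewMapTask()` is an assumption about how the "no M/R task before setup" rule might be enforced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are the real Hadoop classes.
class SetupCleanupSketch {

    /** Simplified stand-in for JobInProgress. */
    static class Job {
        private final int numMaps;
        private final List<String> mapTips = new ArrayList<>();
        private String setupTip;    // created by the JT, never by a scheduler
        private String cleanupTip;  // likewise owned by the JT
        private boolean setupDone;

        Job(int numMaps) {
            this.numMaps = numMaps;
        }

        /** Called by a scheduler: creates only the map TIPs. */
        void initTasks() {
            for (int i = 0; i < numMaps; i++) {
                mapTips.add("map-" + i);
            }
        }

        /** Called by the JT only: creates the setup and cleanup TIPs. */
        void initSetupCleanupTasks() {
            setupTip = "setup";
            cleanupTip = "cleanup";
        }

        /** JT side: returns the setup task while it still has to run. */
        String obtainSetupTask() {
            return (setupTip != null && !setupDone) ? setupTip : null;
        }

        /** JT side: records that the setup task has finished. */
        void setupCompleted() {
            setupDone = true;
        }

        /** Scheduler side: hands out no map task until setup has finished (assumed guard). */
        String obtainNewMapTask() {
            if (!setupDone || mapTips.isEmpty()) {
                return null;
            }
            return mapTips.remove(0);
        }
    }

    public static void main(String[] args) {
        Job job = new Job(2);
        job.initSetupCleanupTasks();                // JT
        job.initTasks();                            // scheduler
        System.out.println(job.obtainNewMapTask()); // setup has not run yet
        job.obtainSetupTask();
        job.setupCompleted();
        System.out.println(job.obtainNewMapTask()); // first map task
    }
}
```

With this split, a scheduler that calls `initTasks()` and then immediately asks for a map task simply gets nothing back until the JT has run the setup task, so the scheduler never needs to know that setup/cleanup tasks exist.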