[
https://issues.apache.org/jira/browse/MAPREDUCE-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016902#comment-13016902
]
Hudson commented on MAPREDUCE-1783:
-----------------------------------
Integrated in Hadoop-Mapreduce-trunk #643 (See
[https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk/643/])
> Task Initialization should be delayed till when a job can be run
> ----------------------------------------------------------------
>
> Key: MAPREDUCE-1783
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1783
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: contrib/fair-share
> Affects Versions: 0.20.1
> Reporter: Ramkumar Vadali
> Assignee: Ramkumar Vadali
> Fix For: 0.22.0, 0.23.0
>
> Attachments: 0001-Pool-aware-job-initialization.patch,
> 0001-Pool-aware-job-initialization.patch.1, MAPREDUCE-1783.patch,
> submit-mapreduce-1783.patch
>
>
> The FairScheduler task scheduler uses PoolManager to impose limits on the
> number of jobs that can be running at a given time. However, jobs that are
> submitted are initialized immediately by EagerTaskInitializationListener by
> calling JobInProgress.initTasks. This causes the job split file to be read
> into memory. The split information is not needed until the number of running
> jobs is less than the maximum specified. If the amount of split information
> is large, this leads to unnecessary memory pressure on the Job Tracker.
> To ease memory pressure, FairScheduler can use another implementation of
> JobInProgressListener that is aware of PoolManager limits and can delay task
> initialization until the number of running jobs is below the maximum.
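
For illustration only, the delayed-initialization idea described above might be sketched as follows. This is a simplified, self-contained model and not the attached patch: DelayedInitSketch, Job, PoolAwareInitializer, and the "etl" pool name are hypothetical stand-ins, and the per-pool limit map is an assumption standing in for whatever PoolManager actually exposes. Only the overall behavior (hold a submitted job uninitialized until its pool has a free running slot) comes from the description.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model. None of these types are the real Hadoop classes.
class DelayedInitSketch {

    /** Stand-in for a submitted job; initTasks() models the expensive split read. */
    static final class Job {
        final String id;
        final String pool;
        boolean initialized = false;

        Job(String id, String pool) {
            this.id = id;
            this.pool = pool;
        }

        void initTasks() {
            initialized = true; // the real call would load split information here
        }
    }

    /** Listener-like object that initializes a job only while its pool is under its limit. */
    static final class PoolAwareInitializer {
        private final Map<String, Integer> maxRunning;                    // pool -> assumed limit
        private final Map<String, Integer> running = new HashMap<>();     // pool -> initialized jobs
        private final Map<String, Deque<Job>> waiting = new HashMap<>();  // pool -> FIFO backlog

        PoolAwareInitializer(Map<String, Integer> maxRunning) {
            this.maxRunning = new HashMap<>(maxRunning);
        }

        /** Called on submission: queue the job, then initialize whatever fits. */
        void jobAdded(Job job) {
            waiting.computeIfAbsent(job.pool, p -> new ArrayDeque<>()).addLast(job);
            initRunnable(job.pool);
        }

        /** Called when a job finishes or is removed: free its slot, if it held one. */
        void jobCompleted(Job job) {
            if (!job.initialized) {
                // Never initialized, so it only needs to leave the backlog.
                Deque<Job> queue = waiting.get(job.pool);
                if (queue != null) {
                    queue.remove(job);
                }
                return;
            }
            running.merge(job.pool, -1, Integer::sum);
            initRunnable(job.pool);
        }

        /** Initialize queued jobs in FIFO order while the pool is below its limit. */
        private void initRunnable(String pool) {
            int limit = maxRunning.getOrDefault(pool, Integer.MAX_VALUE);
            Deque<Job> queue = waiting.get(pool);
            while (queue != null && !queue.isEmpty()
                    && running.getOrDefault(pool, 0) < limit) {
                Job next = queue.pollFirst();
                next.initTasks();
                running.merge(pool, 1, Integer::sum);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> limits = new HashMap<>();
        limits.put("etl", 1); // at most one running job in the hypothetical "etl" pool
        PoolAwareInitializer listener = new PoolAwareInitializer(limits);

        Job first = new Job("job_1", "etl");
        Job second = new Job("job_2", "etl");
        listener.jobAdded(first);
        listener.jobAdded(second);
        System.out.println(first.initialized + " " + second.initialized);

        listener.jobCompleted(first);
        System.out.println(second.initialized);
    }
}
```

In this sketch the second job's split data is never touched while the first job occupies the pool's only slot, which is the memory saving the description is after. A real listener would also need to handle job priority changes and concurrent access, both of which are omitted here.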
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira