[ https://issues.apache.org/jira/browse/HADOOP-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628574#action_12628574 ]
Amar Kamat commented on HADOOP-4018:
------------------------------------
A few comments:
1) {{JobInProgress.totalAllocatedTasks()}} should also consider
{{nonLocalMaps}} and {{nonLocalRunningMaps}}. Applications like random-writer
use these structures.
2) There is an extra '-' line in the {{JobTracker.java}} diff.
3) You might need to synchronize the {{totalAllocatedTasks()}} API in both places.
Consider a case where job1 is in the init stage while job2 is newly submitted.
Assume both cannot run in parallel on the JobTracker. Assume job1 has not yet
seen its splits and job2, which is still being constructed, checks
{{totalAllocatedTasks()}}. In that case job2 will pass the check and move on to
init. job1 will then create its cache and continue, while job2 will fail in
init. Also, there is no guarantee that inits are sequential; they might happen
in parallel, as it is up to the scheduler. So all the jobs that call totalTasks
might see a stale value and hence end up expanding themselves.
4) Please add a test case.
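
To illustrate comment 3, here is a minimal sketch of one way to close the
check-then-act window: the limit check and the reservation happen under a
single lock, so two jobs cannot both pass the check against the same stale
count. {{TaskBudget}}, {{tryReserve()}} and {{release()}} are hypothetical
names made up for this example; they are not part of Hadoop or of the attached
patches, and the real fix may look quite different.

```java
// Hypothetical sketch; TaskBudget and its methods are not part of Hadoop.
public class TaskBudget {
    private final long maxTasks;   // assumed limit on allocated tasks
    private long allocated = 0;    // tasks reserved so far, guarded by "this"

    public TaskBudget(long maxTasks) {
        this.maxTasks = maxTasks;
    }

    // Check the limit and record the reservation while holding one lock,
    // so no other job can observe the count between the two steps.
    public synchronized boolean tryReserve(long tasks) {
        if (tasks < 0 || allocated + tasks > maxTasks) {
            return false;
        }
        allocated += tasks;
        return true;
    }

    // Give tasks back, for example when a job finishes or fails in init.
    public synchronized void release(long tasks) {
        allocated = Math.max(0, allocated - tasks);
    }

    // Synchronized read, so callers never see a half-updated count.
    public synchronized long totalAllocatedTasks() {
        return allocated;
    }

    public static void main(String[] args) throws InterruptedException {
        TaskBudget budget = new TaskBudget(100);
        boolean[] admitted = new boolean[2];

        // Two jobs of 80 tasks each race for a budget of 100 tasks.
        Thread job1 = new Thread(() -> admitted[0] = budget.tryReserve(80));
        Thread job2 = new Thread(() -> admitted[1] = budget.tryReserve(80));
        job1.start();
        job2.start();
        job1.join();
        job2.join();

        // Exactly one job is admitted; which one depends on thread scheduling.
        System.out.println("job1=" + admitted[0] + " job2=" + admitted[1]
                + " allocated=" + budget.totalAllocatedTasks());
    }
}
```

With an unsynchronized check followed by a later update, both jobs in
{{main()}} could pass the check and the total could reach 160, which is the
situation described above. A test case for comment 4 could be built around the
same two-thread race.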
> limit memory usage in jobtracker
> --------------------------------
>
> Key: HADOOP-4018
> URL: https://issues.apache.org/jira/browse/HADOOP-4018
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: maxSplits.patch, maxSplits2.patch, maxSplits3.patch,
> maxSplits4.patch, maxSplits5.patch, maxSplits6.patch
>
>
> We have seen instances where a user submitted a job with many thousands of
> mappers. The JobTracker was running with a 3GB heap, but that was still not
> enough to prevent memory thrashing from garbage collection; effectively the
> JobTracker was not able to serve jobs and had to be restarted.
> One simple proposal would be to limit the maximum number of tasks per job.
> This can be a configurable parameter. Are there other things that eat huge
> globs of memory in the JobTracker?