[ 
https://issues.apache.org/jira/browse/HADOOP-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12625505#action_12625505
 ] 

dhruba borthakur commented on HADOOP-4018:
------------------------------------------

Hi devaraj, thanks for the info. The problem we saw was with 0.17.1. The 
number of completed jobs in memory has been reduced from the default of 100 to 
20. When the problem occurred there were about 400 total jobs in the JT (running 
+ completed + failed). I do not know how many jobs were run by the JobTracker 
since it was last started. This particular job had 60,000 mappers. About half 
of these tasks had finished before the problem became acute.

I have the GC log enabled via -verbose:gc -XX:+PrintGCTimeStamps 
-XX:+PrintGCDetails -Xloggc:/var/hadoop/logs/jobtracker1.gc.log

This log (as well as jconsole) showed that the JT was busy running full GCs.

I agree that moving to later releases will help mitigate this problem to a 
certain extent, but as a system administrator, I would like to set an upper 
limit on the number of tasks for a single job. This is not a cure-all but could 
serve as a guardrail to prevent non-conforming jobs from running. This should 
be complementary to all the JIRAs you mentioned, shouldn't it? Do you see a 
downside to this approach?
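
A minimal sketch of what such a per-job task limit might look like, 
assuming the check runs when a job is submitted and before any per-task 
state is allocated. The class name, the configuration key and the use of 
java.util.Properties in place of the Hadoop configuration are all 
hypothetical and for illustration only; this is not the actual JobTracker 
code.

```java
import java.util.Properties;

/** Hypothetical admission check; not the JobTracker's actual code. */
class TaskLimitCheck {
    // Hypothetical configuration key, chosen for illustration only.
    static final String MAX_TASKS_KEY = "example.jobtracker.max.tasks.per.job";
    // A negative limit means "no limit", so existing setups are unaffected.
    static final int UNLIMITED = -1;

    /** Reads the limit from the configuration, defaulting to unlimited. */
    static int maxTasksPerJob(Properties conf) {
        return Integer.parseInt(
            conf.getProperty(MAX_TASKS_KEY, String.valueOf(UNLIMITED)));
    }

    /** Returns true if a job with the given task counts may be accepted. */
    static boolean isAllowed(Properties conf, int numMaps, int numReduces) {
        int limit = maxTasksPerJob(conf);
        long total = (long) numMaps + numReduces; // long avoids int overflow
        return limit < 0 || total <= limit;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(MAX_TASKS_KEY, "50000");
        // A job the size of the one described above would be rejected.
        System.out.println(isAllowed(conf, 60000, 0));
        // A smaller job would still be accepted.
        System.out.println(isAllowed(conf, 1000, 10));
    }
}
```

Defaulting to "no limit" keeps the behaviour unchanged unless an 
administrator explicitly sets a value.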




> limit memory usage in jobtracker
> --------------------------------
>
>                 Key: HADOOP-4018
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4018
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>
> We have seen instances when a user submitted a job with many thousands of 
> mappers. The JobTracker was running with a 3GB heap, but it was still not 
> enough to prevent memory thrashing from garbage collection; effectively the 
> JobTracker was not able to serve jobs and had to be restarted.
> One simple proposal would be to limit the maximum number of tasks per job. 
> This can be a configurable parameter. Are there other things that eat huge 
> gobs of memory in the JobTracker?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
