[ 
https://issues.apache.org/jira/browse/MAPREDUCE-841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740858#action_12740858
 ] 

Hong Tang commented on MAPREDUCE-841:
-------------------------------------

For JobConf, it is somewhat hard to determine the subset of properties used 
by JobTracker. I scanned through JobTracker.java, and here is the list so far:
- "hadoop.job.ugi": user/group info.
- "job.end.retry.attempts" / "job.end.retry.interval": for job end notification
- "mapred.job.name": job name
- "hadoop.job.history.user.location" / "mapred.output.dir": for job history log 
file location.
- "fs.default.name" / "fs.*.impl" / "fs.automatic.close": file system related 
stuff, also for placing the job history log to the right place as specified by 
user.
- "user.name": user name
- various memory related knobs.
- "mapred.map.tasks" / "mapred.reduce.tasks": user desired # of map/reduce tasks

As we can see, (1) the list of properties needed by the JT is short, so it'd be 
better not to load the complete JobConf object for each job; and (2) this is a 
fairly diverse list of properties, so keeping such a list in sync with the 
JobTracker code is a hard problem.
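
To make point (1) concrete, here is a rough sketch of what copying only the 
listed properties could look like. It is not taken from JobTracker.java: the 
class name JobConfSubset and the method extract are hypothetical, a plain Map 
stands in for JobConf, and "fs.*.impl" is read as a single-segment wildcard, 
which is an assumption. The memory related knobs are left out because the list 
above does not name them.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: keeps only the properties
// named in the list above and drops everything else.
final class JobConfSubset {
    // Exact property names copied from the list in this comment.
    static final List<String> EXACT_KEYS = List.of(
        "hadoop.job.ugi",
        "job.end.retry.attempts", "job.end.retry.interval",
        "mapred.job.name",
        "hadoop.job.history.user.location", "mapred.output.dir",
        "fs.default.name", "fs.automatic.close",
        "user.name",
        "mapred.map.tasks", "mapred.reduce.tasks");

    // Assumption: "fs.*.impl" matches exactly one name segment, e.g. fs.hdfs.impl.
    static final Pattern FS_IMPL = Pattern.compile("fs\\.[^.]+\\.impl");

    // Returns a new map holding only the entries whose key is in the list.
    // A plain Map stands in for the full JobConf here.
    static Map<String, String> extract(Map<String, String> fullConf) {
        Map<String, String> subset = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : fullConf.entrySet()) {
            String key = e.getKey();
            if (EXACT_KEYS.contains(key) || FS_IMPL.matcher(key).matches()) {
                subset.put(key, e.getValue());
            }
        }
        return subset;
    }
}
```

The hard-coded key list in this sketch is exactly the maintenance burden 
described in point (2): any new property read by JobTracker would have to be 
added here as well.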


> Protect Job Tracker against memory exhaustion due to very large InputSplit or 
> JobConf objects
> ---------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-841
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-841
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker
>    Affects Versions: 0.20.1
>            Reporter: Hong Tang
>             Fix For: 0.21.0
>
>
> JobTracker only needs to examine a subset of the information contained in 
> InputSplit or JobConf objects. But currently JobTracker loads the complete 
> user-defined InputSplit and JobConf objects into memory. This design leaves 
> JobTracker susceptible to memory exhaustion, particularly when bugs in user 
> code result in very large input splits or job conf objects (e.g. PIG-901).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
