[
https://issues.apache.org/jira/browse/MAPREDUCE-967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786972#action_12786972
]
Todd Lipcon commented on MAPREDUCE-967:
---------------------------------------
I think it used to add the whole jobCacheDir mostly out of sloppiness. The
options for getting classes are:
# Put it in the job.jar
# Put it in a 'classes/' directory inside the job jar
# Put it in its own jar, which is inside a lib/ directory in the job jar
I can't think of any reason why you'd need more options than these; the
first and second options are already redundant. If someone has a really bizarre
use case, they do have access to the job cache dir path through the job
configuration, so yes, I'd recommend a custom classloader to them. Unless we
know of an existing app that can't be satisfied by one of the above, I think
it's better to clean up the classpath rather than continue to put the job cache
dir on it when it seems unnecessary (at least for every job I've ever seen).
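
To make the two suggestions above concrete, here is a minimal sketch rather than the actual TaskTracker code. It assumes a hypothetical layout in which one directory holds job.jar plus the unpacked classes/ and lib/ directories. The class and method names (JobClasspathSketch, jobClasspath, customLoader) are made up for illustration, and the caller is assumed to have already read the job cache dir path from the job configuration, since the property name is not given here.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
final class JobClasspathSketch {

    // Collects only the three class locations listed above:
    // job.jar itself, classes/, and each jar under lib/.
    // Locations that do not exist are skipped.
    static List<URL> jobClasspath(Path jobDir) throws IOException {
        List<URL> urls = new ArrayList<>();

        Path jobJar = jobDir.resolve("job.jar");
        if (Files.isRegularFile(jobJar)) {
            urls.add(jobJar.toUri().toURL());
        }

        Path classes = jobDir.resolve("classes");
        if (Files.isDirectory(classes)) {
            // toUri() on an existing directory yields a trailing '/',
            // which URLClassLoader needs to treat the URL as a directory.
            urls.add(classes.toUri().toURL());
        }

        Path lib = jobDir.resolve("lib");
        if (Files.isDirectory(lib)) {
            try (DirectoryStream<Path> jars = Files.newDirectoryStream(lib, "*.jar")) {
                for (Path jar : jars) {
                    urls.add(jar.toUri().toURL());
                }
            }
        }
        return urls;
    }

    // For the unusual case: a job that really needs the whole job cache dir
    // can build its own loader over it. The directory must already exist,
    // otherwise the URL lacks the trailing '/' and is treated as a jar.
    static URLClassLoader customLoader(Path jobCacheDir, ClassLoader parent)
            throws MalformedURLException {
        return new URLClassLoader(new URL[] { jobCacheDir.toUri().toURL() }, parent);
    }
}
```

A task with such a requirement could call customLoader(dir, Thread.currentThread().getContextClassLoader()) and load its extra classes or resources through the returned loader, leaving the default classpath limited to the three standard locations.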
> TaskTracker does not need to fully unjar job jars
> -------------------------------------------------
>
> Key: MAPREDUCE-967
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-967
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tasktracker
> Affects Versions: 0.21.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: mapreduce-967-branch-0.20.txt, mapreduce-967.txt,
> mapreduce-967.txt, mapreduce-967.txt
>
>
> In practice we have seen some users submitting job jars that consist of
> 10,000+ classes. Unpacking these jars into mapred.local.dir and then cleaning
> up after them has a significant cost (both in wall clock and in unnecessary
> heavy disk utilization). This cost can be easily avoided