[ 
https://issues.apache.org/jira/browse/MAPREDUCE-967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786972#action_12786972
 ] 

Todd Lipcon commented on MAPREDUCE-967:
---------------------------------------

I think the old code added the whole jobCacheDir to the classpath mostly out of 
sloppiness. The options for getting classes onto the task classpath are:
# Put it in the job.jar
# Put it in a 'classes/' directory inside the job jar
# Put it in its own jar, which is inside a lib/ directory in the job jar

I can't think of any reason why you'd need more options than these; the 
first and second options are already redundant. If someone has a really bizarre 
use case, they do have access to the job cache dir path through the job 
configuration, so yes, I'd recommend a custom classloader to them. Unless we 
know of an existing app that can't be satisfied by one of the above, I think 
it's better to clean up the classpath rather than continue to put the job cache 
dir on it when it seems unnecessary (at least for every job I've ever seen).
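
For the unusual case mentioned above, a custom classloader over the job cache 
dir could look roughly like the sketch below. It is only an illustration: the 
class and method names are hypothetical, and how the job cache dir path is read 
from the job configuration is left to the caller, since the exact configuration 
key is not stated here.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical helper, not part of Hadoop: exposes a job cache dir
// through a classloader when it is no longer on the task classpath.
public class JobCacheDirClassLoaderSketch {

    // jobCacheDir is assumed to come from the job configuration;
    // looking it up is deliberately left to the caller.
    public static URLClassLoader forJobCacheDir(File jobCacheDir, ClassLoader parent) {
        if (!jobCacheDir.isDirectory()) {
            throw new IllegalArgumentException("Not a directory: " + jobCacheDir);
        }
        try {
            // For an existing directory, toURI() yields a URL ending in "/",
            // which URLClassLoader treats as a directory of classes and
            // resources rather than as a jar file.
            URL dirUrl = jobCacheDir.toURI().toURL();
            return new URLClassLoader(new URL[] { dirUrl }, parent);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Cannot build URL for " + jobCacheDir, e);
        }
    }
}
```

A task that needs this would build the loader once, for example with 
Thread.currentThread().getContextClassLoader() as the parent, and load its 
extra classes or resources through it instead of relying on the default 
classpath.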

> TaskTracker does not need to fully unjar job jars
> -------------------------------------------------
>
>                 Key: MAPREDUCE-967
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-967
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tasktracker
>    Affects Versions: 0.21.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: mapreduce-967-branch-0.20.txt, mapreduce-967.txt, 
> mapreduce-967.txt, mapreduce-967.txt
>
>
> In practice we have seen some users submitting job jars that consist of 
> 10,000+ classes. Unpacking these jars into mapred.local.dir and then cleaning 
> up after them has a significant cost (both in wall-clock time and in unnecessary 
> heavy disk utilization). This cost can be easily avoided.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
