[jira] Commented: (MAPREDUCE-967) TaskTracker does not need to fully unjar job jars

Vinod K V (JIRA) Mon, 07 Dec 2009 21:58:43 -0800

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787332#action_12787332
 ]


Vinod K V commented on MAPREDUCE-967:
-------------------------------------


Agreed. I too was unsure how flexible we should be. So far, AFAIK, this whole 
business of unjarring job.jar and putting it on the classpath is not clearly 
documented anywhere. The only documentation I could find is the following 
(from mapred_tutorial), which says nothing definite about the above:
{code}
${mapred.local.dir}/taskTracker/jobcache/$jobid/jars/ : The jars directory,
which has the job jar file and expanded jar. The job.jar is the application's
jar file that is automatically distributed to each machine. It is expanded in
jars directory before the tasks for the job start. The job.jar location is
accessible to the application through the api JobConf.getJar(). To access the
unjarred directory, JobConf.getJar().getParent() can be called.
{code}
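As far as I know, JobConf.getJar() returns a String, so the last sentence of that excerpt cannot be taken literally: String has no getParent(). A minimal sketch of what the tutorial presumably means, assuming the returned value is a local path to job.jar. The class name, the helper name, and the example path are hypothetical, and a plain String stands in for the JobConf call so the snippet runs without Hadoop on the classpath:

```java
import java.io.File;

class UnjarredDirSketch {
    // Hypothetical helper: returns the directory that holds job.jar which,
    // per the tutorial text, is also where the jar is expanded.
    static String unjarredDir(String jobJarPath) {
        return new File(jobJarPath).getParent();
    }

    public static void main(String[] args) {
        // Stand-in for the value JobConf.getJar() would return on a TaskTracker.
        String jobJar = "/tmp/mapred/local/taskTracker/jobcache/job_0001/jars/job.jar";
        System.out.println(unjarredDir(jobJar));
    }
}
```

Any application that relies on this directory containing expanded classes would be affected if the jar is no longer fully unpacked.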
 - One takeaway is that, along with the other changes in this JIRA, we should 
definitely document the points you've mentioned above w.r.t. the classpath.
 - The second point is that changing the classpath so that it no longer 
includes jobCacheDir makes this JIRA issue an incompatible change. If we wish 
to maintain backward compatibility, I suggest we add both job.jar and 
jobCacheDir to the classpath. The reason I am nit-picking about this is that 
things like these have the potential to come back and catch us unawares in the 
future.
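
To make the backward-compatible suggestion concrete, here is a rough sketch of a classpath that carries both entries. This is not the actual TaskTracker/TaskRunner code; the class and method names are hypothetical, and it assumes job.jar sits directly inside jobCacheDir:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class JobClasspathSketch {
    // Hypothetical helper: builds the job-specific classpath entries.
    static List<String> buildClasspath(File jobCacheDir) {
        List<String> entries = new ArrayList<>();
        // Proposed behaviour: load classes straight from the job jar.
        entries.add(new File(jobCacheDir, "job.jar").getPath());
        // Kept for backward compatibility: anything that used to be found
        // through the job cache directory itself stays reachable.
        entries.add(jobCacheDir.getPath());
        return entries;
    }

    public static void main(String[] args) {
        // Example path only; the real location depends on mapred.local.dir.
        File jobCacheDir = new File("/tmp/mapred/local/taskTracker/jobcache/job_0001/jars");
        System.out.println(String.join(File.pathSeparator, buildClasspath(jobCacheDir)));
    }
}
```

Listing job.jar first means classes resolve from the jar when both entries could supply them, while the directory entry only matters for code that still depends on the old layout.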

> TaskTracker does not need to fully unjar job jars
> -------------------------------------------------
>
>                 Key: MAPREDUCE-967
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-967
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tasktracker
>    Affects Versions: 0.21.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: mapreduce-967-branch-0.20.txt, mapreduce-967.txt, 
> mapreduce-967.txt, mapreduce-967.txt
>
>
> In practice we have seen some users submitting job jars that consist of 
> 10,000+ classes. Unpacking these jars into mapred.local.dir and then cleaning 
> up after them has a significant cost (both in wall clock and in unnecessary 
> heavy disk utilization). This cost can be easily avoided

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
