[
https://issues.apache.org/jira/browse/MAPREDUCE-967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787332#action_12787332
]
Vinod K V commented on MAPREDUCE-967:
-------------------------------------
Agreed. I too was at a loss as to how flexible we should be. So far, AFAIK,
this whole "job.jar's unjarring and putting in classpath" business is not
clearly documented anywhere. The only documentation I could find is the
following (from mapred_tutorial), which is itself unclear on this point:
{code}
${mapred.local.dir}/taskTracker/jobcache/$jobid/jars/ : The jars directory,
which has the job jar file
and expanded jar. The job.jar is the application's jar file that is
automatically distributed to each machine.
It is expanded in jars directory before the tasks for the job start. The
job.jar location is accessible to the
application through the api JobConf.getJar() . To access the unjarred
directory, JobConf.getJar().getParent()
can be called.
{code}
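To make the tutorial text concrete: JobConf.getJar() returns the jar path as a String, so the getParent() call it mentions has to go through something like java.io.File. Below is a minimal sketch of that lookup using only the JDK. The class name, method name, and example path are hypothetical, and the String argument stands in for whatever JobConf.getJar() returns.

```java
import java.io.File;

// Hypothetical helper, not part of Hadoop: illustrates how the "unjarred
// directory" described in the tutorial can be derived from the job jar path.
final class JobJarLocation {

    // jobJarPath stands in for the String returned by JobConf.getJar().
    // Returns the directory that contains the jar, or null if the path is
    // null or has no parent component.
    static String unjarredDir(String jobJarPath) {
        if (jobJarPath == null) {
            return null;
        }
        return new File(jobJarPath).getParent();
    }

    public static void main(String[] args) {
        // Example path only; the real location depends on mapred.local.dir.
        String jar = "local/taskTracker/jobcache/job_0001/jars/job.jar";
        System.out.println(unjarredDir(jar));
    }
}
```

Note that this only yields the directory holding job.jar; it says nothing about whether the jar has actually been expanded there, which is exactly the part the documentation leaves open.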
- One takeaway is that, along with the other changes in this JIRA, we should
definitely document clearly the points you've mentioned above w.r.t. the
classpath.
- The second point is that changing the classpath so that it no longer includes
jobCacheDir makes this JIRA issue an incompatible change. If we wish to
maintain backward compatibility, I suggest we add both job.jar and the
jobCacheDir to the classpath. The reason I am being nit-picky about this is
that things like these have the potential to come back and catch us unawares
in the future.
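A rough sketch of the backward-compatible option, to show what I mean by adding both entries. This is not the TaskTracker code: the class and method names are hypothetical, and it assumes jobCacheDir is simply the directory that holds job.jar. The ordering (jar first, directory second) is also an assumption that would need to be agreed on.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: builds a classpath that contains both the
// job jar itself and the directory it lives in (assumed to be jobCacheDir).
final class BackwardCompatibleClasspath {

    // Returns the two classpath entries: job.jar first, then its directory.
    static List<String> entries(File jobJar) {
        File absoluteJar = jobJar.getAbsoluteFile();
        List<String> classpath = new ArrayList<>();
        // New behaviour: classes are loaded straight from the jar.
        classpath.add(absoluteJar.getPath());
        // Old behaviour kept for compatibility: the containing directory
        // stays on the classpath as well.
        classpath.add(absoluteJar.getParent());
        return classpath;
    }

    // Joins the entries with the platform path separator (':' or ';').
    static String asClasspath(File jobJar) {
        return String.join(File.pathSeparator, entries(jobJar));
    }

    public static void main(String[] args) {
        // Example path only.
        File jar = new File("local/taskTracker/jobcache/job_0001/jars/job.jar");
        System.out.println(asClasspath(jar));
    }
}
```

With both entries present, jobs that relied on resources sitting in the directory would keep working, while classes inside job.jar no longer need to be unpacked to be found.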
> TaskTracker does not need to fully unjar job jars
> -------------------------------------------------
>
> Key: MAPREDUCE-967
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-967
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tasktracker
> Affects Versions: 0.21.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: mapreduce-967-branch-0.20.txt, mapreduce-967.txt,
> mapreduce-967.txt, mapreduce-967.txt
>
>
> In practice we have seen some users submitting job jars that consist of
> 10,000+ classes. Unpacking these jars into mapred.local.dir and then cleaning
> up after them has a significant cost (both in wall clock and in unnecessary
> heavy disk utilization). This cost can be easily avoided
--
This message is automatically generated by JIRA.