[ https://issues.apache.org/jira/browse/MAPREDUCE-967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon updated MAPREDUCE-967:
----------------------------------
Attachment: mapreduce-967.txt
bq. we should definitely document clearly the points you've mentioned above
w.r.t the classpath.
You're totally right, and I actually did this and forgot to upload the patch!
My bad. Here's a new one.
bq. makes this JIRA issue an incompatible change
Yes, this is technically incompatible. But I think it's not a problem for the
following reasons:
- Since job.jar is itself added to the classpath, the standard classloader will
pick up anything inside job.jar just as if the jar had been expanded and the
resulting directory put on the classpath.
- The only other people this should break are those using java.io (or other
non-classpath access methods) to read files unpacked from the jar. The new
configuration parameter is a suitable workaround for them (as demonstrated by
Streaming). In this case, what's on the classpath doesn't matter, since they
aren't using a ClassLoader anyhow.
- Non-Java applications are the only ones to which the two points above don't
apply, but they have no concept of a classpath, so this shouldn't be a problem
for them either.
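To make the first two points concrete, here is a minimal sketch. It is not code
from the patch: the class JobJarLookup, its method names, and the entry name
conf/lookup.txt are all hypothetical, and a temp file stands in for job.jar. It
shows a resource being read through a class loader whose only classpath element
is the jar itself, while a plain java.io lookup in the working directory finds
nothing because nothing was unpacked there.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical demo class; not part of Hadoop or of the attached patch.
class JobJarLookup {

    // Writes a one-entry jar to a temp file. It stands in for job.jar.
    static File buildJar(String entryName, String contents) throws IOException {
        File jar = File.createTempFile("job", ".jar");
        jar.deleteOnExit();
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new JarEntry(entryName));
            out.write(contents.getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return jar;
    }

    // Classpath-style access: the jar itself is the only classpath element.
    // Returns null if the entry is not found.
    static String readViaClassLoader(File jar, String entryName) throws IOException {
        URL[] classpath = { jar.toURI().toURL() };
        try (URLClassLoader loader = new URLClassLoader(classpath, null);
             InputStream in = loader.getResourceAsStream(entryName)) {
            if (in == null) {
                return null;
            }
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // java.io-style access: succeeds only if the entry was unpacked to disk.
    static boolean existsAsPlainFile(File workDir, String entryName) {
        return new File(workDir, entryName).isFile();
    }

    public static void main(String[] args) throws IOException {
        File jar = buildJar("conf/lookup.txt", "hello from job.jar");
        File workDir = Files.createTempDirectory("taskwork").toFile();
        workDir.deleteOnExit();
        // Found via the class loader even though the jar was never unpacked.
        System.out.println(readViaClassLoader(jar, "conf/lookup.txt"));
        // Not found via java.io, since workDir holds no unpacked copy.
        System.out.println(existsAsPlainFile(workDir, "conf/lookup.txt"));
    }
}
```

Code that behaves like readViaClassLoader should be unaffected by the change;
code that behaves like existsAsPlainFile is the case the new configuration
parameter is meant to cover.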
Philosophically, isn't pre-1.0 exactly when we should be making these minor
incompatible changes for the sake of code cleanliness? Compared to the other
drastic changes we're putting in 0.22, this is hardly a showstopper. I don't
see anything *against* the change you're requesting, except that I think we
should do everything in our power now to clean up the code before we call it
Hadoop 1.0. If I'm the only one with this philosophy, I'll acquiesce, but I
think the sloppy classpath is just as likely to come back to bite us as fixing
it is.
> TaskTracker does not need to fully unjar job jars
> -------------------------------------------------
>
> Key: MAPREDUCE-967
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-967
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tasktracker
> Affects Versions: 0.21.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: mapreduce-967-branch-0.20.txt, mapreduce-967.txt,
> mapreduce-967.txt, mapreduce-967.txt, mapreduce-967.txt
>
>
> In practice we have seen some users submitting job jars that consist of
> 10,000+ classes. Unpacking these jars into mapred.local.dir and then cleaning
> up after them has a significant cost (both in wall clock and in unnecessary
> heavy disk utilization). This cost can be easily avoided