[
https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538245
]
Dennis Kubes commented on HADOOP-1622:
--------------------------------------
1. Could you please remove the mention of 'final' and 'default' config
resources from the javadoc for JobConf.{get|set}JobResources? They are no
longer relevant vis-a-vis hadoop Configuration.
I have removed the mention of final and default resources.
2. Should we also have a JobConf.setJobResource along with
JobConf.addJobResource, a la the {{DistributedCache}} APIs?
I debated set vs. add for resources. Currently, adding a resource appends it
to a list of resources, whereas setting a resource would clear anything
previously added and leave only that resource. Since jar resources are often
added by including the jar file that contains a given class, I thought it
better NOT to allow clearing and resetting of job resources.
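
To illustrate the add vs. set distinction described above, here is a minimal sketch. JobResourceList and its methods are hypothetical stand-ins written for this example, not the code in the patch; setJobResource is included only to show the clearing behavior that was deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the job resource list; not the actual JobConf code.
class JobResourceList {
    private final List<String> resources = new ArrayList<>();

    // Append semantics: each call adds to whatever was registered before.
    void addJobResource(String resource) {
        resources.add(resource);
    }

    // Set semantics, shown for contrast only: earlier entries are discarded.
    void setJobResource(String resource) {
        resources.clear();
        resources.add(resource);
    }

    // Returns a copy so callers cannot modify the internal list.
    List<String> getJobResources() {
        return new ArrayList<>(resources);
    }
}
```

With set semantics, a jar registered earlier (for example, the one located from a class) would be silently dropped by a later call, which is the situation the add-only API avoids.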
3. Should we move the private JobClient.createJobJar method to JarUtils to make
it available as a useful utility?
I debated this too. JarUtils was meant to hold generic jarring and unjarring
utilities, but I don't see harm in putting createJobJar there, and I think you
are right that we may need it somewhere else in the future. I have removed it
from JobClient and added it to JarUtils.
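
For readers unfamiliar with what such a utility does, here is a rough sketch of packing dependent files into a single jar under a lib/ directory, using only java.util.jar. The class name JarUtilsSketch and the createJobJar signature below are assumptions made for illustration; the actual method in the patch may take different arguments and behave differently.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical sketch; not the JarUtils class from the patch.
class JarUtilsSketch {

    // Writes each given file into jarFile as an entry under lib/.
    // Assumes file names are unique; a duplicate name raises a ZipException.
    static void createJobJar(Path jarFile, List<Path> files) throws IOException {
        try (OutputStream out = Files.newOutputStream(jarFile);
             JarOutputStream jar = new JarOutputStream(out)) {
            for (Path file : files) {
                String entryName = "lib/" + file.getFileName().toString();
                jar.putNextEntry(new JarEntry(entryName));
                Files.copy(file, jar); // stream the file contents into the entry
                jar.closeEntry();
            }
        }
    }
}
```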
Unrelated: Does it make sense to rename Configuration.addResource to
Configuration.addConfigResource? I wonder how confusing these unrelated api
names are, given JobConf is a Configuration too.
Yeah, I debated this one too. In the end we weren't just adding jars but
multiple things such as classes, executables, and files, and I couldn't find a
better name for that than resource. I named it jobResource to be a little less
confusing. Changing Configuration over to configResource would be good, I
think, although we should probably deprecate the existing method because a lot
of things rely on it.
I am currently testing patch 9 and will post it shortly.
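
A deprecation along those lines could look roughly like the sketch below: the old name stays and delegates to the new one, so existing callers keep working. ConfigurationSketch is a hypothetical class for illustration, and the String parameter is an assumption; this is not the real Configuration code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a deprecate-and-delegate rename; not Hadoop's Configuration.
class ConfigurationSketch {
    private final List<String> configResources = new ArrayList<>();

    // New, more specific name.
    void addConfigResource(String name) {
        configResources.add(name);
    }

    /**
     * Old name, kept so existing callers continue to compile and run.
     *
     * @deprecated use {@link #addConfigResource(String)} instead
     */
    @Deprecated
    void addResource(String name) {
        addConfigResource(name);
    }

    // Returns a copy so callers cannot modify the internal list.
    List<String> getConfigResources() {
        return new ArrayList<>(configResources);
    }
}
```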
> Hadoop should provide a way to allow the user to specify jar file(s) the user
> job depends on
> --------------------------------------------------------------------------------------------
>
> Key: HADOOP-1622
> URL: https://issues.apache.org/jira/browse/HADOOP-1622
> Project: Hadoop
> Issue Type: Improvement
> Reporter: Runping Qi
> Assignee: Dennis Kubes
> Fix For: 0.16.0
>
> Attachments: hadoop-1622-4-20071008.patch, HADOOP-1622-5.patch,
> HADOOP-1622-6.patch, HADOOP-1622-7.patch, HADOOP-1622-8.patch,
> multipleJobJars.patch, multipleJobResources.patch, multipleJobResources2.patch
>
>
> More likely than not, a user's job may depend on multiple jars.
> Right now, when submitting a job through bin/hadoop, there is no way for the
> user to specify that.
> A workaround for that is to re-package all the dependent jars into a new jar
> or put the dependent jar files in the lib dir of the new jar.
> This workaround causes unnecessary inconvenience to the user. Furthermore,
> if the user does not own the main function
> (like the case when the user uses Aggregate, or datajoin, streaming), the
> user has to re-package those system jar files too.
> It is much desired that hadoop provides a clean and simple way for the user
> to specify a list of dependent jar files at the time
> of job submission. Something like:
> bin/hadoop .... --depending_jars j1.jar:j2.jar