Provide a way to use 'uber' jars with Oozie MR actions
------------------------------------------------------

                 Key: OOZIE-654
                 URL: https://issues.apache.org/jira/browse/OOZIE-654
             Project: Oozie
          Issue Type: Improvement
            Reporter: Harsh J
            Assignee: Harsh J
            Priority: Minor


Right now, if you have custom MR code in a jar that carries its dependent jars 
in an internal {{lib/}} folder (a structure known as an 'uber' jar), and you 
submit the job via a regular 'hadoop jar' command, these lib/*.jars get picked 
up by the framework because the supplied jar is set explicitly via 
conf.setJarByClass or conf.setJar. That is, if the user's uber jar goes to the 
JT as the mapred.jar, the framework handles it properly and places all the 
lib/*.jars on the classpath.

Distributed cache jars do not get this treatment, because the MR framework 
does not consider them uber jars and so does not extract and use their 
internal lib/ directories.
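
To make the layout concrete, here is a small sketch that lists the nested jars 
an uber jar carries directly under lib/. It uses only java.util.jar; the class 
and method names are made up for illustration and are not Hadoop or Oozie code, 
and treating only direct children of lib/ as dependencies is an assumption.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, for illustration only.
final class UberJarInspector {

    /** Returns the names of jar entries that sit directly under lib/, e.g. "lib/dep.jar". */
    static List<String> nestedLibJars(String jarPath) {
        List<String> found = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                // Assumption: only direct children of lib/ count, not lib/sub/x.jar.
                boolean directChild = name.startsWith("lib/") && name.indexOf('/', 4) < 0;
                if (directChild && name.endsWith(".jar")) {
                    found.add(name);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }
}
```

A jar for which this list is non-empty is the kind of jar that needs to be 
submitted as the job jar for its dependencies to be seen.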

We should have a way in Oozie to let users optionally promote one of their 
jars to be treated as an uber jar.

Proposal: Have an optional oozie-prefixed config, or an optional element in the 
MR action XML, that lets the user specify which class should be loaded and 
passed to setJarByClass(...). This has to be a class at the top level of the 
uber jar, i.e. any class inside the targeted jar itself, just not one from a 
jar under lib/. We then call jobConf.setJarByClass(loadedCls) and run the job.
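
For reference, the lookup that setJarByClass relies on can be approximated with 
plain JDK calls: load the named class, find the URL it was loaded from, and 
take the jar path before the '!'. This is a rough sketch, assuming the class 
name arrives from a config value; UberJarResolver and its methods are 
hypothetical names, and the actual Hadoop code may differ in detail. In Oozie 
the last step would be jobConf.setJarByClass(loadedCls) rather than returning 
the path.

```java
import java.io.UnsupportedEncodingException;
import java.net.URL;
import java.net.URLDecoder;

// Hypothetical helper, for illustration only; not part of Hadoop or Oozie.
final class UberJarResolver {

    /** Extracts "/path/x.jar" from "jar:file:/path/x.jar!/entry"; null if not a jar URL. */
    static String jarPathFromUrl(String url) {
        if (url == null || !url.startsWith("jar:")) {
            return null; // class came from a directory or some other source
        }
        String path = url.substring("jar:".length());
        if (path.startsWith("file:")) {
            path = path.substring("file:".length());
        }
        int bang = path.indexOf('!');
        if (bang >= 0) {
            path = path.substring(0, bang); // drop the entry part after the jar
        }
        try {
            // URLDecoder turns '+' into a space, so protect literal '+' first.
            return URLDecoder.decode(path.replace("+", "%2B"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Loads the user-named class and returns the jar it lives in, or null. */
    static String findContainingJar(String className, ClassLoader loader) {
        Class<?> cls;
        try {
            cls = Class.forName(className, false, loader);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Class not found: " + className, e);
        }
        String resource = cls.getName().replace('.', '/') + ".class";
        ClassLoader owner = cls.getClassLoader();
        URL url = (owner != null)
                ? owner.getResource(resource)
                : ClassLoader.getSystemResource(resource);
        return url == null ? null : jarPathFromUrl(url.toString());
    }
}
```

The lookup only points at the uber jar when the named class is loaded from 
that jar itself, which is why the class must not come from a jar under lib/.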

Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
