Thank you, Robert. Actually, I solved the problem by creating a LazyHCatInputFormat (extending HCatInputFormat) and handling the setup within the getSplits(..) method call. This is working fine for me. However, your approach #2 sounds cleaner to me.
Also, for option #3, can you please advise how the classpath and shared libs will be propagated to the child mapper job (the actual MR code)? Oozie creates a map task to run the ToolRunner code, which then submits the job, but the configurations we set in the Oozie Java action are only applied to the ToolRunner.

Regards,
Prashanth

-----Original Message-----
From: Robert Kanter [mailto:[email protected]]
Sent: Friday, August 28, 2015 2:31 PM
To: [email protected]
Subject: Re: HCatInputFormat setup in Oozie MR action

Hi,

You have three options:

1) Ultimately, these methods simply set configuration properties on the job. You can look at the job.xml after running the job outside of Oozie to figure out which configs those methods set, and to what values.
2) Oozie actually has a feature where you can use Java code to configure the MR action by implementing an interface and putting your code there. See http://oozie.apache.org/docs/4.2.0/WorkflowFunctionalSpec.html#a3.2.2.2_Configuring_the_MapReduce_action_with_Java_code
3) Run your job from the Java action.

I recommend option 2.

- Robert

On Thu, Aug 27, 2015 at 6:18 AM, Sabbidi, Prashanth <[email protected]> wrote:
> Hi,
>
> Can someone please advise.
>
> In my Hadoop ToolRunner, I set it up like this, to read Hive tables in
> the mapper:
>
> HCatInputFormat.setInput(job, InputJobInfo.create("<myDatabase>",
>     "<myTable>", "<partitionFilter>"));
> LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
>
> But if I want to run my MR job from Oozie, how do I set up the above
> settings in the Oozie MR action?
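[Editor's note] The LazyHCatInputFormat workaround described at the top of the thread could look roughly like the sketch below. It defers the HCatInputFormat.setInput(..) call until getSplits(..), so the database, table, and filter can be passed as plain configuration properties from the Oozie action. The property names (`lazy.hcat.*`) are hypothetical, and this uses the newer Configuration-based setInput overload rather than the InputJobInfo variant from the quoted mail; check the overloads available in your HCatalog version.

```java
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hive.hcatalog.mapreduce.HCatInputFormat;

/**
 * Defers HCatInputFormat setup until split calculation, so the job can be
 * configured entirely through properties (e.g. from an Oozie MR action's
 * <configuration> block) instead of driver code.
 */
public class LazyHCatInputFormat extends HCatInputFormat {

  @Override
  public List<InputSplit> getSplits(JobContext context)
      throws IOException, InterruptedException {
    // Hypothetical property names; use whatever keys your workflow sets.
    String db = context.getConfiguration().get("lazy.hcat.database");
    String table = context.getConfiguration().get("lazy.hcat.table");
    String filter = context.getConfiguration().get("lazy.hcat.partition.filter");
    try {
      HCatInputFormat.setInput(context.getConfiguration(), db, table, filter);
    } catch (Exception e) {
      throw new IOException("Failed to initialize HCatInputFormat", e);
    }
    return super.getSplits(context);
  }
}
```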

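[Editor's note] Option 2 from Robert's reply, per the linked Oozie 4.2 spec section, means implementing `org.apache.oozie.action.hadoop.OozieActionConfigurator` and referencing the class from the `<map-reduce>` action with a `<config-class>` element. A minimal sketch, assuming the class name and the `<myDatabase>`/`<myTable>`/`<partitionFilter>` placeholders from the original mail; the jar containing this class must be on the action's classpath (e.g. in the workflow's `lib/` directory):

```java
import org.apache.hadoop.mapred.JobConf;
import org.apache.oozie.action.hadoop.OozieActionConfigurator;
import org.apache.oozie.action.hadoop.OozieActionConfiguratorException;
import org.apache.hive.hcatalog.mapreduce.HCatInputFormat;

/**
 * Runs on the Oozie launcher before the MR job is submitted; anything set
 * on actionConf becomes part of the launched job's configuration, which
 * answers the original question of where to put the setInput(..) call.
 */
public class HCatActionConfigurator implements OozieActionConfigurator {

  @Override
  public void configure(JobConf actionConf) throws OozieActionConfiguratorException {
    try {
      // Writes the serialized HCat input description into actionConf
      // (Configuration-based overload; the Job/InputJobInfo variant from
      // the quoted mail sets the same properties under the hood).
      HCatInputFormat.setInput(actionConf,
          "<myDatabase>", "<myTable>", "<partitionFilter>");
    } catch (Exception e) {
      throw new OozieActionConfiguratorException(
          "HCatInputFormat setup failed: " + e.getMessage());
    }
  }
}
```

The workflow action would then reference it with something like `<config-class>com.example.HCatActionConfigurator</config-class>` inside the `<map-reduce>` element (class name hypothetical; see the spec link above for the exact element placement).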