Thank you Robert.

Actually, I solved the problem by creating a LazyHCatInputFormat (extending 
HCatInputFormat) and handling the setup within the getSplits(..) method call.
This is working fine for me. However, your approach #2 sounds cleaner to me.

Also, for option #3, can you please advise how the classpath and shared libs 
will be propagated to the child mapper job (the actual MR code)?
Oozie creates a map job to run the ToolRunner code, which submits the job, 
but the configurations we set in the Oozie Java action are only applicable to 
ToolRunner.

Regards
Prashanth

-----Original Message-----
From: Robert Kanter [mailto:[email protected]] 
Sent: Friday, August 28, 2015 2:31 PM
To: [email protected]
Subject: Re: HCatInputFormat setup in Oozie MR action

Hi,

You have three options:

1) Ultimately, these methods simply set configuration properties on the job.  
You can look at the job.xml after running the job outside of Oozie to figure 
out what configs those methods set and what values.

2) Oozie actually has a feature where you can use Java code to configure the MR 
action by implementing an interface and putting your code there.
See
http://oozie.apache.org/docs/4.2.0/WorkflowFunctionalSpec.html#a3.2.2.2_Configuring_the_MapReduce_action_with_Java_code

3) Run your job from the Java action.

I recommend option 2.
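
For option 1, a small helper can pull the relevant entries out of a saved 
job.xml. The following is a minimal sketch, not part of Oozie or HCatalog: the 
class name JobXmlProperties and the idea of filtering on a name fragment such 
as "hcat" are assumptions made for illustration. It relies only on the standard 
Hadoop configuration layout (<configuration> containing <property> elements 
with <name> and <value> children) and on the JDK's XML parser.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: lists the properties in a job.xml whose name contains a fragment.
final class JobXmlProperties {

    /** Returns name -> value (sorted by name) for every property whose name contains nameFragment. */
    static Map<String, String> matching(InputStream jobXml, String nameFragment) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(jobXml);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse job.xml", e);
        }
        Map<String, String> result = new TreeMap<>();
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element property = (Element) props.item(i);
            String name = childText(property, "name");
            if (name != null && name.contains(nameFragment)) {
                String value = childText(property, "value");
                result.put(name, value == null ? "" : value);
            }
        }
        return result;
    }

    /** Text of the first child element with the given tag, or null if there is none. */
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    // Usage: java JobXmlProperties.java <path-to-job.xml> <name-fragment>
    public static void main(String[] args) throws Exception {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            matching(in, args[1]).forEach((k, v) -> System.out.println(k + " = " + v));
        }
    }
}
```

The printed name/value pairs are candidates for the <configuration> block of 
the MR action. The fragment to search for is a guess; comparing against the 
job.xml of a run without the HCatInputFormat setup would be a more reliable 
way to see exactly which properties were added.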

- Robert


On Thu, Aug 27, 2015 at 6:18 AM, Sabbidi, Prashanth < 
[email protected]> wrote:

> Hi,
>
> Can someone please advise?
>
> In my Hadoop ToolRunner, I set up the job like this to read Hive tables in 
> the mapper.
>
> HCatInputFormat.setInput(job, InputJobInfo.create("<myDatabase>",
> "<myTable>", "<partitionFilter>"));
> LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
>
> But if I want to run this MR job from an Oozie MR action, how do I apply 
> the above settings there?
>
