Re: chaining jobs

Amareshwari Sri Ramadasu Sun, 13 Dec 2009 21:22:01 -0800

You can use the JobControl utility for this.
More info at:
http://hadoop.apache.org/common/docs/r0.20.0/api/org/apache/hadoop/mapred/jobcontrol/JobControl.html


Thanks
Amareshwari

On 12/12/09 12:14 AM, "Mike Kendall" <[email protected]> wrote:

make a runner that has a bunch of hadoop jobs in one bash file...
that'll work for a little while.

if you find yourself doing multistep jobs all the time, you're
probably going to want to write a library or framework for this
kind of stuff.  even replacing the repeated -file map.py -map map.py
with a single -m map.py will save you a lot of headaches...
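
A minimal sketch of such a runner, written in Python rather than bash so
the command construction can be checked without a cluster. It assumes the
`hadoop` launcher is on the PATH and that each stage is a separate Pipes
executable. The names plan_chain, run_chain, and build_pipes_command, the
stageN directory layout, and the example paths are all hypothetical. Each
stage gets a fresh -input/-output pair, which is one way to change the
input and output directories between successive jobs. Any extra job
configuration a real Pipes job needs is left out.

```python
import subprocess
from typing import List, Sequence


def build_pipes_command(program: str, input_dir: str, output_dir: str) -> List[str]:
    """Build one `hadoop pipes` invocation for a single map/reduce stage."""
    return [
        "hadoop", "pipes",
        "-input", input_dir,
        "-output", output_dir,
        "-program", program,
    ]


def plan_chain(programs: Sequence[str], first_input: str, work_dir: str) -> List[List[str]]:
    """Return one command per stage; stage N reads what stage N-1 wrote."""
    commands = []
    current_input = first_input
    for index, program in enumerate(programs):
        # Hypothetical layout: every stage writes to its own directory.
        output_dir = f"{work_dir}/stage{index}"
        commands.append(build_pipes_command(program, current_input, output_dir))
        current_input = output_dir
    return commands


def run_chain(commands: Sequence[Sequence[str]], dry_run: bool = True) -> None:
    """Run the stages in order, or only print them when dry_run is set."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises on a non-zero exit, so a failed stage
            # stops the chain before later stages read missing output.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical program and directory names, printed without running.
    run_chain(plan_chain(["bin/step1", "bin/step2"], "in/data", "work"))
```

With dry_run left at its default, the script only prints the two commands,
which makes it easy to inspect the chain before pointing it at a real
cluster.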

On Fri, Dec 11, 2009 at 10:25 AM, Prakhar Sharma
<[email protected]> wrote:
> Hi all,
> I am using Hadoop's Pipes API in C++ code. I need to make successive
> runTask() calls, i.e., I need to chain Map -> Reduce -> Map
> -> Reduce.
> Between two successive invocations I need to set new values for
> some of the jobconf parameters, such as mapred.input.dir and
> mapred.output.dir. Can anyone share ideas or first-hand experience
> on how to do this using the Pipes interface?
>
> Any pointers or advice would be highly appreciated.
>
> Thanks,
> Prakhar
>
