For complex workflows, Oozie (or Azkaban) is indeed the answer.
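
A rough sketch of what the fork/join could look like for the A/B/C case in the original question, assuming all three are plain map-reduce actions. The action names, the ${jobTracker} and ${nameNode} parameters, and the omitted per-job configuration are placeholders I made up for illustration, not anything from this thread:

```xml
<workflow-app name="abc-wf" xmlns="uri:oozie:workflow:0.2">
  <start to="fork-ab"/>

  <!-- Launch A and B in parallel. -->
  <fork name="fork-ab">
    <path start="job-a"/>
    <path start="job-b"/>
  </fork>

  <action name="job-a">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- Job A's <configuration> block (mapper, reducer, input/output dirs) goes here. -->
    </map-reduce>
    <ok to="join-ab"/>
    <error to="fail"/>
  </action>

  <action name="job-b">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- Job B's <configuration> block goes here. -->
    </map-reduce>
    <ok to="join-ab"/>
    <error to="fail"/>
  </action>

  <!-- C starts only after both A and B have finished successfully. -->
  <join name="join-ab" to="job-c"/>

  <action name="job-c">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <!-- Job C's <configuration> block goes here; its input would be the outputs of A and B. -->
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>

  <kill name="fail">
    <message>Job failed at [${wf:lastErrorNode()}]</message>
  </kill>

  <end name="end"/>
</workflow-app>
```

The join is what gives you the "wait until both have finished" behaviour, so the jobs don't need to coordinate with each other directly.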

-Bharath

On Wed, Feb 15, 2012 at 1:29 PM, Bharath Mundlapudi <[email protected]> wrote:

> Or you could use job chaining in MR.
> http://developer.yahoo.com/hadoop/tutorial/module4.html#chaining
>
> -Bharath
>
>
> On Wed, Feb 15, 2012 at 11:26 AM, John Armstrong <[email protected]> wrote:
>
>> Actually, I think this is what Oozie is for.  It seems to leap out as a
>> great example of a forked workflow.
>>
>> hth
>>
>>
>>
>> On 02/15/2012 02:23 PM, W.P. McNeill wrote:
>>
>>> Say I have two Hadoop jobs, A and B, that can be run in parallel. I have
>>> another job, C, that takes the output of both A and B as input. I want to
>>> run A and B at the same time, wait until both have finished, and then
>>> launch C. What is the best way to do this?
>>>
>>> I know the answer if I've got a single Java client program that launches
>>> A, B, and C. But what if I don't have the option to launch all of them
>>> from a single Java program? (Say I've got a much more complicated system
>>> with many steps happening between A-B and C.) How do I synchronize
>>> between jobs, make sure there are no race conditions, etc.? Is this what
>>> ZooKeeper is for?
>>>
>>
>>
>
