Or you could use job chaining in MapReduce:
http://developer.yahoo.com/hadoop/tutorial/module4.html#chaining
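
For illustration, here is a minimal sketch of the dependency shape being discussed (A and B in parallel, C only after both finish), written with plain java.util.concurrent so that it runs without a cluster. The class name ForkJoinJobs and the runJob helper are hypothetical; in a real client runJob would submit a job and block until it completes, for example by wrapping Job.waitForCompletion(true). Hadoop's own JobControl with ControlledJob.addDependingJob expresses the same dependencies, so treat this as a stand-in rather than the tutorial's code.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

class ForkJoinJobs {
    // Hypothetical stand-in for submitting one Hadoop job and blocking until
    // it finishes. Here it only records that the job completed.
    static void runJob(String name, List<String> finished) {
        finished.add(name);
    }

    // Runs A and B concurrently, then C once both have completed.
    // Returns the job names in the order they finished.
    static List<String> runPipeline() {
        List<String> finished = new CopyOnWriteArrayList<>();

        CompletableFuture<Void> a =
                CompletableFuture.runAsync(() -> runJob("A", finished));
        CompletableFuture<Void> b =
                CompletableFuture.runAsync(() -> runJob("B", finished));

        // allOf completes only after both A and B are done. If either one
        // fails, C is skipped and join() rethrows the failure.
        CompletableFuture.allOf(a, b)
                .thenRun(() -> runJob("C", finished))
                .join();

        return finished;
    }

    public static void main(String[] args) {
        System.out.println(runPipeline());
    }
}
```

A and B may finish in either order, but C is always last. Note that this only covers the case where one client process launches all three jobs.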

-Bharath

On Wed, Feb 15, 2012 at 11:26 AM, John Armstrong <[email protected]> wrote:

> Actually, I think this is what Oozie is for. It seems to leap out as a
> great example of a forked workflow.
>
> hth
>
> On 02/15/2012 02:23 PM, W.P. McNeill wrote:
>
>> Say I have two Hadoop jobs, A and B, that can be run in parallel. I have
>> another job, C, that takes the output of both A and B as input. I want to
>> run A and B at the same time, wait until both have finished, and then
>> launch C. What is the best way to do this?
>>
>> I know the answer if I've got a single Java client program that launches
>> A, B, and C. But what if I don't have the option to launch all of them
>> from a single Java program? (Say I've got a much more complicated system
>> with many steps happening between A-B and C.) How do I synchronize
>> between jobs, make sure there's no race conditions etc. Is this what
>> Zookeeper is for?
