Hey Sean,

I later learned that the method I originally posted (configuring separate JobConfs and then running them, blocking style, with JobClient.runJob(conf)) was sufficient for my needs. The earlier failures turned out to be my own fault, and the bugs have since been fixed x_X.

Lukas gave me a helpful reply pointing me to TestJobControl.java (in the Hadoop source tree). JobControl looks like the right tool when your job dependencies are complex, but since each of my jobs depends only on the one immediately before it, the code I originally posted works fine.
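In case it's useful to anyone, here's roughly what the JobControl approach looks like. This is only a minimal sketch based on the org.apache.hadoop.mapred.jobcontrol API; conf1 and conf2 stand in for JobConfs you've already configured elsewhere:

import java.io.IOException;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.jobcontrol.Job;
import org.apache.hadoop.mapred.jobcontrol.JobControl;

// Sketch: chain two jobs so step2 is submitted only after step1 succeeds.
// conf1 and conf2 are assumed to be fully configured JobConfs.
public static void runChain(JobConf conf1, JobConf conf2)
    throws IOException, InterruptedException {
  Job step1 = new Job(conf1);
  Job step2 = new Job(conf2);
  step2.addDependingJob(step1); // step2 is held back until step1 completes

  JobControl control = new JobControl("chain");
  control.addJob(step1);
  control.addJob(step2);

  // JobControl implements Runnable, so run it in its own thread and poll.
  Thread controller = new Thread(control);
  controller.start();
  while (!control.allFinished()) {
    Thread.sleep(1000);
  }
  control.stop();
}

JobControl really pays off when the dependency graph branches (say, two jobs feeding a third); for a straight one-after-another pipeline like mine, runJob() in sequence is simpler.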
On Jul 14, 2008, at 1:38 PM, Sean Arietta wrote:


Could you please provide some small code snippets elaborating on how you implemented that? I have a similar need to the author of this thread and would appreciate any help. Thanks!

Cheers,
Sean


Joman Chu wrote:

Hi, I use ToolRunner.run() for multiple MapReduce jobs. It seems to work well. I've run sequences involving hundreds of MapReduce jobs in a for loop and it hasn't died on me yet.
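Roughly, the pattern I use looks like this (a sketch only; the ChainDriver name, the pass count, and the path layout are made up for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Sketch of running many MapReduce jobs back to back, each pass reading
// the previous pass's output. ChainDriver is a hypothetical driver class.
public class ChainDriver extends Configured implements Tool {
  public int run(String[] args) throws Exception {
    Path input = new Path(args[0]);
    for (int i = 0; i < 100; i++) {
      Path output = new Path(args[1] + "/pass-" + i);
      JobConf conf = new JobConf(getConf(), ChainDriver.class);
      conf.setJobName("pass-" + i);
      FileInputFormat.setInputPaths(conf, input);
      FileOutputFormat.setOutputPath(conf, output);
      // ...set mapper, reducer, and key/value classes here...
      JobClient.runJob(conf); // blocks until this pass finishes
      input = output;         // next pass consumes this pass's output
    }
    return 0;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new Configuration(), new ChainDriver(), args));
  }
}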

On Wed, July 9, 2008 4:28 pm, Mori Bellamy said:
Hey all, I'm trying to chain multiple MapReduce jobs together to accomplish a complex task. I believe the way to do it is as follows:

JobConf conf = new JobConf(getConf(), MyClass.class);
// configure job: set mappers, reducers, etc.
SequenceFileOutputFormat.setOutputPath(conf, myPath1);
JobClient.runJob(conf);

// new job
JobConf conf2 = new JobConf(getConf(), MyClass.class);
SequenceFileInputFormat.setInputPaths(conf2, myPath1); // note: conf2, not conf
// more configuration...
JobClient.runJob(conf2);

Is this the canonical way to chain jobs? I'm having some trouble with this method -- for especially long jobs, the latter MR tasks sometimes do not start up.




--
Joman Chu
AIM: ARcanUSNUMquam
IRC: irc.liquid-silver.net





