Hey Sean,

I later learned that the method I originally posted (configuring separate JobConfs and then running them, blocking style, with JobClient.runJob(conf)) was sufficient for my needs. The earlier failures turned out to be my own fault, and those bugs have since been fixed x_X. Lukas gave me a helpful reply pointing me to TestJobControl.java (in the Hadoop source tree). JobControl looks like the way to go if your job dependencies are complex, but in my case each job depends only on the one immediately before it, so the code I originally posted works fine.
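For anyone who does need the complex-dependency route, here's roughly what the JobControl version looks like -- an untested sketch against the org.apache.hadoop.mapred.jobcontrol API, where conf1 and conf2 stand in for two fully configured JobConfs:

    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.jobcontrol.Job;
    import org.apache.hadoop.mapred.jobcontrol.JobControl;

    // sketch: run two configured JobConfs, the second depending on the first
    public static void runChain(JobConf conf1, JobConf conf2) throws Exception {
        Job step1 = new Job(conf1);
        Job step2 = new Job(conf2);
        step2.addDependingJob(step1);   // step2 won't start until step1 succeeds

        JobControl control = new JobControl("chain");
        control.addJob(step1);
        control.addJob(step2);

        // JobControl is a Runnable; drive it from a thread and poll for completion
        Thread runner = new Thread(control);
        runner.start();
        while (!control.allFinished()) {
            Thread.sleep(500);
        }
        control.stop();
    }

The appeal is that JobControl only launches a job once the jobs it depends on have succeeded, so arbitrary dependency graphs work, not just straight chains.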
On Jul 14, 2008, at 1:38 PM, Sean Arietta wrote:
Could you please provide some small code snippets elaborating on how you implemented that? I have a similar need as the author of this thread and I would appreciate any help. Thanks!
Cheers,
Sean
Joman Chu-2 wrote:
Hi, I use ToolRunner.run() for multiple MapReduce jobs. It seems to work well. I've run sequences involving hundreds of MapReduce jobs in a for loop and it hasn't died on me yet.
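The shape of that is roughly the following -- an untested sketch, where ChainDriver is a made-up name and each pass still needs its own mapper, reducer, and paths set:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class ChainDriver extends Configured implements Tool {
        public int run(String[] args) throws Exception {
            // each iteration configures and runs one job; runJob() blocks
            // until that job finishes, so the jobs execute strictly in order
            for (int i = 0; i < 100; i++) {
                JobConf conf = new JobConf(getConf(), ChainDriver.class);
                // set the mapper, reducer, and input/output paths for pass i here
                JobClient.runJob(conf);
            }
            return 0;
        }

        public static void main(String[] args) throws Exception {
            System.exit(ToolRunner.run(new Configuration(), new ChainDriver(), args));
        }
    }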
On Wed, July 9, 2008 4:28 pm, Mori Bellamy said:
Hey all, I'm trying to chain multiple MapReduce jobs together to accomplish a complex task. I believe that the way to do it is as follows:

    JobConf conf = new JobConf(getConf(), MyClass.class);
    // configure job: set mappers, reducers, etc.
    SequenceFileOutputFormat.setOutputPath(conf, myPath1);
    JobClient.runJob(conf);

    // new job
    JobConf conf2 = new JobConf(getConf(), MyClass.class);
    SequenceFileInputFormat.setInputPaths(conf2, myPath1);
    // more configuration...
    JobClient.runJob(conf2);

Is this the canonical way to chain jobs? I'm having some trouble with this method -- for especially long jobs, the latter MR tasks sometimes do not start up.
--
Joman Chu
AIM: ARcanUSNUMquam
IRC: irc.liquid-silver.net