Hi, I don't have the code in front of me at the moment, but I'll
write some of it from memory and post a real snippet tomorrow
night. Hopefully this can get you started.

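import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ToolRunner;

public class MyMainClass {
        public static void main(String[] args) throws Exception {
                // ToolRunner.run returns whatever exit code your run() method returns.
                int exitCode = ToolRunner.run(new Configuration(), new ClassThatImplementsTool(), args);
                // Make sure you see the API for other trickiness you can do.
                System.exit(exitCode);
        }
}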

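import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;

// Extending Configured supplies the getConf()/setConf() methods that Tool requires.
public class ClassThatImplementsTool extends Configured implements Tool {
        public int run(String[] args) throws Exception {
                // This method gets called by ToolRunner.run.
                // Do all sorts of configuration here,
                // e.g., set your Map, Combine, and Reduce classes.
                // Look at the Configuration class API.
                return 0; // 0 signals success to the caller
        }
}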

The main thing to know is that ToolRunner.run() will call your
class's run() method.
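
Since the thread is about chaining, here is a rough sketch of the control
flow I have in mind, written without any Hadoop dependency so it runs on
its own. Everything in it is a hypothetical stand-in: StageTool only mimics
the shape of Tool.run(String[]), runTool only mimics ToolRunner.run handing
the arguments to run(), and the "stage-N-out" path names are made up. The
point is just that each stage reads the previous stage's output and the
loop stops at the first non-zero exit code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins, NOT the Hadoop classes: StageTool mimics the shape of
// Tool.run(String[]), and runTool mimics ToolRunner.run calling back into it.
final class ChainSketch {

    interface StageTool {
        int run(String[] args) throws Exception; // 0 means success
    }

    // Stand-in for ToolRunner.run: simply hands the arguments to the tool's run().
    static int runTool(StageTool tool, String[] args) throws Exception {
        return tool.run(args);
    }

    // Runs `stages` jobs in sequence; each stage reads the previous stage's output.
    // Returns 0 if every stage succeeded, otherwise the first non-zero exit code.
    static int runChain(StageTool tool, String firstInput, int stages, List<String> log)
            throws Exception {
        String input = firstInput;
        for (int i = 0; i < stages; i++) {
            String output = "stage-" + i + "-out"; // made-up path name
            int exitCode = runTool(tool, new String[] {input, output});
            if (exitCode != 0) {
                return exitCode; // stop the chain; later stages would read missing data
            }
            log.add(input + " -> " + output);
            input = output; // the next stage consumes what this one produced
        }
        return 0;
    }

    public static void main(String[] args) throws Exception {
        List<String> log = new ArrayList<>();
        // In real code, run() would configure and submit one MapReduce job.
        StageTool tool = stageArgs -> 0;
        int exitCode = runChain(tool, "raw-input", 3, log);
        log.forEach(System.out::println);
        System.exit(exitCode);
    }
}
```

In a real job, the body of run() would build the job configuration from
args[0] and args[1] and submit it; checking the exit code before moving on
keeps a later stage from starting against output that was never written.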

Joman Chu
AIM: ARcanUSNUMquam
IRC: irc.liquid-silver.net


On Mon, Jul 14, 2008 at 4:38 PM, Sean Arietta <[EMAIL PROTECTED]> wrote:
>
> Could you please provide some small code snippets elaborating on how you
> implemented that? I have a similar need as the author of this thread and I
> would appreciate any help. Thanks!
>
> Cheers,
> Sean
>
>
> Joman Chu-2 wrote:
>>
>> Hi, I use ToolRunner.run() for multiple MapReduce jobs. It seems to work
>> well. I've run sequences involving hundreds of MapReduce jobs in a for
>> loop and it hasn't died on me yet.
>>
>> On Wed, July 9, 2008 4:28 pm, Mori Bellamy said:
>>> Hey all, I'm trying to chain multiple mapreduce jobs together to
>>> accomplish a complex task. I believe that the way to do it is as follows:
>>>
>>> JobConf conf = new JobConf(getConf(), MyClass.class);
>>> //configure job.... set mappers, reducers, etc
>>> SequenceFileOutputFormat.setOutputPath(conf,myPath1);
>>> JobClient.runJob(conf);
>>>
>>> //new job
>>> JobConf conf2 = new JobConf(getConf(),MyClass.class)
>>> SequenceFileInputFormat.setInputPath(conf,myPath1);
>>> //more configuration...
>>> JobClient.runJob(conf2)
>>>
>>> Is this the canonical way to chain jobs? I'm having some trouble with
>>> this
>>> method -- for especially long jobs, the latter MR tasks sometimes do not
>>> start up.
>>>
>>>
>>
>>
>> --
>> Joman Chu
>> AIM: ARcanUSNUMquam
>> IRC: irc.liquid-silver.net
>>
>>
>>
>
> --
> View this message in context: 
> http://www.nabble.com/How-to-chain-multiple-hadoop-jobs--tp18370089p18452309.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>
>
