Hey, did you find any class or way to store the results of Job1's map/reduce in memory and use that as the input to Job2's map/reduce? I am facing a situation where I need to do a similar thing. If anyone can help me out...
Pankil

On Wed, Apr 8, 2009 at 12:51 AM, Sharad Agarwal <shara...@yahoo-inc.com> wrote:
> > I have confusion how would I start the next job after finishing the one,
> > could you just make it clear by some rough example.
> See the JobControl class to chain the jobs. You can specify dependencies as
> well. You can check out the TestJobControl class for example code.
>
> > Also do I need to use SequenceFileInputFormat to maintain the results in
> > memory and then access them?
>
> Not really. You have to use the corresponding reader to read the data. For
> example, if you have written it using TextOutputFormat (the default), you can
> then read it using TextInputFormat. The reader can be created in the reducer
> initialization code. In the new API (org.apache.hadoop.mapreduce.Reducer)
> this can be done in the "setup" method. There you can load the word,count
> mappings into a HashMap.
> In case you don't want to load all the data in memory, you can create the
> reader in "setup" and keep calling next (LineRecordReader#nextKeyValue()) in
> the reduce function while the reduce key is greater than the current key from
> the reader.
>
> - Sharad
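To make Sharad's first suggestion concrete: a minimal sketch of loading job1's output into a HashMap, as you would do in the reducer's setup() method. This is plain Java rather than Hadoop API code so it runs standalone; the class name WordCountLoader is made up for illustration, and TextOutputFormat's "key<TAB>value" line format is the only assumption. In a real job2 reducer you would open job1's output files from HDFS instead of reading from a String.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

public class WordCountLoader {

    // Parse TextOutputFormat-style lines ("word<TAB>count") into a HashMap.
    // In a real job this would run in the reducer's setup() method, reading
    // job1's output part files instead of an in-memory reader.
    public static Map<String, Long> load(BufferedReader reader) throws IOException {
        Map<String, Long> counts = new HashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            int tab = line.indexOf('\t');
            if (tab < 0) continue;  // skip malformed lines
            String word = line.substring(0, tab);
            long count = Long.parseLong(line.substring(tab + 1).trim());
            counts.put(word, count);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Simulated job1 output in TextOutputFormat's default key\tvalue layout.
        String job1Output = "apple\t3\nbanana\t7\n";
        Map<String, Long> counts = load(new BufferedReader(new StringReader(job1Output)));
        System.out.println(counts.get("apple") + " " + counts.get("banana"));
    }
}
```

Once loaded, job2's reduce function can look up each key in the map with a plain counts.get(key).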
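Sharad's second suggestion (for when job1's output is too big for memory) relies on both streams being sorted: reduce keys arrive in sorted order, and job1's reduce output is also sorted by key, so you can keep one cursor into the reader and advance it whenever the current reduce key is greater than the reader's key. Below is a standalone sketch of that merge-style lookup; the Reader class and its advance() method are stand-ins for a real LineRecordReader and its nextKeyValue(), and all names here are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

public class SortedMergeLookup {

    // Stand-in for a LineRecordReader over job1's sorted output: holds one
    // current entry and moves forward on demand, never loading everything.
    static class Reader {
        private final Iterator<Map.Entry<String, Long>> it;
        private Map.Entry<String, Long> current;

        Reader(Iterator<Map.Entry<String, Long>> it) {
            this.it = it;
            advance();
        }

        void advance() {  // plays the role of LineRecordReader#nextKeyValue()
            current = it.hasNext() ? it.next() : null;
        }

        // Return job1's count for reduceKey, or null if job1 had no such key.
        // Skips reader entries that sort before reduceKey; because both sides
        // are sorted, skipped entries can never match a later reduce key.
        Long lookup(String reduceKey) {
            while (current != null && current.getKey().compareTo(reduceKey) < 0) {
                advance();
            }
            if (current != null && current.getKey().equals(reduceKey)) {
                return current.getValue();
            }
            return null;
        }
    }

    public static void main(String[] args) {
        // Simulated job1 output; LinkedHashMap preserves the sorted order.
        Map<String, Long> job1Output = new LinkedHashMap<>();
        job1Output.put("apple", 3L);
        job1Output.put("cherry", 5L);
        Reader reader = new Reader(job1Output.entrySet().iterator());
        // Reduce keys also arrive in sorted order.
        for (String reduceKey : Arrays.asList("apple", "banana", "cherry")) {
            System.out.println(reduceKey + "=" + reader.lookup(reduceKey));
        }
    }
}
```

In a real job2 reducer, lookup() would be called once per reduce key inside reduce(), with the Reader created in setup().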