Hey, did you find any class or way to store the results of Job1's map/reduce in memory and use that as the input to Job2's map/reduce? I am facing a situation where I need to do a similar thing. If anyone can help me out...
Pankil

On Wed, Apr 8, 2009 at 12:51 AM, Sharad Agarwal <shara...@yahoo-inc.com> wrote:
> > I have confusion how would I start the next job after finishing the one,
> > could you just make it clear by some rough example.
> See the JobControl class to chain the jobs. You can specify dependencies as
> well. You can check out the TestJobControl class for example code.
>
> > Also do I need to use SequenceFileInputFormat to maintain the results in
> > memory and then access them?
>
> Not really. You have to use the corresponding reader to read the data. For
> example, if you have written it using TextOutputFormat (the default), you can
> then read it using TextInputFormat. The reader can be created in the reducer
> initialization code. In the new API (org.apache.hadoop.mapreduce.Reducer)
> this can be done in the "setup" method. There you can load the word,count
> mappings into a HashMap.
> In case you don't want to load all the data in memory, you can create the
> reader in "setup" and keep calling next (LineRecordReader#nextKeyValue()) in
> the reduce function while the reduce key is greater than the current key from
> the reader.
>
> - Sharad
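To make Sharad's first suggestion concrete: a minimal sketch of loading job1's output into a HashMap, as you would do in the reducer's setup() method. This is plain Java rather than Hadoop API code so it runs standalone; the class name WordCountLoader is made up for illustration, and TextOutputFormat's "key<TAB>value" line format is the only assumption. In a real job2 reducer you would open job1's output files from HDFS instead of reading from a String.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

public class WordCountLoader {

    // Parse TextOutputFormat-style lines ("word<TAB>count") into a HashMap.
    // In a real job this would run in the reducer's setup() method, reading
    // job1's output part files instead of an in-memory reader.
    public static Map<String, Long> load(BufferedReader reader) throws IOException {
        Map<String, Long> counts = new HashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            int tab = line.indexOf('\t');
            if (tab < 0) continue;  // skip malformed lines
            String word = line.substring(0, tab);
            long count = Long.parseLong(line.substring(tab + 1).trim());
            counts.put(word, count);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Simulated job1 output in TextOutputFormat's default key\tvalue layout.
        String job1Output = "apple\t3\nbanana\t7\n";
        Map<String, Long> counts = load(new BufferedReader(new StringReader(job1Output)));
        System.out.println(counts.get("apple") + " " + counts.get("banana"));
    }
}
```

Once loaded, job2's reduce function can look up each key in the map with a plain counts.get(key).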
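Sharad's second suggestion (for when job1's output is too big for memory) relies on both streams being sorted: reduce keys arrive in sorted order, and job1's reduce output is also sorted by key, so you can keep one cursor into the reader and advance it whenever the current reduce key is greater than the reader's key. Below is a standalone sketch of that merge-style lookup; the Reader class and its advance() method are stand-ins for a real LineRecordReader and its nextKeyValue(), and all names here are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

public class SortedMergeLookup {

    // Stand-in for a LineRecordReader over job1's sorted output: holds one
    // current entry and moves forward on demand, never loading everything.
    static class Reader {
        private final Iterator<Map.Entry<String, Long>> it;
        private Map.Entry<String, Long> current;

        Reader(Iterator<Map.Entry<String, Long>> it) {
            this.it = it;
            advance();
        }

        void advance() {  // plays the role of LineRecordReader#nextKeyValue()
            current = it.hasNext() ? it.next() : null;
        }

        // Return job1's count for reduceKey, or null if job1 had no such key.
        // Skips reader entries that sort before reduceKey; because both sides
        // are sorted, skipped entries can never match a later reduce key.
        Long lookup(String reduceKey) {
            while (current != null && current.getKey().compareTo(reduceKey) < 0) {
                advance();
            }
            if (current != null && current.getKey().equals(reduceKey)) {
                return current.getValue();
            }
            return null;
        }
    }

    public static void main(String[] args) {
        // Simulated job1 output; LinkedHashMap preserves the sorted order.
        Map<String, Long> job1Output = new LinkedHashMap<>();
        job1Output.put("apple", 3L);
        job1Output.put("cherry", 5L);
        Reader reader = new Reader(job1Output.entrySet().iterator());
        // Reduce keys also arrive in sorted order.
        for (String reduceKey : Arrays.asList("apple", "banana", "cherry")) {
            System.out.println(reduceKey + "=" + reader.lookup(reduceKey));
        }
    }
}
```

In a real job2 reducer, lookup() would be called once per reduce key inside reduce(), with the Reader created in setup().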