Sandhya E Tue, 17 Jul 2007 22:06:14 -0700

Hi

I have two MapReduce jobs that run sequentially to accomplish a task. I first
ran them locally on a single machine.

The first job produces a set of keys, which its reduce stored in memory in a
Set instead of emitting them with output.collect. The second job, working on
different input files, looked up the keys in that Set to decide how to act on
each input line.

Now I want to run both jobs on a small cluster, where in-memory storage will
not work. How can the second job's map tasks, running on various machines,
load all the keys from the first job before they start working on the input
files? Any ideas?
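
For illustration, the single-machine setup described above might look roughly like the sketch below. It uses plain Java with no Hadoop API; the class name, the method names, and the assumption that each input line begins with a tab-separated key are all hypothetical and are not taken from the actual jobs.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the two sequential jobs (no Hadoop classes used).
class InMemoryKeyHandoff {
    // Visible only inside one JVM, which is why this approach
    // cannot carry over to map tasks running on other machines.
    static final Set<String> KEYS = new HashSet<>();

    // Stands in for the first job's reduce: the key is kept in memory
    // instead of being emitted through output.collect.
    static void firstReduce(String key) {
        KEYS.add(key);
    }

    // Stands in for the second job's map. Assumption: the key is the first
    // tab-separated field, and "acting on" a line means keeping it.
    static List<String> secondMap(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            String key = line.split("\t", 2)[0];
            if (KEYS.contains(key)) {
                kept.add(line);
            }
        }
        return kept;
    }
}
```

When both steps run in the same JVM, secondMap sees every key added by firstReduce. On a cluster each map task runs in its own JVM and starts with an empty Set, which is the problem I am asking about.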


Many Thanks
Sandhya
