What objects are you referring to? I'm not sure I understand your question.

- Aaron

On Tue, May 11, 2010 at 6:38 AM, Renato Marroquín Mogrovejo <[email protected]> wrote:

> Thanks Aaron! I was thinking the same after doing some reading.
> Man, what about serializing the objects? Do you think that is a good idea?
> Thanks again.
>
> Renato M.
>
> 2010/5/5 Aaron Kimball <[email protected]>
>
> > Renato,
> >
> > In general, if you need to perform a multi-pass MapReduce workflow, each
> > pass materializes its output to files. The subsequent pass then reads
> > those same files back in as input. This allows the workflow to restart
> > at the last "checkpoint" if it gets interrupted. There is no persistent
> > in-memory distributed storage feature in Hadoop that would allow a
> > MapReduce job to post results to memory for consumption by a subsequent
> > job.
> >
> > So you would just read your initial data from /input, and write your
> > interim results to /iteration0. Then the next pass reads from
> > /iteration0 and writes to /iteration1, etc.
> >
> > If your data is reasonably small and you think it could fit in memory
> > somewhere, then you could experiment with using other distributed
> > key-value stores (memcached, hbase, cassandra, etc.) to hold
> > intermediate results. But this will require some integration work on
> > your part.
> >
> > - Aaron
> >
> > On Wed, May 5, 2010 at 8:29 AM, Renato Marroquín Mogrovejo
> > <[email protected]> wrote:
> >
> > > Hi everyone,
> > >
> > > I have recently started to play around with hadoop, but I am running
> > > into some "design" problems.
> > > I need to make a loop to execute the same job several times, and in
> > > each iteration get the processed values (not using a file because I
> > > would need to read it). I was using a static vector in my main class
> > > (the one that iterates and executes the job in each iteration) to
> > > retrieve those values, and it did work while I was using standalone
> > > mode. Now I tried to test it in pseudo-distributed mode and obviously
> > > it is not working.
> > > Any suggestions, please???
> > >
> > > Thanks in advance,
> > >
> > > Renato M.
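
For anyone reading this thread later, the directory chaining Aaron describes (/input, then /iteration0, then /iteration1, and so on) could look roughly like the sketch below. It is plain Java with no Hadoop dependency: `IterativeDriver`, `inputFor`, `outputFor`, `runPass` and the `part-00000` file name are hypothetical names chosen for illustration, and `runPass` is only a stand-in for submitting a real MapReduce job (here it just adds 1 to every integer line). The point is the loop in `run`, where each pass reads the directory the previous pass wrote.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IterativeDriver {

    /** Pass 0 reads the initial data; every later pass reads the previous pass's output. */
    static Path inputFor(Path base, int pass) {
        return pass == 0 ? base.resolve("input") : base.resolve("iteration" + (pass - 1));
    }

    /** Each pass writes to its own directory, which doubles as a checkpoint. */
    static Path outputFor(Path base, int pass) {
        return base.resolve("iteration" + pass);
    }

    /**
     * Hypothetical stand-in for one MapReduce job (assumption: a real driver
     * would configure and submit a Hadoop job here instead).
     * Reads every file in `in`, adds 1 to each integer line, writes one output file.
     */
    static void runPass(Path in, Path out) throws IOException {
        List<String> results = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(in)) {
            for (Path file : files) {
                for (String line : Files.readAllLines(file)) {
                    results.add(String.valueOf(Integer.parseInt(line.trim()) + 1));
                }
            }
        }
        Files.createDirectories(out);
        Files.write(out.resolve("part-00000"), results);
    }

    /** Runs `passes` iterations and returns the directory holding the final results. */
    static Path run(Path base, int passes) throws IOException {
        Path out = base.resolve("input");
        for (int pass = 0; pass < passes; pass++) {
            out = outputFor(base, pass);
            runPass(inputFor(base, pass), out);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // A temporary local directory plays the role of the filesystem root.
        Path base = Files.createTempDirectory("workflow");
        Files.createDirectories(base.resolve("input"));
        Files.write(base.resolve("input").resolve("part-00000"), Arrays.asList("1", "2", "3"));

        Path last = run(base, 3);
        System.out.println(Files.readAllLines(last.resolve("part-00000"))); // prints [4, 5, 6]
    }
}
```

The driver only keeps track of paths, never of the values themselves, which is why this approach survives the move from standalone to pseudo-distributed mode where a static vector in the main class does not.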
