Thanks, Jason, for pointing out ChainMapper. Although it's not directly useful for the problem in this email, it's an awesome way to pipeline several mappers, and quite handy if you have multiple pre-processing steps. For archival purposes, here's a link with a good example:
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/lib/ChainMapper.html

Coming back to the subject of this email: yes, I did something similar to what Chris and Xiance noted below. Turns out Mahout does the same thing too.

Cheers,
Delip

On Tue, Dec 23, 2008 at 6:24 PM, Jason Venner <[email protected]> wrote:
> In 19 there is a chaining facility. I haven't looked at it yet, but it may
> provide an alternative to the rather standard pattern of looping.
>
> You may also want to check what Mahout is doing, as this is a common problem in
> that space.
>
> Delip Rao wrote:
>>
>> Thanks Chris! I ended up doing something similar too.
>>
>> On Mon, Dec 8, 2008 at 2:29 AM, Chris Dyer <[email protected]> wrote:
>>>
>>> Hey Delip-
>>> MapReduce doesn't really have any particular support for iterative
>>> algorithms. You just have to put a loop in the control program and
>>> set the output path from the previous iteration to be the input path
>>> in the next iteration. This at least lets you control whether you
>>> decide to keep around the results of intermediate iterations or erase
>>> them...
>>> -Chris
>>>
>>> On Mon, Dec 8, 2008 at 1:25 AM, Delip Rao <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I need to run my map-reduce routines for several iterations so that
>>>> the output of an iteration becomes the input to the next iteration. Is
>>>> there a standard pattern to do this instead of calling
>>>> JobClient.runJob() in a loop?
>>>>
>>>> Thanks,
>>>> Delip
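P.S. For anyone finding this in the archives: the idea behind ChainMapper is that records flow mapper -> mapper inside a single map task, so you can stack pre-processing steps without extra jobs. A toy sketch of just that dataflow, in plain Java with hypothetical stand-in mappers (not the real ChainMapper API, which you'd configure via ChainMapper.addMapper on a JobConf):

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ChainedMappers {
    // Hypothetical stand-ins for two pre-processing mappers; in Hadoop
    // each would be a Mapper class registered with ChainMapper.addMapper.
    static final Function<String, String> LOWERCASE = String::toLowerCase;
    static final Function<String, String> STRIP_PUNCT =
            s -> s.replaceAll("[^a-z0-9 ]", "");

    // Each record passes through every mapper in order, as it would
    // inside one chained map task.
    static List<String> runChain(List<String> records) {
        Function<String, String> chain = LOWERCASE.andThen(STRIP_PUNCT);
        return records.stream().map(chain).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(runChain(List.of("Hello, World!")));
    }
}
```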
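And the loop pattern Chris describes can be sketched like this. This is plain Java with no Hadoop dependencies so it runs anywhere; runIteration is a hypothetical stand-in for building a JobConf and calling JobClient.runJob, and the path names are made up. The point is just the wiring: the output directory of pass i becomes the input of pass i+1, and keeping each iteration's directory around lets you delete or retain intermediates as you choose.

```java
import java.util.ArrayList;
import java.util.List;

public class IterativeDriver {
    // Stand-in for one MapReduce pass; a real driver would configure a
    // JobConf with these paths and call JobClient.runJob(conf) here.
    static void runIteration(String inputPath, String outputPath) {
        System.out.println("map-reduce over " + inputPath + " -> " + outputPath);
    }

    // Chain N passes: each iteration reads the previous one's output.
    static List<String> iterate(String initialInput, String workDir, int iterations) {
        List<String> outputs = new ArrayList<>();
        String input = initialInput;
        for (int i = 0; i < iterations; i++) {
            String output = workDir + "/iter-" + i;
            runIteration(input, output);
            outputs.add(output);
            input = output; // previous output feeds the next pass
        }
        return outputs; // intermediate dirs, so caller can clean up or keep them
    }

    public static void main(String[] args) {
        System.out.println(iterate("/data/input", "/tmp/work", 3));
    }
}
```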
