Thanks, Jason, for pointing out ChainMapper. Although it's not directly useful for the problem in this email, it's an awesome way to pipeline several mappers, and quite handy if you have multiple pre-processing steps. For archival purposes, here's a link with a good example:
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/lib/ChainMapper.html

Coming back to the subject of this email: yes, I did something similar to what Chris and Xiance noted below. Turns out Mahout does the same thing too.

Cheers,
Delip

On Tue, Dec 23, 2008 at 6:24 PM, Jason Venner <[email protected]> wrote:
> In 19 there is a chaining facility. I haven't looked at it yet, but it may
> provide an alternative to the rather standard pattern of looping.
>
> You may also want to check what Mahout is doing, as this is a common problem in
> that space.
>
> Delip Rao wrote:
>>
>> Thanks Chris! I ended up doing something similar too.
>>
>> On Mon, Dec 8, 2008 at 2:29 AM, Chris Dyer <[email protected]> wrote:
>>>
>>> Hey Delip-
>>> MapReduce doesn't really have any particular support for iterative
>>> algorithms. You just have to put a loop in the control program and
>>> set the output path from the previous iteration to be the input path
>>> in the next iteration. This at least lets you control whether you
>>> decide to keep around the results of intermediate iterations or erase
>>> them...
>>> -Chris
>>>
>>> On Mon, Dec 8, 2008 at 1:25 AM, Delip Rao <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I need to run my map-reduce routines for several iterations so that
>>>> the output of an iteration becomes the input to the next iteration. Is
>>>> there a standard pattern to do this instead of calling
>>>> JobClient.runJob() in a loop?
>>>>
>>>> Thanks,
>>>> Delip
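P.S. For anyone finding this in the archives: the idea behind ChainMapper is that records flow mapper -> mapper inside a single map task, so you can stack pre-processing steps without extra jobs. A toy sketch of just that dataflow, in plain Java with hypothetical stand-in mappers (not the real ChainMapper API, which you'd configure via ChainMapper.addMapper on a JobConf):

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ChainedMappers {
    // Hypothetical stand-ins for two pre-processing mappers; in Hadoop
    // each would be a Mapper class registered with ChainMapper.addMapper.
    static final Function<String, String> LOWERCASE = String::toLowerCase;
    static final Function<String, String> STRIP_PUNCT =
            s -> s.replaceAll("[^a-z0-9 ]", "");

    // Each record passes through every mapper in order, as it would
    // inside one chained map task.
    static List<String> runChain(List<String> records) {
        Function<String, String> chain = LOWERCASE.andThen(STRIP_PUNCT);
        return records.stream().map(chain).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(runChain(List.of("Hello, World!")));
    }
}
```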
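And the loop pattern Chris describes can be sketched like this. This is plain Java with no Hadoop dependencies so it runs anywhere; runIteration is a hypothetical stand-in for building a JobConf and calling JobClient.runJob, and the path names are made up. The point is just the wiring: the output directory of pass i becomes the input of pass i+1, and keeping each iteration's directory around lets you delete or retain intermediates as you choose.

```java
import java.util.ArrayList;
import java.util.List;

public class IterativeDriver {
    // Stand-in for one MapReduce pass; a real driver would configure a
    // JobConf with these paths and call JobClient.runJob(conf) here.
    static void runIteration(String inputPath, String outputPath) {
        System.out.println("map-reduce over " + inputPath + " -> " + outputPath);
    }

    // Chain N passes: each iteration reads the previous one's output.
    static List<String> iterate(String initialInput, String workDir, int iterations) {
        List<String> outputs = new ArrayList<>();
        String input = initialInput;
        for (int i = 0; i < iterations; i++) {
            String output = workDir + "/iter-" + i;
            runIteration(input, output);
            outputs.add(output);
            input = output; // previous output feeds the next pass
        }
        return outputs; // intermediate dirs, so caller can clean up or keep them
    }

    public static void main(String[] args) {
        System.out.println(iterate("/data/input", "/tmp/work", 3));
    }
}
```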
