Re: computing conditional probabilities with Hadoop?

Ted Dunning Mon, 01 Oct 2007 19:33:09 -0700


Actually, it would be almost as useful to be able to have a "multi-reduce".

In such a system, you would specify multiple input/map pairs.  The reduce
function signature would then be something like:

    reduce(WritableComparable key, OutputCollector output, Reporter reporter,
           Iterator... values)

Each set of maps would then have its output delivered through its own iterator.

I didn't mention this alternative earlier because I figured it would be a
much bigger leap than just ordering the reduce values.  It would, however,
be very useful when it comes to co-grouping operations.
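
To make the idea concrete, here is a rough sketch in plain Java with no Hadoop
dependency. It only simulates what such a multi-reduce might look like in
memory; nothing here is an existing Hadoop API. The names MultiReduceSketch,
MultiReducer, run and demo are made up for illustration, and the demo assumes
one input carries marginal counts while the other carries joint counts encoded
as "event=count" strings.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class MultiReduceSketch {
    /** Hypothetical multi-reduce callback: one iterator per input, in input order. */
    interface MultiReducer<K, V, R> {
        void reduce(K key, List<R> output, List<Iterator<V>> valuesPerInput);
    }

    /** Groups every input by key, then calls the reducer once per key seen in any input. */
    static <K extends Comparable<K>, V, R> List<R> run(
            List<List<Map.Entry<K, V>>> inputs, MultiReducer<K, V, R> reducer) {
        // key -> one value list per input; TreeMap hands keys to the reducer in sorted order
        Map<K, List<List<V>>> grouped = new TreeMap<>();
        for (int i = 0; i < inputs.size(); i++) {
            for (Map.Entry<K, V> e : inputs.get(i)) {
                List<List<V>> slots = grouped.computeIfAbsent(e.getKey(), k -> {
                    List<List<V>> s = new ArrayList<>();
                    for (int j = 0; j < inputs.size(); j++) {
                        s.add(new ArrayList<>());
                    }
                    return s;
                });
                slots.get(i).add(e.getValue());
            }
        }
        List<R> output = new ArrayList<>();
        for (Map.Entry<K, List<List<V>>> g : grouped.entrySet()) {
            List<Iterator<V>> iterators = new ArrayList<>();
            for (List<V> values : g.getValue()) {
                iterators.add(values.iterator());
            }
            reducer.reduce(g.getKey(), output, iterators);
        }
        return output;
    }

    /** Co-groups marginal counts (input 0) with joint counts (input 1) to get P(b|a). */
    static List<String> demo() {
        List<Map.Entry<String, String>> marginals = List.of(
                Map.entry("rain", "4"), Map.entry("sun", "2"));
        List<Map.Entry<String, String>> joints = List.of(
                Map.entry("rain", "umbrella=3"),
                Map.entry("sun", "umbrella=1"),
                Map.entry("rain", "hat=1"));
        MultiReducer<String, String, String> reducer = (key, out, its) -> {
            // Assumes every key has exactly one marginal count in input 0.
            double total = Double.parseDouble(its.get(0).next());
            Iterator<String> joint = its.get(1);
            while (joint.hasNext()) {
                String[] kv = joint.next().split("=");
                out.add("P(" + kv[0] + "|" + key + ")=" + Double.parseDouble(kv[1]) / total);
            }
        };
        return run(List.of(marginals, joints), reducer);
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Because each input arrives on its own iterator, the reducer does not need to
tag values with their source or rely on any ordering among them to tell the
marginal count apart from the joint counts.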

On 10/1/07 6:17 PM, "Ted Dunning" <[EMAIL PROTECTED]> wrote:

> 
> This is a common requirement.
> 
> Leaving the order unchanged would be fine, but it is probably very hard to
> enforce because there are many map tasks and it is uncertain which maps
> finish first.
> Similarly useful would be the ability to require a particular sort ordering
> on reduce values.
> 
> 
> On 10/1/07 6:05 PM, "Chris Dyer" <[EMAIL PROTECTED]> wrote:
> 
>> Does anyone know if Hadoop guarantees (can be made to guarantee) that the
>> relative order of keys that are equal will be left unchanged?
>
