For questions 1 and 2 - you certainly can implement either of those ideas.
 "What happens" when you output more than one value... well, you output more
than one value!

The correctness of your code depends on the logic in the reducer stage.
 However, keep in mind that there are no guarantees on if and when the
combiner is run: it could be executed zero, one or multiple times.  For that
reason, combiners tend to be simple.
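
To make the "zero, one or multiple times" point concrete, here is a rough sketch in plain Java (no Hadoop classes). The runJob helper and its chunking into pairs are my own stand-in for the framework, not how Hadoop actually schedules combiners. It only shows that a sum gives the same answer however often the combiner runs, while a count does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class CombinerRuns {
    // Sums the values for one key. Associative and commutative, so it is
    // safe to reuse as a combiner.
    static List<Integer> sum(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return List.of(total);
    }

    // Counts the values for one key. NOT safe as a combiner: a count of
    // partial counts is not the total count.
    static List<Integer> count(List<Integer> values) {
        return List.of(values.size());
    }

    // Hypothetical stand-in for the framework (an assumption, not Hadoop's
    // real behavior): applies the combiner `passes` times to chunks of at
    // most two values, then hands whatever is left to the reducer.
    static int runJob(List<Integer> mapOutput,
                      Function<List<Integer>, List<Integer>> combiner,
                      Function<List<Integer>, List<Integer>> reducer,
                      int passes) {
        List<Integer> current = new ArrayList<>(mapOutput);
        for (int p = 0; p < passes; p++) {
            List<Integer> next = new ArrayList<>();
            for (int i = 0; i < current.size(); i += 2) {
                int end = Math.min(i + 2, current.size());
                next.addAll(combiner.apply(current.subList(i, end)));
            }
            current = next;
        }
        return reducer.apply(current).get(0);
    }

    public static void main(String[] args) {
        List<Integer> ones = List.of(1, 1, 1, 1, 1);
        for (int passes = 0; passes <= 2; passes++) {
            int s = runJob(ones, CombinerRuns::sum, CombinerRuns::sum, passes);
            int c = runJob(ones, CombinerRuns::count, CombinerRuns::count, passes);
            System.out.println(passes + " combiner passes: sum=" + s + " count=" + c);
        }
    }
}
```

The sum stays at 5 for every number of passes; the count only gives 5 when the combiner never runs.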

For question 3, that situation would likely cause an error.  Since the
combiner may or may not be executed, the reducer could get values of
different types.

If you're looking to do something more complex, I suggest you look at
"in-mapper combining".  Jimmy Lin's book (
http://www.umiacs.umd.edu/~jimmylin/book.html) is an excellent reference.
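
As a rough illustration of the idea (my own sketch in plain Java, not code from the book), the mapper keeps partial counts in memory and only emits them once at the end of the task.  The class name is hypothetical; map() and cleanup() here are simplified stand-ins for the corresponding Mapper methods in hadoop.mapreduce, returning a Map instead of writing to a Context.

```java
import java.util.HashMap;
import java.util.Map;

class InMapperWordCount {
    // Partial counts held in memory across all map() calls of this task.
    private final Map<String, Integer> partialCounts = new HashMap<>();

    // Called once per input record. Aggregates locally instead of
    // emitting a (word, 1) pair for every token.
    void map(String line) {
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                partialCounts.merge(word, 1, Integer::sum);
            }
        }
    }

    // Called once at the end of the task. Returns one (word, partialCount)
    // pair per distinct word, standing in for the emit step.
    Map<String, Integer> cleanup() {
        return new HashMap<>(partialCounts);
    }

    public static void main(String[] args) {
        InMapperWordCount mapper = new InMapperWordCount();
        mapper.map("a b a");
        mapper.map("b a c");
        System.out.println(mapper.cleanup());
    }
}
```

Because the aggregation happens inside your own code, it always runs exactly once per record, unlike a combiner.  The trade-off is that the map of partial counts has to fit in the task's memory.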

cheers,
-jw

On Mon, May 23, 2011 at 12:14 PM, Mike Spreitzer <[email protected]> wrote:

> Question 1 remains: What happens if one invocation of a combiner outputs
> more than one value?
>
> My main interest in question 2 was about instances not classes, so let me
> rephrase question 2 this way: What happens if an output key object is not
> equal to the input key object (even though both are of the same class)?
>
> Even for question 3, I did not exactly see an answer to "what happens" ---
> only a statement that I should not exercise that case.
>
> Thanks,
> Mike
>
>
>
> From:        Ted Yu <[email protected]>
> To:        [email protected]
> Date:        05/23/2011 03:04 PM
> Subject:        Re: Stupid questions about combiners in
> ...hadoop.mapreduce
> ------------------------------
>
>
>
> Questions 2 and 3 can be answered relatively easily:
> Remember, the output of the combiner is going to be consumed by the
> reducer.
> So the output key/value classes of the combiner have to align with the
> input key/value classes of the reducer.
>
> On Mon, May 23, 2011 at 11:32 AM, Mike Spreitzer <[email protected]> wrote:
> In general, the Java interfaces say that one invocation of a combiner
> (technically, a Class<? extends Reducer>) can output multiple
> (key,value) pairs.  So:
>
> What happens if one invocation of a combiner outputs more than one value?
>
> What happens if an output key is different from the input key?
>
> What happens if an output value is of a different class than the class of
> the input values?
>
> Thanks,
> Mike Spreitzer
>
>
