Re: Reduce method called same key twice

Trevor Adams Wed, 29 Jun 2011 11:34:58 -0700

So, that kind of makes sense, but why would it not group the other values
then? Each set contains a bunch of keys that are exactly the same (there is
only 1 primary record, so only 1 key per set is different), and it is my
understanding that those would be grouped together (without the primary key)
if I didn't do anything different.


-Trevor

On Wed, Jun 29, 2011 at 2:07 PM, Aaron Baff <aaron.b...@telescope.tv> wrote:

> You probably need to implement a custom comparator that you use as the
> grouping comparator that compares the primary key, and then if they are the
> same compares the int part of the key.
>
> --Aaron
>
>
> -----------------------------------------------------------------------------
> From: Trevor Adams [mailto:trevorad...@gmail.com]
> Sent: Wednesday, June 29, 2011 10:00 AM
> To: mapreduce-user@hadoop.apache.org
> Subject: Reduce method called same key twice
>
> So I have a custom key which is used for a join. It contains two fields: a
> boolean (is primary key) and an int (key). hashCode only looks at the key
> field, so that records with the same key get sent to the same reducer.
> Compare places the pkey at the top of the list (if sorted using compare).
> This works nicely, except that the reduce method is called with Key: 1 -> a
> single value, Key: 1 -> another value, etc., once for each value. So instead
> of bucketing the values to a key (and some of the keys are identical in
> every way), it sends 1 key and 1 value to the reducer at a time. How do I
> get it to bucket, or why isn't it bucketing?
>
> -Trevor
>
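
For anyone following the thread, here is a minimal sketch of the grouping
behaviour being discussed. It is plain Java with no Hadoop dependency, and it
is not Trevor's code: the names JoinKey, SORT, GROUP_BY_KEY, NEVER_EQUAL,
sampleKeys and countReduceCalls are all hypothetical. It only imitates how
the framework starts a new reduce() call each time the grouping comparator
reports that two neighbouring sorted keys differ. The NEVER_EQUAL comparator
is a guess at one way to get one call per value (a compare that never
returns 0, even for identical keys); the original compare method is not
shown in the thread, so treat that as an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class GroupingSketch {

    // Hypothetical stand-in for the composite join key described in the thread.
    static final class JoinKey {
        final boolean primary; // true for the single primary record of a set
        final int key;         // the join key itself

        JoinKey(boolean primary, int key) {
            this.primary = primary;
            this.key = key;
        }
    }

    // Sort order: by join key, then the primary record first within a key.
    static final Comparator<JoinKey> SORT = (a, b) -> {
        int byKey = Integer.compare(a.key, b.key);
        if (byKey != 0) return byKey;
        return Boolean.compare(b.primary, a.primary); // true sorts before false
    };

    // Grouping order: join key only, so the primary flag cannot split a group.
    static final Comparator<JoinKey> GROUP_BY_KEY =
            (a, b) -> Integer.compare(a.key, b.key);

    // Assumed bug: primary goes first, but equal keys never compare as 0.
    static final Comparator<JoinKey> NEVER_EQUAL = (a, b) -> {
        int byKey = Integer.compare(a.key, b.key);
        if (byKey != 0) return byKey;
        return a.primary ? -1 : 1;
    };

    // Two join sets: key 1 (1 primary + 3 others) and key 2 (1 primary + 1 other).
    static List<JoinKey> sampleKeys() {
        return Arrays.asList(
                new JoinKey(false, 1), new JoinKey(true, 1),
                new JoinKey(false, 2), new JoinKey(false, 1),
                new JoinKey(true, 2), new JoinKey(false, 1));
    }

    // Sorts the keys, then counts a new reduce() call whenever the grouping
    // comparator says the current key differs from the previous one.
    static int countReduceCalls(List<JoinKey> keys,
                                Comparator<JoinKey> sort,
                                Comparator<JoinKey> grouping) {
        List<JoinKey> sorted = new ArrayList<>(keys);
        sorted.sort(sort);
        int calls = 0;
        JoinKey previous = null;
        for (JoinKey current : sorted) {
            if (previous == null || grouping.compare(previous, current) != 0) {
                calls++;
            }
            previous = current;
        }
        return calls;
    }

    public static void main(String[] args) {
        List<JoinKey> keys = sampleKeys();
        // Grouping with the sort comparator: primary and non-primary split apart.
        System.out.println(countReduceCalls(keys, SORT, SORT));         // prints 4
        // Grouping on the int key only: one call per join set.
        System.out.println(countReduceCalls(keys, SORT, GROUP_BY_KEY)); // prints 2
        // Grouping with a comparator that never returns 0: one call per value.
        System.out.println(countReduceCalls(keys, SORT, NEVER_EQUAL));  // prints 6
    }
}
```

In a real job the key-only comparison would live in a RawComparator
registered with Job.setGroupingComparatorClass (or, in the older mapred API,
JobConf.setOutputValueGroupingComparator), while the sort comparator keeps
the primary record at the top of each group.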
