I have noticed this too with one job. Keys that are equal (equals() returns true, hashCode() values match, and compareTo() returns 0) are being sent to multiple reduce tasks, resulting in incorrect output.
Any insight?

On Sat, Aug 13, 2011 at 11:14 AM, Stan Rosenberg <[email protected]> wrote:

> Hi All,
>
> Here is what's happening. I have implemented my own WritableComparable keys
> and values.
> Inside a reducer I am seeing 'reduce' being invoked with the "same" key
> _twice_.
> I have checked that context.getKeyComparator() and
> context.getSortComparator() are both WritableComparator which
> indicates that 'compareTo' method of my key should be called when doing
> reduce-side merge.
>
> Indeed, inside the 'reduce' method I captured both key instances and did the
> following checks:
>
> ((WritableComparator)context.getKeyComparator()).compare((Object)key1,
> (Object)key2)
> ((WritableComparator)context.getSortComparator()).compare((Object)key2,
> (Object)key2)
>
> In both calls, the result is '0', confirming that key1 and key2 are
> equivalent.
>
> So, what is going on?
>
> Note that key1 and key2 come from different mappers but they should have
> been collapsed in the reducer since
> they are both equal according to WritableComparator. Also note that key1
> and key2 are not bitwise equivalent, but
> that shouldn't matter, or should it?
>
> Many thanks in advance!
>
> stan
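
One thing that may be worth ruling out, though it may not be what is happening in either job: the comparator is only used for sorting and grouping, while the choice of reduce task is made by the partitioner. Hadoop's default HashPartitioner computes (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks, so two keys with compareTo() == 0 can still land on different reduce tasks if hashCode() depends on any state that compareTo() ignores. The sketch below is plain Java with no Hadoop dependency. The Key class, its id and mapperTag fields, and the partitionFor helper are hypothetical names made up for illustration; only the partition formula is taken from HashPartitioner.

```java
class PartitionSketch {

    // Hypothetical key: 'id' is the logical identity; 'mapperTag' stands in for
    // any extra state that differs between otherwise "equal" key instances.
    static final class Key implements Comparable<Key> {
        final int id;
        final int mapperTag;

        Key(int id, int mapperTag) {
            this.id = id;
            this.mapperTag = mapperTag;
        }

        // Ordering looks only at 'id', so keys with the same id compare as 0.
        @Override
        public int compareTo(Key other) {
            return Integer.compare(id, other.id);
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }

        // Deliberately inconsistent with compareTo/equals: the hash also
        // depends on 'mapperTag'. This is the bug being illustrated.
        @Override
        public int hashCode() {
            return id + mapperTag;
        }
    }

    // Same formula as Hadoop's default HashPartitioner.getPartition.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        Key key1 = new Key(42, 0); // as if emitted by one mapper
        Key key2 = new Key(42, 1); // as if emitted by another mapper

        System.out.println("compareTo: " + key1.compareTo(key2));
        System.out.println("partition of key1: " + partitionFor(key1, 4));
        System.out.println("partition of key2: " + partitionFor(key2, 4));
    }
}
```

If a check like this shows different partitions for keys you consider equal, the usual remedy is to derive hashCode() from exactly the fields that compareTo() uses, or to supply a custom Partitioner that does so.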
