Deepika: That sounds very strange. Can you let us know what version of Hadoop (e.g. Apache 0.20.x, CDH2, etc.) you're running and a bit more about your hashCode() implementation? When this happens, do you see the same values for the duplicate key? Did you also implement a grouping comparator?
The hash partitioner is extremely simple: it assigns a key to partition (key.hashCode() & Integer.MAX_VALUE) % numberOfReduces — the sign bit is masked off so the result is never negative. A given key can only land on a second reducer if its hashCode() is non-deterministic. If one incorrectly implements a grouping comparator, though, it's possible you could see odd behavior.

On Mon, May 24, 2010 at 5:35 PM, Deepika Khera <[email protected]> wrote:
> Hi,
>
> I am using a HashPartitioner on my key for a map reduce job. I am wondering
> how sometimes 2 reducers end up getting the same key? I have the hashCode
> method defined for my key.
>
> Also, I have speculative execution turned off for my jobs.
>
> Would appreciate any help.
>
> Thanks,
> Deepika

--
Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com
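A minimal sketch of that partitioning logic (the class and key names here are illustrative, but the formula mirrors Hadoop's HashPartitioner):

```java
// Sketch of the default hash partitioning rule: the same key always maps
// to the same partition, provided hashCode() is deterministic across JVMs.
public class HashPartitionerSketch {

    // Mask off the sign bit so the result of the modulo is never negative,
    // then take the remainder to pick one of the reduce partitions.
    static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // "user-42" and 10 reducers are hypothetical example values.
        int p1 = HashPartitionerSketch.getPartition("user-42", 10);
        int p2 = HashPartitionerSketch.getPartition("user-42", 10);
        // Identical key, identical partition — every time.
        System.out.println(p1 == p2);
    }
}
```

This is why a key showing up on two reducers points at the key class itself: a hashCode() that depends on object identity or mutable state (rather than the key's value) will return different values for equal keys, scattering them across partitions.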
