On Wed, Jun 9, 2010 at 3:15 PM, Alex Kozlov <[email protected]> wrote:
> So I assume it is entirely possible to write a partitioner that distributes
> the same key to multiple reducers and it does not have to be
> non-deterministic. It can assign the partition based on the value.
>
> Is this correct?
Yes. I've never liked the fact that Partitioners get the value, for exactly that reason. It was originally put in for some obscure corner case in Nutch. Fixing it now would be difficult.

Also note that "non-deterministic" doesn't imply using Random. You could just fail to override the hashCode() method and take the default from Object. That would cause you to hash based on the object's address, which is different for each JVM.

-- Owen
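
For illustration, here is a minimal sketch of both cases in plain Java, with no Hadoop dependency. The class and method names (PartitionSketch, BadKey, GoodKey, hashPartition, valuePartition) are hypothetical. The arithmetic in hashPartition mirrors what Hadoop's HashPartitioner does, and the three-argument shape of valuePartition is modeled on Partitioner.getPartition(key, value, numPartitions).

```java
final class PartitionSketch {

    // Hypothetical key that does NOT override hashCode(): it inherits the
    // identity-based default from Object, which is not stable across JVMs.
    static final class BadKey {
        final String name;
        BadKey(String name) { this.name = name; }
    }

    // Hypothetical key with content-based equals() and hashCode().
    static final class GoodKey {
        final String name;
        GoodKey(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof GoodKey && ((GoodKey) o).name.equals(name);
        }

        @Override
        public int hashCode() { return name.hashCode(); }
    }

    // Key-only partitioning: clear the sign bit, then take the remainder.
    static int hashPartition(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Fully deterministic, but it looks at the value, so records that share
    // a key can be routed to different reducers.
    static int valuePartition(Object key, Object value, int numPartitions) {
        return (value.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int n = 4;

        // Same key, different values, different partitions.
        System.out.println(valuePartition("k", "v1", n)); // 3
        System.out.println(valuePartition("k", "v2", n)); // 0

        // Equal GoodKeys always agree on a partition.
        System.out.println(
            hashPartition(new GoodKey("k"), n) == hashPartition(new GoodKey("k"), n)); // true

        // BadKey: the result depends on the identity hash and may change
        // from one run (or one JVM) to the next.
        System.out.println(hashPartition(new BadKey("k"), n));
    }
}
```

The last line is the trap: two map tasks in different JVMs can each build a BadKey for the same logical key and send it to different reducers, even though nothing in the code calls Random.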
