On Wed, Jun 9, 2010 at 3:15 PM, Alex Kozlov <[email protected]> wrote:
> So I assume it is entirely possible to write a partitioner that distributes
> the same key to multiple reducers and it does not have to be
> non-deterministic.  It can assign the partition based on the value.
>
> Is this correct?

Yes. I've never liked the fact that Partitioners get the value for
exactly that reason. It was originally put in for some obscure corner
case in Nutch. Fixing it now would be difficult.
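
To make the point concrete, here is a minimal sketch of a deterministic partitioner that spreads one key across several reducers by looking only at the value. SimplePartitioner, BY_VALUE and partitionsFor are hypothetical names for illustration; the interface merely mimics the (key, value, numPartitions) shape of Hadoop's Partitioner so the example runs without Hadoop on the classpath.

```java
import java.util.Set;
import java.util.TreeSet;

public class ValuePartitionDemo {
    // Hypothetical stand-in for the shape of Hadoop's Partitioner contract.
    interface SimplePartitioner<K, V> {
        int getPartition(K key, V value, int numPartitions);
    }

    // Deterministic, but ignores the key: the partition depends on the value only.
    static final SimplePartitioner<String, Integer> BY_VALUE =
        (key, value, numPartitions) ->
            (value.hashCode() & Integer.MAX_VALUE) % numPartitions;

    // Collects every partition that records sharing one key are sent to.
    static Set<Integer> partitionsFor(String key, int[] values, int numPartitions) {
        Set<Integer> seen = new TreeSet<>();
        for (int v : values) {
            seen.add(BY_VALUE.getPartition(key, v, numPartitions));
        }
        return seen;
    }

    public static void main(String[] args) {
        // Same key, four different values, four partitions.
        System.out.println(partitionsFor("k", new int[] {0, 1, 2, 3}, 4)); // [0, 1, 2, 3]
    }
}
```

Every run gives the same assignment, yet no single reducer sees all values for "k", so any reduce logic that assumes it has the complete group for a key would be wrong with such a partitioner.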

Also note that "non-deterministic" doesn't imply using Random. You
could just fail to override the hashCode method and take the default
from Object. That would cause you to hash based on the object's
identity (typically derived from its address), which is different
for each JVM.
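
A small sketch of that pitfall, assuming a hash-style partitioner that computes (key.hashCode() & Integer.MAX_VALUE) % numPartitions. BadKey, GoodKey and partition are hypothetical names made up for this example.

```java
public class HashCodeDemo {
    // Hypothetical key that does NOT override hashCode: it inherits the
    // identity hash from Object, which is not stable across JVMs.
    static final class BadKey {
        final String id;
        BadKey(String id) { this.id = id; }
    }

    // Hypothetical key whose hash depends only on its contents.
    static final class GoodKey {
        final String id;
        GoodKey(String id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof GoodKey && id.equals(((GoodKey) o).id);
        }
        @Override public int hashCode() { return id.hashCode(); }
    }

    // Hash-style partitioning: mask off the sign bit, then take the remainder.
    static int partition(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        // Two logically identical keys, as two map tasks might create them.
        // With BadKey the two numbers usually differ and change from run to run.
        System.out.println(partition(new BadKey("a"), 8) + " vs " + partition(new BadKey("a"), 8));
        // With GoodKey both are always the same partition.
        System.out.println(partition(new GoodKey("a"), 8) + " vs " + partition(new GoodKey("a"), 8));
    }
}
```

No Random is involved anywhere, but with BadKey the same logical key can still end up on different reducers depending on which JVM emitted it.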

-- Owen
