Note that the sort will be very slow unless you also define a RawComparator
that compares the serialized forms of the keys directly, so the keys don't
have to be deserialized for every comparison. Look at IntWritable for how
to do it.
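
To give a rough idea of what comparing serialized forms means, here is a minimal standalone sketch in the spirit of the IntWritable comparator. It deliberately uses no Hadoop classes so it runs on its own. The class name RawIntCompareSketch and the serialize helper are made up for illustration, and the compare method only mirrors the (bytes, start, length) shape of a raw comparator.

```java
import java.nio.ByteBuffer;

// Illustrative only: a hypothetical stand-in, not a Hadoop class.
class RawIntCompareSketch {
    // Decode a big-endian 4-byte int starting at offset s.
    public static int readInt(byte[] b, int s) {
        return ((b[s] & 0xff) << 24)
             | ((b[s + 1] & 0xff) << 16)
             | ((b[s + 2] & 0xff) << 8)
             | (b[s + 3] & 0xff);
    }

    // Compare two serialized ints without building key objects.
    // l1 and l2 are unused because an int is always 4 bytes.
    public static int compare(byte[] b1, int s1, int l1,
                              byte[] b2, int s2, int l2) {
        return Integer.compare(readInt(b1, s1), readInt(b2, s2));
    }

    // Helper assumed for the demo: big-endian encoding of an int.
    public static byte[] serialize(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    public static void main(String[] args) {
        byte[] a = serialize(-5);
        byte[] b = serialize(42);
        System.out.println(compare(a, 0, 4, b, 0, 4)); // negative: -5 sorts first
    }
}
```

In a real job the same logic would go into a comparator class registered for your key type. The details of that wiring are not shown here.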

You need to define a reasonable hashCode because the default partitioner
uses it to decide which reduce each key is sent to. If you define your own
partitioner, you could, for instance, have all of the keys with the same
first string go to the same reduce.
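
As a sketch of the two options, assuming a key made of two strings: the first method below does the usual hash-partitioning arithmetic (clear the sign bit, then take the modulus by the number of reduces), and the second one looks only at the first string. StringPairSketch and both method names are hypothetical, and nothing here depends on Hadoop, so in a real job the logic would live in a Partitioner subclass instead.

```java
import java.util.Objects;

// Hypothetical two-string key; names are illustrative only.
class StringPairSketch {
    final String first;
    final String second;

    StringPairSketch(String first, String second) {
        this.first = first;
        this.second = second;
    }

    // Hash over both fields so distinct keys spread across reduces.
    @Override
    public int hashCode() {
        return Objects.hash(first, second);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof StringPairSketch)) return false;
        StringPairSketch p = (StringPairSketch) o;
        return first.equals(p.first) && second.equals(p.second);
    }

    // Hash partitioning: mask off the sign bit, then take the modulus.
    public static int hashPartition(Object key, int numReduces) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduces;
    }

    // Custom rule: only the first string picks the reduce.
    public static int firstStringPartition(StringPairSketch key, int numReduces) {
        return (key.first.hashCode() & Integer.MAX_VALUE) % numReduces;
    }

    public static void main(String[] args) {
        StringPairSketch a = new StringPairSketch("user1", "click");
        StringPairSketch b = new StringPairSketch("user1", "view");
        // Same first string, so both land on the same reduce.
        System.out.println(firstStringPartition(a, 4) == firstStringPartition(b, 4));
    }
}
```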

And yes, assuming you don't have a RawComparator, the function you need to
define is compareTo, not equals.
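
For example, a compareTo for a two-string key might order by the first string and break ties on the second. This is only a sketch: OrderedPairSketch is a made-up name, and it implements plain Comparable so that it runs without Hadoop, whereas a real key would implement WritableComparable and also provide the serialization methods.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical key type; a real Hadoop key would also be Writable.
class OrderedPairSketch implements Comparable<OrderedPairSketch> {
    final String first;
    final String second;

    OrderedPairSketch(String first, String second) {
        this.first = first;
        this.second = second;
    }

    // Order by first string, then by second string on ties.
    @Override
    public int compareTo(OrderedPairSketch other) {
        int c = first.compareTo(other.first);
        return c != 0 ? c : second.compareTo(other.second);
    }

    public static void main(String[] args) {
        List<OrderedPairSketch> keys = new ArrayList<>();
        keys.add(new OrderedPairSketch("b", "x"));
        keys.add(new OrderedPairSketch("a", "z"));
        keys.add(new OrderedPairSketch("a", "y"));
        // The sort only ever calls compareTo; equals is not consulted.
        Collections.sort(keys);
        for (OrderedPairSketch k : keys) {
            System.out.println(k.first + " " + k.second);
        }
    }
}
```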
