Combiners should implement a specialized "Combiner" interface, not the generic
"Reducer" interface
--------------------------------------------------------------------------------------------------
Key: MAPREDUCE-1574
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1574
Project: Hadoop Map/Reduce
Issue Type: Improvement
Affects Versions: 0.20.1
Reporter: Danny Leshem
Priority: Minor
I just spent 30 minutes trying to figure out why my job throws
"java.io.IOException: wrong key class" when I pass my Reducer class to
Job.setCombinerClass. Finally, I understood that a Reducer can act as a Combiner
only if its output key/value types are the same as its input key/value types.
So yes, this is documented. But you can make life easier for users by defining
a Combiner interface (that Job.setCombinerClass will accept) to enforce this at
compile time. The new interface should extend the Reducer interface and
specialize it (is it even possible with generics?). Alternatively, you can call
this interface "SimpleReducer".
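A rough sketch of how the generics approach could look. This is not the real
Hadoop API: the Reducer below is a simplified stand-in that emits through a
BiConsumer instead of a Context, and Combiner, JobSketch, SumCombiner and
FormattingReducer are hypothetical names used only for illustration. It assumes
Reducer is an interface taking four type parameters.

```java
import java.util.Arrays;
import java.util.function.BiConsumer;

// Simplified stand-in for Hadoop's Reducer (NOT the real type): output is
// emitted through a BiConsumer instead of a Context.
interface Reducer<KI, VI, KO, VO> {
    void reduce(KI key, Iterable<VI> values, BiConsumer<KO, VO> out);
}

// Hypothetical Combiner: a Reducer whose output types equal its input types.
interface Combiner<K, V> extends Reducer<K, V, K, V> {
}

// Hypothetical stand-in for Job; only the combiner setter is modelled.
final class JobSketch {
    private Class<? extends Combiner<?, ?>> combinerClass;

    // Accepts Combiner implementations only, so a mismatched Reducer is
    // rejected by the compiler rather than failing at run time.
    void setCombinerClass(Class<? extends Combiner<?, ?>> cls) {
        this.combinerClass = cls;
    }

    Class<? extends Combiner<?, ?>> getCombinerClass() {
        return combinerClass;
    }
}

// Sums the values for a key; input and output are both (String, Integer).
final class SumCombiner implements Combiner<String, Integer> {
    @Override
    public void reduce(String key, Iterable<Integer> values,
                       BiConsumer<String, Integer> out) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        out.accept(key, sum);
    }
}

// Emits (String, String), so it is a Reducer but not a Combiner.
final class FormattingReducer
        implements Reducer<String, Integer, String, String> {
    @Override
    public void reduce(String key, Iterable<Integer> values,
                       BiConsumer<String, String> out) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        out.accept(key, "count=" + sum);
    }
}

final class CombinerSketchDemo {
    public static void main(String[] args) {
        JobSketch job = new JobSketch();
        job.setCombinerClass(SumCombiner.class);
        // The next line does not compile: FormattingReducer is not a Combiner.
        // job.setCombinerClass(FormattingReducer.class);
        new SumCombiner().reduce("a", Arrays.asList(1, 2, 3),
                (k, v) -> System.out.println(k + "\t" + v));
    }
}
```

In this sketch the type mismatch that currently surfaces as "wrong key class"
at run time would instead show up as a compile error at the setCombinerClass
call.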
If the generics trick suggested above is impossible to implement, then for the
(common?) case of the same class acting as both Combiner and Reducer you can
do either of the following:
1) Thin Combiner implementation that wraps a given Reducer.
2) Add a new method, say Job.setCombinerClassToReducer (that accepts a
Reducer), acting similarly to the new Job.setCombinerClass, but with a name
that alerts the user that she is doing something special.
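The two fallback options might be sketched as follows. Again this is only an
illustration under assumptions: Reducer is a simplified stand-in for the Hadoop
type, and ReducerBackedCombiner, JobSketch, MaxReducer and WrapperSketchDemo
are hypothetical names. The wrapper here holds a Reducer instance for brevity,
whereas Hadoop instantiates the configured classes itself.

```java
import java.util.Arrays;
import java.util.function.BiConsumer;

// Simplified stand-in for Hadoop's Reducer (NOT the real type).
interface Reducer<KI, VI, KO, VO> {
    void reduce(KI key, Iterable<VI> values, BiConsumer<KO, VO> out);
}

// Option 1 (hypothetical): a thin combiner that delegates to a Reducer whose
// output types match its input types.
final class ReducerBackedCombiner<K, V> {
    private final Reducer<K, V, K, V> delegate;

    ReducerBackedCombiner(Reducer<K, V, K, V> delegate) {
        this.delegate = delegate;
    }

    void combine(K key, Iterable<V> values, BiConsumer<K, V> out) {
        delegate.reduce(key, values, out);
    }
}

// Option 2 (hypothetical): a separately named setter that only accepts
// Reducer classes whose input and output types coincide.
final class JobSketch {
    private Class<?> combinerClass;

    <K, V> void setCombinerClassToReducer(
            Class<? extends Reducer<K, V, K, V>> cls) {
        this.combinerClass = cls;
    }

    Class<?> getCombinerClass() {
        return combinerClass;
    }
}

// Keeps the largest value per key; usable as both Reducer and Combiner.
final class MaxReducer implements Reducer<String, Integer, String, Integer> {
    @Override
    public void reduce(String key, Iterable<Integer> values,
                       BiConsumer<String, Integer> out) {
        int max = Integer.MIN_VALUE;
        for (int v : values) {
            max = Math.max(max, v);
        }
        out.accept(key, max);
    }
}

final class WrapperSketchDemo {
    public static void main(String[] args) {
        ReducerBackedCombiner<String, Integer> combiner =
                new ReducerBackedCombiner<>(new MaxReducer());
        combiner.combine("k", Arrays.asList(4, 9, 2),
                (k, v) -> System.out.println(k + "\t" + v));

        JobSketch job = new JobSketch();
        job.setCombinerClassToReducer(MaxReducer.class);
    }
}
```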