Dear jamal sasha,
The usual example goes like this:
class Mapper
    method MAP(Line l)
        document <- split l in Terms t
        for all Terms t in document
            EMIT(Term t, one)

class Combiner
    method REDUCE(Term t, List of Counts lc)
        cnt <- sum lc
        EMIT(Term t, Count cnt)

class Reducer
    method REDUCE(Term t, List of Counts lc)
        cnt <- sum lc
        EMIT(Term t, Count cnt)

The combiner runs node-locally on the mapper output (before the shuffle).
Its output is used as input to the reducers (after the shuffle). A combiner
is an I/O optimization. The framework gives no guarantee that a combiner
will be called at all; it may run zero, one, or more times on the mapper
output.
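
To see why that lack of a guarantee is harmless for word count, here is a
small plain-Python simulation (no Hadoop involved). It is only a sketch:
the names map_line, sum_counts and run_job are made up for illustration,
and each "split" stands in for the input of one mapper. The job is run with
the combiner applied zero, one and two times per mapper.

```python
from collections import defaultdict


def map_line(line):
    """Mapper: emit (term, 1) for every term in the line."""
    return [(term, 1) for term in line.split()]


def sum_counts(pairs):
    """Shared combiner/reducer logic: sum the counts per term."""
    totals = defaultdict(int)
    for term, count in pairs:
        totals[term] += count
    return sorted(totals.items())


def run_job(splits, combiner_passes):
    """Simulate one job; the combiner runs combiner_passes times per mapper."""
    shuffled = []
    for lines in splits:
        # Map phase for this (hypothetical) mapper.
        pairs = [pair for line in lines for pair in map_line(line)]
        # Node-local combine phase, possibly skipped or repeated.
        for _ in range(combiner_passes):
            pairs = sum_counts(pairs)
        shuffled.extend(pairs)
    # Reduce phase on everything that went through the shuffle.
    return dict(sum_counts(shuffled))


splits = [["a b a", "b c"], ["a c c"]]
results = [run_job(splits, passes) for passes in (0, 1, 2)]
print(results[0])  # {'a': 3, 'b': 2, 'c': 3}
```

All three runs give the same counts, because summing is associative and
commutative. That is why the reducer logic can double as the combiner here;
an operation like a plain average could not be reused that way.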
Best regards,
Jens