It might be cool to special-case a reduce on sorted input.
On Apr 18, 2006, at 12:28 PM, Doug Cutting wrote:
Stefan Groschupf wrote:
What is the reason that each job that has no mapper defined runs
the IdentityMapper?
Handling different formats (as discussed) between mapping and
reducing is difficult.
Having one job that just maps in one format and another
job that just reduces
in another format would be a nice workaround for the format
problem, but the IdentityMapper makes this workaround impossible.
Stefan,
I don't understand the problem here. Some map function is required
for any data to make it to reduce. IdentityMapper simply copies
all map input without altering it. How does this cause you
problems? Would you prefer a NullMapper by default, that does
nothing? That would result in no output sent to reduce.
Thanks,
Doug
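
To make the distinction in Doug's reply concrete, here is a minimal sketch of an identity mapper next to a "null" mapper. It deliberately does not use the Hadoop API: MapperSketch, SimpleMapper, identityMapper, nullMapper and runMapPhase are hypothetical names invented for illustration, and the map phase is simulated over an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical stand-ins for illustration only; these are NOT the Hadoop interfaces.
final class MapperSketch {

    /** A map function receives one input record and may emit any number of output records. */
    interface SimpleMapper<K, V> {
        void map(K key, V value, BiConsumer<K, V> output);
    }

    /** Identity behaviour: every input record is emitted unchanged. */
    static <K, V> SimpleMapper<K, V> identityMapper() {
        return (key, value, output) -> output.accept(key, value);
    }

    /** "Null" behaviour: input is consumed but nothing is emitted. */
    static <K, V> SimpleMapper<K, V> nullMapper() {
        return (key, value, output) -> { };
    }

    /** Simulated map phase: returns the records that would be handed on to reduce. */
    static <K, V> List<Map.Entry<K, V>> runMapPhase(List<Map.Entry<K, V>> input,
                                                     SimpleMapper<K, V> mapper) {
        List<Map.Entry<K, V>> collected = new ArrayList<>();
        for (Map.Entry<K, V> record : input) {
            mapper.map(record.getKey(), record.getValue(),
                       (k, v) -> collected.add(Map.entry(k, v)));
        }
        return collected;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> input =
                List.of(Map.entry("a", 1), Map.entry("b", 2));

        List<Map.Entry<String, Integer>> viaIdentity =
                runMapPhase(input, MapperSketch.<String, Integer>identityMapper());
        List<Map.Entry<String, Integer>> viaNull =
                runMapPhase(input, MapperSketch.<String, Integer>nullMapper());

        System.out.println("identity mapper passes on " + viaIdentity.size() + " records");
        System.out.println("null mapper passes on " + viaNull.size() + " records");
    }
}
```

With the identity mapper, reduce sees every input record. With the null mapper, it sees none, which is why a no-op default would leave reduce with nothing to do.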