Stefan Groschupf wrote:
What is the reason that each job that has no mapper defined runs the
IdentityMapper?
Handling different formats (as discussed) between mapping and reducing
is difficult.
Having one job that just maps in one format and another job that just
reduces in another format would be a nice workaround for the format
problem, but the IdentityMapper makes this workaround impossible.
Stefan,
I don't understand the problem here. Some map function is required for
any data to make it to reduce. IdentityMapper simply copies all map
input without altering it. How does this cause you problems? Would you
prefer a NullMapper by default, that does nothing? That would result in
no output sent to reduce.
Thanks,
Doug
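
To illustrate the distinction drawn in the reply above, here is a minimal sketch in plain Java. It does not use the real Hadoop classes: the Mapper interface, IDENTITY, NULL_MAPPER and runMapPhase are all hypothetical names invented for this example, and the "map phase" is only a loop over an in-memory array. It is meant to show why an identity map passes every record on to reduce while a do-nothing map passes none.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

public class MapperSketch {
    /** Hypothetical stand-in for a map function; not the Hadoop Mapper interface. */
    interface Mapper {
        void map(String key, String value, BiConsumer<String, String> output);
    }

    /** Copies every input record to the output unchanged. */
    static final Mapper IDENTITY = (k, v, out) -> out.accept(k, v);

    /** Emits nothing, so no record ever reaches the reduce side. */
    static final Mapper NULL_MAPPER = (k, v, out) -> { };

    /** Runs the mapper over all records and returns what would be sent to reduce. */
    static List<String> runMapPhase(Mapper mapper, String[][] input) {
        List<String> toReduce = new ArrayList<>();
        for (String[] rec : input) {
            mapper.map(rec[0], rec[1], (k, v) -> toReduce.add(k + "=" + v));
        }
        return toReduce;
    }

    public static void main(String[] args) {
        String[][] input = { {"a", "1"}, {"b", "2"} };
        System.out.println(runMapPhase(IDENTITY, input));    // prints [a=1, b=2]
        System.out.println(runMapPhase(NULL_MAPPER, input)); // prints []
    }
}
```

Under these assumptions, a job whose default mapper behaved like NULL_MAPPER would hand an empty input to its reducer, which is the outcome the reply warns about.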
- Re: IdentityMapper Doug Cutting