This is probably a simple question, but when I run my MR job I get
10 splits and therefore 10 output files named like part-x. Is there a way
to merge those outputs into a single file from within the currently running
MR job, or do I need to run another MR job to merge them?
Dennis Kubes
To generate a single output file, specify just a single reduce task. If
your reducer isn't doing much computation, then it might be faster to do
this in the original job; otherwise, use a subsequent job.
Doug
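To illustrate why a single reduce task yields a single output file, here is a minimal sketch in plain Java, with no Hadoop dependency. It assumes the job uses the default hash partitioning, where a key's reduce task is its non-negative hash code modulo the number of reduce tasks. The class and method names (SingleReducerSketch, partitionFor, assign) are made up for this example. In a real job the reducer count is set on the job configuration, for example with setNumReduceTasks(1).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: hypothetical class, not part of Hadoop.
class SingleReducerSketch {

    // Same arithmetic as default hash partitioning (an assumption about the job):
    // clear the sign bit, then take the remainder by the number of reduce tasks.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Groups keys by the reduce task, and hence the part file, they would land in.
    static Map<Integer, List<String>> assign(List<String> keys, int numReduceTasks) {
        Map<Integer, List<String>> parts = new TreeMap<>();
        for (String key : keys) {
            int partition = partitionFor(key, numReduceTasks);
            parts.computeIfAbsent(partition, p -> new ArrayList<>()).add(key);
        }
        return parts;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("apple", "banana", "cherry", "date", "elderberry");
        // With several reduce tasks the keys spread over several partitions.
        System.out.println("10 reduce tasks: " + assign(keys, 10));
        // With one reduce task every key maps to partition 0, so one part file.
        System.out.println("1 reduce task:   " + assign(keys, 1));
    }
}
```

The trade-off Doug describes follows from this: with one reduce task, all map output funnels through a single reducer, which is fine for light reducers but serializes heavy ones.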
I asked myself the same question when I first started with Hadoop, and I
think it's a good candidate for the FAQ ;)
Generally speaking, IMO Hadoop (the MapReduce part) needs some kind of
JobListener interface that would let users write custom callbacks
invoked at strategic moments of a Job's life, and