On 06/18/2012 10:40 AM, Mark Kerzner wrote:
that sounds very interesting, and I may implement such a workflow, but can I write back to HDFS in the mapper? In the reducer it is a standard context.write(), but it is a different context.
Both Mapper.Context and Reducer.Context descend from TaskInputOutputContext, which is where write() is defined, so both contexts emit their output through the same method.
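To make the symmetry concrete, here's a minimal sketch against the new (org.apache.hadoop.mapreduce) API; the class name and the key/value types are just illustrative, not anything from your job:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Pass-through mapper: the context.write() call here is the same
    // TaskInputOutputContext.write() a reducer uses.
    public class PassThroughMapper
            extends Mapper<LongWritable, Text, Text, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Emit directly from the mapper; with zero reducers this
            // record goes straight to the job's OutputFormat.
            context.write(new Text(key.toString()), value);
        }
    }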
If you don't have a Reducer -- only Mappers doing fully parallel processing -- set the number of reduce tasks to zero when you configure the job. The framework then treats map output as the final output and writes it through the job's configured OutputFormat, exactly as your reducer context does with reducer output today.
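A rough driver sketch for that map-only setup; MapOnlyJob and the argv paths are placeholders, it reuses the PassThroughMapper sketch above, and I'm assuming the newer Job.getInstance factory (on older releases, new Job(conf, name) is the equivalent):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

    public class MapOnlyJob {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "map-only example");
            job.setJarByClass(MapOnlyJob.class);
            job.setMapperClass(PassThroughMapper.class);

            // Zero reducers makes this a map-only job: each mapper's
            // context.write() goes straight through the OutputFormat
            // to HDFS.
            job.setNumReduceTasks(0);

            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            job.setOutputFormatClass(TextOutputFormat.class);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }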