Re: How to retrieve the reducer output file names?

Tarandeep Singh Sat, 12 Sep 2009 14:21:10 -0700

The output of the mappers is partitioned. Each partition is given a number
starting from 0, and each reducer works on exactly one of these partitions.
In the configure method of your reducer code, you can get the partition
number with:
jobConf.getInt("mapred.task.partition", 0);

If you use the default output format, then the reducer working on partition
0 will output part-00000, the reducer working on partition 1 will output
part-00001, and so on.
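
To make that mapping concrete, here is a minimal sketch in plain Java (no
Hadoop classes) that converts a partition number to the file name described
above and back again. The class and method names (PartFileNames, fileNameFor,
partitionOf) are made up for illustration, and the five-digit zero padding is
inferred from the part-00000 / part-00001 examples, so treat it as an
assumption and check it against the files your job actually writes.

```java
import java.util.Locale;

// Hypothetical helper, not part of Hadoop: maps between a reducer's
// partition number and the default-style output file name.
final class PartFileNames {
    // Prefix taken from the examples in this message.
    private static final String PREFIX = "part-";

    private PartFileNames() {}

    // Partition 0 -> "part-00000", partition 1 -> "part-00001", ...
    // Assumes five-digit zero padding, as in the examples above.
    static String fileNameFor(int partition) {
        if (partition < 0) {
            throw new IllegalArgumentException("partition must be >= 0: " + partition);
        }
        return String.format(Locale.ROOT, "%s%05d", PREFIX, partition);
    }

    // "part-00012" -> 12. Rejects names that do not start with the prefix.
    static int partitionOf(String fileName) {
        if (!fileName.startsWith(PREFIX)) {
            throw new IllegalArgumentException("not a part file: " + fileName);
        }
        return Integer.parseInt(fileName.substring(PREFIX.length()));
    }
}
```

Inside a reducer you would pass the value read from "mapred.task.partition"
to fileNameFor; when listing the output directory afterwards, partitionOf
tells you which reducer produced a given file.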

You can extend TextOutputFormat or SequenceFileOutputFormat (depending on
which output format you are using) and change the file name from part-xxxxx
to something else.

Hope this helps,
Tarandeep

On Sat, Sep 12, 2009 at 1:39 PM, Richard G <[email protected]> wrote:

>
> Hi,
>
> For my application, I need to retrieve the output file name for each
> reducer. But is there any convenient way to do that? I also want to know
> which file is coming from which reducer. So simple enumeration in output
> directory doesn't work for me.
>
> Thank you!
> --
> View this message in context:
> http://www.nabble.com/How-to-retrieve-the-reducer-output-file-names--tp25418039p25418039.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
>
>
