Myles Grant wrote:
I would like the values for a key to exist in a single file, and only the values for that key.
Reducer.reduce() is invoked exactly once per key, together with all the values associated with that key:
Reducer.reduce(key, <value1, value2, value3, ...>);
So what I suggested should help you generate one file per key. Since you have an iterator over all the values associated with that key, there is little left to do, and since the input to the reducer is sorted, you can be sure that all the values for the key are passed to a single Reducer.reduce() call.
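As a rough illustration of the idea, here is a minimal sketch in plain Java. It does not use the Hadoop API: the class PerKeyFileWriter, its outputDir field, and the String key/value types are hypothetical stand-ins chosen so the example runs on its own. In a real job the same loop would sit inside your reducer's reduce() method, and the file would presumably be created through Hadoop's file system classes rather than java.nio. The sketch also assumes the key is safe to use as a file name.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;

/**
 * Hypothetical helper (not part of Hadoop): writes all values for one key
 * into a file named after that key.
 */
final class PerKeyFileWriter {
    private final Path outputDir;

    PerKeyFileWriter(Path outputDir) throws IOException {
        // Create the output directory if it does not exist yet.
        this.outputDir = Files.createDirectories(outputDir);
    }

    /**
     * Mirrors the shape of reduce(key, values): called once per key with an
     * iterator over every value for that key, so no bookkeeping is needed.
     */
    Path reduce(String key, Iterator<String> values) throws IOException {
        // Assumption: the key contains no path separators or other
        // characters that are invalid in a file name.
        Path file = outputDir.resolve(key);
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            while (values.hasNext()) {
                out.write(values.next());
                out.newLine();
            }
        }
        return file;
    }
}
```

Calling reduce("apple", values) on an instance created with some output directory would leave a file named "apple" in that directory, with one value per line. A binary variant could follow the same pattern with an OutputStream in place of the BufferedWriter.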
Amar
Each reduced key/value would get its own file. If I understand correctly, all output of the reducers is written to a single file.

-Myles

On Jan 16, 2008, at 9:29 PM, Amar Kamat wrote:

Hi,
Why couldn't you just write this logic in your reducer class? The reduce method [reduceClass.reduce()] is invoked with a key and an iterator over the values associated with that key. Since the input to the reducer is sorted, you can simply dump the values into a file, i.e. no bookkeeping is required. I think this is what you wanted, no?
Myles Grant wrote:
Hello,

I'd like my reduce tasks to each output a single file per key, containing the value. Each file would be named with the key. It appears that I need to (at least) create a new OutputFormat and possibly a RecordWriter. As doing this would likely involve a lot of trial and error on my part, I was curious whether someone had implemented this already and would like to share. I will eventually need versions that write both text files and binary files.

Short of a full existing implementation that I can steal, how about some hints?

Cheers,
Myles


