Myles Grant wrote:
I would like the values for a key to exist in a single file, and only
the values for that key.
Reducer.reduce() gets invoked exactly once per key, along with all the
values associated with that key:
Reducer.reduce(key, <value1, value2, value3, ...>);
So what I suggested should help you generate one file per key. Since you
have an iterator over all the values associated with that key, you don't
have to do much, and since the input to the reducer is sorted, you can be
sure that all the values for the key are passed to a single
Reducer.reduce() call.
Amar
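As a rough illustration of the approach Amar describes, here is a minimal
sketch in plain Java. It does not use the Hadoop API: the class name
PerKeyFileReducer, the outputDir parameter, and the use of String keys and
values are all assumptions made for the example. It only mimics the
reduce(key, values) contract and writes the values for each key to a file
named after that key. In a real job the file would presumably be created
through Hadoop's FileSystem rather than java.nio, and keys containing
characters such as '/' would need to be escaped first.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for a reducer class; not part of Hadoop.
class PerKeyFileReducer {

    // Mimics the reduce(key, values) contract: called once per key with an
    // iterator over every value for that key. All values are written, one
    // per line, to a file named after the key inside outputDir.
    static void reduce(Path outputDir, String key, Iterator<String> values)
            throws IOException {
        Path target = outputDir.resolve(key);
        try (BufferedWriter out =
                 Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            while (values.hasNext()) {
                out.write(values.next());
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate the framework calling reduce() once for each key.
        Path outputDir = Files.createTempDirectory("per-key-output");
        reduce(outputDir, "apple", Arrays.asList("1", "2", "3").iterator());
        reduce(outputDir, "banana", Arrays.asList("4").iterator());
        System.out.println("Wrote one file per key under " + outputDir);
    }
}
```

Because each call sees every value for its key, the file can be opened
and closed within that one call, which is why no bookkeeping across calls
is needed.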
Each reduced key/value would get its own file. If I understand
correctly, all output of the reducers is written to a single file.
-Myles
On Jan 16, 2008, at 9:29 PM, Amar Kamat wrote:
Hi,
Why couldn't you just write this logic in your reducer class? The
reduce method [reduceClass.reduce()] is invoked with a key and an
iterator over the values associated with the key. Since the input to
the reducer is sorted, you can simply dump the values into a file,
i.e. no bookkeeping is required.
I think this is what you wanted, no?
Myles Grant wrote:
Hello,
I'd like my reduce tasks to each output a single file per key,
containing the value. Each file would be named with the key. It
appears that I need to (at least) create a new OutputFormat and
possibly a RecordWriter. As doing this would likely involve a lot
of trial and error on my part, I was curious if someone had
implemented this already and would like to share. I will be needing
both versions that write text files and binary files eventually.
Short of a full existing implementation that I can steal, how about
some hints?
Cheers,
Myles