Re: Single output file per reduce key?

2008-01-16 Thread Amar Kamat

Hi,
Why couldn't you just write this logic in your reducer class? The reduce 
[reduceClass.reduce()] method is invoked with a key and an iterator over 
the values associated with that key, so you can simply dump the values 
into a file. Since the input to the reducer is sorted, all values for a 
key arrive together in one call, i.e., no bookkeeping is required. I 
think this is what you wanted, no?
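
A minimal sketch of that idea, in plain Java and deliberately without the Hadoop API: the class name PerKeyFileReducer, the outputDir parameter, and the use of the local filesystem are all assumptions made for illustration. It only shows the shape of the approach, where each reduce() call opens a file named after the key and writes that key's values into it.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a reducer class; it does not use the Hadoop API.
class PerKeyFileReducer {

    // Called once per key with an iterator over that key's values,
    // mirroring the shape of the reduce() call described above.
    static Path reduce(String key, Iterator<String> values, Path outputDir) throws IOException {
        // Assumption: the key is already safe to use as a file name.
        Path file = outputDir.resolve(key);
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            while (values.hasNext()) {
                out.write(values.next());
                out.newLine();
            }
        }
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path outputDir = Files.createTempDirectory("per-key-output");

        // A TreeMap iterates its keys in sorted order, similar to reducer input.
        Map<String, List<String>> grouped = new TreeMap<>();
        grouped.put("apple", Arrays.asList("1", "2"));
        grouped.put("banana", Arrays.asList("3"));

        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            Path written = reduce(entry.getKey(), entry.getValue().iterator(), outputDir);
            System.out.println("wrote " + written);
        }
    }
}
```

In a real job the file would presumably be created through Hadoop's FileSystem API on the job's output path rather than on local disk, and keys containing characters such as "/" would need escaping before being used as file names.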

Myles Grant wrote:

Hello,

I'd like my reduce tasks to each output a single file per key, 
containing the value. Each file would be named with the key.  It 
appears that I need to (at least) create a new OutputFormat and 
possibly a RecordWriter.  As doing this would likely involve a lot of 
trial and error on my part, I was curious if someone had implemented 
this already and would like to share.  I will eventually need both 
versions: one that writes text files and one that writes binary files.


Short of a full existing implementation that I can steal, how about some 
hints?


Cheers,
Myles




Re: Single output file per reduce key?

2008-01-16 Thread Myles Grant
I would like the values for a key to exist in a single file, and only  
the values for that key.  Each reduced key/value would get its own  
file.  If I understand correctly, all output of the reducers is  
written to a single file.


-Myles

On Jan 16, 2008, at 9:29 PM, Amar Kamat wrote:


Hi,
Why couldn't you just write this logic in your reducer class? The  
reduce [reduceClass.reduce()] method is invoked with a key and an  
iterator over the values associated with that key, so you can simply  
dump the values into a file. Since the input to the reducer is  
sorted, all values for a key arrive together in one call, i.e., no  
bookkeeping is required. I think this is what you wanted, no?

Myles Grant wrote:

Hello,

I'd like my reduce tasks to each output a single file per key,  
containing the value. Each file would be named with the key.  It  
appears that I need to (at least) create a new OutputFormat and  
possibly a RecordWriter.  As doing this would likely involve a lot  
of trial and error on my part, I was curious if someone had  
implemented this already and would like to share.  I will  
eventually need both versions: one that writes text files and one  
that writes binary files.


Short of a full existing implementation that I can steal, how about  
some hints?


Cheers,
Myles