Hi,

another question about writing Hadoop jobs with Avro. I want to implement a basic shuffle and file aggregation: mappers emit their input with random keys, and reducers just write their values to disk, so the number of reducers determines how many files I get in the result. A rough sketch of what I'm after is at the end of this mail.

The mapred documentation on jobs where both input and output are Avro says:

> Subclass AvroMapper and specify this as your job's mapper with [...]

However, AvroMapper only seems to support input and output values, not keys. Did I miss the obvious here?

Thanks,
Markus

PS: Ideally, I'd implement the shuffle without ever deserializing the data, which should be possible. But that is the next step.
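For reference, here is roughly the job I have in mind, written against the plain (non-Avro) mapred API with Writables; the class names are just placeholders, not actual code I'm running:

import java.io.IOException;
import java.util.Iterator;
import java.util.Random;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class RandomShuffle {

  // Tag each record with a random key so the partitioner spreads the
  // records evenly over the reducers.
  public static class ShuffleMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, IntWritable, Text> {
    private final Random random = new Random();
    private final IntWritable outKey = new IntWritable();

    public void map(LongWritable offset, Text record,
                    OutputCollector<IntWritable, Text> output, Reporter reporter)
        throws IOException {
      outKey.set(random.nextInt(Integer.MAX_VALUE));
      output.collect(outKey, record);
    }
  }

  // Reducers drop the random key and just write the records back out;
  // one output file per reducer, so the reducer count controls the
  // number of result files.
  public static class ShuffleReducer extends MapReduceBase
      implements Reducer<IntWritable, Text, NullWritable, Text> {
    public void reduce(IntWritable key, Iterator<Text> records,
                       OutputCollector<NullWritable, Text> output, Reporter reporter)
        throws IOException {
      while (records.hasNext()) {
        output.collect(NullWritable.get(), records.next());
      }
    }
  }
}

What I can't see is how to express the same thing with AvroMapper, since its map() only takes a collector for a single output datum rather than a key/value pair.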
