Hi,

Another question about writing Hadoop jobs using Avro. I want to implement a
basic shuffle and file aggregation: mappers emit their input with random
keys, and reducers just write their input to disk. The number of reducers
then determines how many files I get in the result. The mapred documentation
on jobs whose input and output are both Avro says:

> Subclass AvroMapper and specify this as your job's mapper with [...]

However, AvroMapper only seems to support specifying input and output
values, not keys. Am I missing something obvious here?
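
For reference, here is the kind of job I have in mind, sketched against the
org.apache.avro.mapred API as I understand it. The class names and the
inline schema are just placeholders, and I am guessing that a Pair map
output schema is the intended way to smuggle a key out of an AvroMapper:

import java.io.IOException;
import java.util.Random;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.mapred.AvroCollector;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroMapper;
import org.apache.avro.mapred.AvroReducer;
import org.apache.avro.mapred.Pair;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Reporter;

public class AvroShuffle {

  // Mapper: wrap each datum in a Pair with a random long key, so the
  // shuffle spreads the records evenly across reducers.
  public static class ShuffleMapper
      extends AvroMapper<GenericRecord, Pair<Long, GenericRecord>> {
    private final Random random = new Random();

    @Override
    public void map(GenericRecord datum,
                    AvroCollector<Pair<Long, GenericRecord>> collector,
                    Reporter reporter) throws IOException {
      collector.collect(
          new Pair<Long, GenericRecord>(random.nextLong(), datum));
    }
  }

  // Reducer: discard the random key and write the values back out.
  public static class PassThroughReducer
      extends AvroReducer<Long, GenericRecord, GenericRecord> {
    @Override
    public void reduce(Long key, Iterable<GenericRecord> values,
                       AvroCollector<GenericRecord> collector,
                       Reporter reporter) throws IOException {
      for (GenericRecord value : values)
        collector.collect(value);
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(AvroShuffle.class);
    conf.setJobName("avro-shuffle");

    // Placeholder schema; the real job would use the schema the input
    // files were written with.
    Schema dataSchema = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Datum\",\"fields\":"
        + "[{\"name\":\"payload\",\"type\":\"string\"}]}");

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    AvroJob.setInputSchema(conf, dataSchema);
    // Intermediate schema is a Pair of the random key and the datum.
    AvroJob.setMapOutputSchema(conf,
        Pair.getPairSchema(Schema.create(Schema.Type.LONG), dataSchema));
    AvroJob.setOutputSchema(conf, dataSchema);

    AvroJob.setMapperClass(conf, ShuffleMapper.class);
    AvroJob.setReducerClass(conf, PassThroughReducer.class);

    // The number of reduce tasks determines the number of output files.
    conf.setNumReduceTasks(4);

    JobClient.runJob(conf);
  }
}

If that works, setting the number of reduce tasks to the desired file count
should give me exactly the aggregation described above.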

Thanks,

Markus

PS: Ideally, I'd implement the shuffle without ever deserializing the data, 
which should be possible. But that is the next step.
