Hi,

another question about writing Hadoop jobs with Avro. I want to implement a basic shuffle and file aggregation: mappers emit their input with random keys, and reducers just write their values to disk, so the number of reducers determines how many files I get in the result. A rough sketch of what I'm after is at the end of this mail.

The mapred documentation on jobs where both input and output are Avro says:

> Subclass AvroMapper and specify this as your job's mapper with [...]

However, AvroMapper only seems to support input and output values, not keys. Did I miss the obvious here?

Thanks,
Markus

PS: Ideally, I'd implement the shuffle without ever deserializing the data, which should be possible. But that is the next step.
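For reference, here is roughly the job I have in mind, written against the plain (non-Avro) mapred API with Writables; the class names are just placeholders, not actual code I'm running:

import java.io.IOException;
import java.util.Iterator;
import java.util.Random;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class RandomShuffle {

  // Tag each record with a random key so the partitioner spreads the
  // records evenly over the reducers.
  public static class ShuffleMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, IntWritable, Text> {
    private final Random random = new Random();
    private final IntWritable outKey = new IntWritable();

    public void map(LongWritable offset, Text record,
                    OutputCollector<IntWritable, Text> output, Reporter reporter)
        throws IOException {
      outKey.set(random.nextInt(Integer.MAX_VALUE));
      output.collect(outKey, record);
    }
  }

  // Reducers drop the random key and just write the records back out;
  // one output file per reducer, so the reducer count controls the
  // number of result files.
  public static class ShuffleReducer extends MapReduceBase
      implements Reducer<IntWritable, Text, NullWritable, Text> {
    public void reduce(IntWritable key, Iterator<Text> records,
                       OutputCollector<NullWritable, Text> output, Reporter reporter)
        throws IOException {
      while (records.hasNext()) {
        output.collect(NullWritable.get(), records.next());
      }
    }
  }
}

What I can't see is how to express the same thing with AvroMapper, since its map() only takes a collector for a single output datum rather than a key/value pair.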
