Hi Doug,

I seem to have hit a case not covered by the mapred package documentation:
I'd like to read from a TextInputFormat and produce Avro data in a
map-only job. How do I do that?

In short, the way to do this is to:
- use an org.apache.hadoop.mapred.Mapper<K,V,AvroWrapper<O>,NullWritable>
- call AvroJob.setOutputSchema(job,schema) with O's schema
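
As a rough sketch of those two steps, assuming O is a plain Avro string (Utf8) and each input line is written out unchanged. The class names TextToAvro and LineMapper, the job name, and the configure() helper are made up for illustration, and setting the number of reduce tasks to zero is my assumption about how the job is made map-only:

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroJob;
import org.apache.avro.mapred.AvroWrapper;
import org.apache.avro.util.Utf8;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;

public class TextToAvro {

  // Hypothetical mapper: wraps each text line in an AvroWrapper and
  // emits it with a NullWritable value, as described above.
  public static class LineMapper extends MapReduceBase
      implements Mapper<LongWritable, Text, AvroWrapper<Utf8>, NullWritable> {
    public void map(LongWritable offset, Text line,
                    OutputCollector<AvroWrapper<Utf8>, NullWritable> out,
                    Reporter reporter) throws IOException {
      out.collect(new AvroWrapper<Utf8>(new Utf8(line.toString())),
                  NullWritable.get());
    }
  }

  // Hypothetical helper that builds the job configuration.
  public static JobConf configure(Path in, Path out) {
    JobConf job = new JobConf(TextToAvro.class);
    job.setJobName("text-to-avro");
    job.setInputFormat(TextInputFormat.class);   // plain text in
    job.setMapperClass(LineMapper.class);
    job.setNumReduceTasks(0);                    // assumption: map-only
    // O's schema; here simply an Avro string.
    AvroJob.setOutputSchema(job, Schema.create(Schema.Type.STRING));
    FileInputFormat.setInputPaths(job, in);
    FileOutputFormat.setOutputPath(job, out);
    return job;
  }

  public static void main(String[] args) throws IOException {
    JobClient.runJob(configure(new Path(args[0]), new Path(args[1])));
  }
}
```

For a record type, the same shape should apply with AvroWrapper<GenericRecord> (or a specific class) and the record's schema passed to setOutputSchema.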

Does that make sense? If that works for you, I can add it to the javadoc.

Yes, it worked. Incidentally, it also reduced my file size to 33% of my previous custom-avro-writable-in-sequence-file approach.

Thanks,

Markus
