Hi,

I use my own set of classes for this, mostly copied from / modeled after the 
Avro mapred support for the old API.

My approach is slightly different, though. The existing MR support fully 
abstracts / wraps away the Hadoop MR API and only exposes the Avro one. The 
only Hadoop API that the Avro classes see is the Configuration object. The 
problem is that in the new API, the Configuration object is kept within a 
context instance, so you'd need to wrap the whole context and give the wrapper 
to the Avro mapper and reducer. This felt like overkill, so I chose to just 
make mapper and reducer subclasses that handle the Avro work and then call a 
protected method to do the actual mapping or reducing. The downside is that 
you lose the property of a bare mapper or reducer being the identity function, 
but I think you could reintroduce this in a generic way. I don't use the 
identity functions much in practice, so I didn't bother.
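
A rough, dependency-free sketch of the subclass idea described above. It is 
not the code from the repository and does not use the real Hadoop or Avro 
classes: FrameworkMapper, Wrapped, AvroStyleMapper, doMap and the other names 
are hypothetical stand-ins, chosen only to show the shape of "base class does 
the unwrapping, protected method does the work". The IdentityMapper at the end 
is one possible way the identity behaviour might be reintroduced generically.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

// Tiny driver standing in for the framework; hypothetical, for illustration only.
class AvroMapperSketch {
    static <IN, OUT> List<OUT> runAll(AvroStyleMapper<IN, OUT> mapper, List<IN> inputs) {
        List<OUT> outputs = new ArrayList<>();
        for (IN in : inputs) {
            // The "context" here is just a callback that collects unwrapped output.
            mapper.map(new Wrapped<>(in), null, (k, v) -> outputs.add(k.datum()));
        }
        return outputs;
    }

    public static void main(String[] args) {
        System.out.println(runAll(new UpperCaseMapper(), List.of("avro", "hadoop")));
        System.out.println(runAll(new IdentityMapper<String>(), List.of("avro")));
    }
}

// Stand-in for an Avro wrapper type that carries a datum (assumption, not the Avro API).
final class Wrapped<T> {
    private final T datum;
    Wrapped(T datum) { this.datum = datum; }
    T datum() { return datum; }
}

// Stand-in for a new-API style mapper: the framework calls map() with a context.
abstract class FrameworkMapper<KIN, VIN, KOUT, VOUT> {
    protected abstract void map(KIN key, VIN value, BiConsumer<KOUT, VOUT> context);
}

// Base class that handles the wrapping work, then delegates to a protected method.
abstract class AvroStyleMapper<IN, OUT>
        extends FrameworkMapper<Wrapped<IN>, Void, Wrapped<OUT>, Void> {
    @Override
    protected final void map(Wrapped<IN> key, Void value,
                             BiConsumer<Wrapped<OUT>, Void> context) {
        // Unwrap the input datum and re-wrap whatever the subclass emits.
        doMap(key.datum(), out -> context.accept(new Wrapped<>(out), null));
    }

    // Subclasses implement only the actual mapping logic.
    protected abstract void doMap(IN datum, Consumer<OUT> collector);
}

// Example user mapper: sees plain datums only.
class UpperCaseMapper extends AvroStyleMapper<String, String> {
    @Override
    protected void doMap(String datum, Consumer<String> collector) {
        collector.accept(datum.toUpperCase());
    }
}

// Generic identity mapper: passes each datum through unchanged.
class IdentityMapper<T> extends AvroStyleMapper<T, T> {
    @Override
    protected void doMap(T datum, Consumer<T> collector) {
        collector.accept(datum);
    }
}
```

In this sketch the user-facing classes never touch the context directly, which 
avoids wrapping the whole context; the cost is that the base class has no 
default behaviour, hence the separate IdentityMapper.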

I pushed the code here: https://github.com/friso/avro-mapreduce. There is a 
unit test with some usage examples.


Cheers,
Friso



On 11 Nov 2011, at 20:43, Doug Cutting wrote:

On 11/10/2011 12:38 AM, Andrew Kenworthy wrote:
Are there plans to extend it to work with org.apache.hadoop.mapreduce as
well?

There's an issue in Jira for this:

https://issues.apache.org/jira/browse/AVRO-593

I don't know of anyone actively working on this at present.  It would be
a great addition to Avro and I am hopeful someone will resume work on it
soon.

Doug
