Cooper, Chris wrote:
I’m using Avro’s reflection API to publish domain objects from Hadoop through messaging middleware. The subscribers of my data are interested in different subsets of the objects. Instead of having them use a version of my original object, is it possible for them to define a totally different object (different namespace/name) that has a small subset of the original object’s properties (matching names/primitive types) and then deserialize to that object?

Yes, this is possible.

The simplest way to do this would be to use the generic data representation, and simply specify a subset of the original object's schema, i.e., remove fields you're not interested in.
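
As a rough illustration of the generic approach, here is a minimal sketch. The Customer record, its fields, and the class and method names are hypothetical and only there to make the example self-contained. It assumes the subscriber also has the writer's schema, since Avro's binary encoding cannot be decoded without it. The reader schema keeps the same record name here because the Avro specification matches records by name during schema resolution; a subscriber that wants a different name would normally add an alias to its schema.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

class SubsetGenericExample {
    // Full schema used by the publisher (hypothetical record and fields).
    static final Schema WRITER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Customer\",\"namespace\":\"example\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"long\"},"
        + "{\"name\":\"name\",\"type\":\"string\"},"
        + "{\"name\":\"email\",\"type\":\"string\"}]}");

    // Subscriber's schema: the same record with the unwanted field removed.
    // A separate Parser is used because one Parser will not redefine a name.
    static final Schema READER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Customer\",\"namespace\":\"example\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"long\"},"
        + "{\"name\":\"name\",\"type\":\"string\"}]}");

    // Publisher side: encode a full record with the writer schema.
    static byte[] publish(GenericRecord record) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(WRITER).write(record, encoder);
        encoder.flush();
        return out.toByteArray();
    }

    // Subscriber side: decode with (writer, reader) so fields that are
    // missing from the reader schema are skipped during resolution.
    static GenericRecord subscribe(byte[] bytes) throws IOException {
        GenericDatumReader<GenericRecord> reader =
            new GenericDatumReader<>(WRITER, READER);
        return reader.read(null, DecoderFactory.get().binaryDecoder(bytes, null));
    }
}
```

The record returned by subscribe() carries only id and name; the email bytes are read past and discarded. String values come back as Avro's Utf8 type by default, so call toString() before comparing them with a java.lang.String.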

If you need to have a distinct Java class that corresponds to each subset (rather than a GenericRecord instance) then you could do this using the specific or reflect data representations, but it'll be more work. You'd need to subclass SpecificDatumReader or ReflectDatumReader and override the #newRecord() method to create the class you want to use to represent your subset schema. This should work, but, since I don't think anyone has tried it before, there may be ways we can improve Avro to make this easier. So please tell us how it goes.
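
A hedged sketch of the reflect variant follows. It is untested against any particular Avro release and differs from the description above in one respect: instead of overriding newRecord() on the reader, it overrides newRecord(Object, Schema) on a ReflectData subclass that is passed to ReflectDatumReader, on the assumption that recent releases delegate record creation to the data model. Customer, CustomerSummary, SummaryData, and SubsetReflectExample are hypothetical names, and the override is only suitable for a flat schema with a single record type.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.reflect.ReflectData;
import org.apache.avro.reflect.ReflectDatumReader;
import org.apache.avro.reflect.ReflectDatumWriter;

// Publisher's full domain object (hypothetical).
class Customer {
    long id;
    String name;
    String email;
}

// Subscriber's own class: a different name, and a subset of the fields
// with matching names and types.
class CustomerSummary {
    long id;
    String name;
}

// Data model that hands the reader a CustomerSummary for every record.
// Assumption: the schema is flat, so there is only one record type to create.
class SummaryData extends ReflectData {
    @Override
    public Object newRecord(Object old, Schema schema) {
        return (old instanceof CustomerSummary) ? old : new CustomerSummary();
    }
}

class SubsetReflectExample {
    // Publisher side: encode the full object using its reflected schema.
    static byte[] publish(Customer customer) throws IOException {
        Schema writer = ReflectData.get().getSchema(Customer.class);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new ReflectDatumWriter<Customer>(writer).write(customer, encoder);
        encoder.flush();
        return out.toByteArray();
    }

    // Subscriber side. The writer schema is derived from Customer only to
    // keep the sketch self-contained; a real subscriber would obtain it
    // from the publisher.
    static CustomerSummary subscribe(byte[] bytes) throws IOException {
        Schema writer = ReflectData.get().getSchema(Customer.class);
        // Subset schema with the writer's record name and two of its fields.
        Schema reader = SchemaBuilder.record(writer.getFullName())
            .fields()
            .requiredLong("id")
            .requiredString("name")
            .endRecord();
        ReflectDatumReader<CustomerSummary> datumReader =
            new ReflectDatumReader<>(writer, reader, new SummaryData());
        return datumReader.read(null, DecoderFactory.get().binaryDecoder(bytes, null));
    }
}
```

The reader schema decides which fields are decoded, and the data model decides which Java class receives them, so the subscriber's class never needs to share a name or namespace with the publisher's.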

Doug