Cooper, Chris wrote:
> I’m using Avro’s reflect API to publish domain objects from
> Hadoop through messaging middleware. The subscribers of my data are
> interested in different subsets of the objects. Instead of them having
> to use a version of my original object, is it possible for them to
> define a totally different object (different namespace/name) that has a
> small subset of the original object's properties (matching
> names/primitive types) and then deserialize to that object?
Yes, this is possible.
The simplest way to do this would be to use the generic data
representation and read with a subset of the original object's
schema, i.e., the same schema with the fields you're not interested
in removed.
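A minimal sketch of the generic approach, under some assumptions: the record name "Order", its fields, and the class and method names are all hypothetical, and the writer's schema is spelled out as JSON here only to keep the sketch self-contained (in your case it would presumably come from the reflect API on the publishing side). Note that the subscriber still needs the writer's schema in order to decode; the subset schema only controls which fields are kept.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class SubsetGenericExample {
    // Hypothetical full schema used by the publisher.
    static final Schema WRITER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Order\",\"namespace\":\"example\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"long\"},"
        + "{\"name\":\"customer\",\"type\":\"string\"},"
        + "{\"name\":\"total\",\"type\":\"double\"}]}");

    // The subscriber's view: same record, with the "customer" field removed.
    static final Schema SUBSET = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Order\",\"namespace\":\"example\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"long\"},"
        + "{\"name\":\"total\",\"type\":\"double\"}]}");

    // Publisher side: serialize a full record with the writer's schema.
    static byte[] publish(long id, String customer, double total) throws IOException {
        GenericRecord record = new GenericData.Record(WRITER);
        record.put("id", id);
        record.put("customer", customer);
        record.put("total", total);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(WRITER).write(record, encoder);
        encoder.flush();
        return out.toByteArray();
    }

    // Subscriber side: decode with the writer's schema, project onto the subset.
    static GenericRecord readSubset(byte[] bytes) throws IOException {
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        GenericDatumReader<GenericRecord> reader =
            new GenericDatumReader<GenericRecord>(WRITER, SUBSET);
        return reader.read(null, decoder);
    }
}
```

The record returned by readSubset carries only "id" and "total"; the "customer" value is skipped while decoding.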
If you need to have a distinct Java class that corresponds to each
subset (rather than a GenericRecord instance) then you could do this
using the specific or reflect data representations, but it'll be more
work. You'd need to subclass SpecificDatumReader or ReflectDatumReader
and override the #newRecord() method to create the class you want to use
to represent your subset schema. This should work, but, since I don't
think anyone has tried it before, there may be ways we can improve Avro
to make this easier. So please tell us how it goes.
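A rough sketch of the reflect variant, with several assumptions labeled: OrderSummary, the "Order" schema, and the method names are hypothetical. It also departs from the description above in one respect: in the Avro releases I'm aware of, the reader's newRecord() is deprecated and the same hook lives on GenericData (and so on ReflectData), so this sketch overrides it there and passes the customized ReflectData to a plain ReflectDatumReader. Treat it as untested against your Avro version.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.reflect.ReflectData;
import org.apache.avro.reflect.ReflectDatumReader;

public class SubsetReflectExample {
    // Hypothetical subscriber class; field names and types match the subset schema.
    public static class OrderSummary {
        public long id;
        public double total;
        public OrderSummary() {}
    }

    // Hypothetical full schema used by the publisher.
    static final Schema WRITER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Order\",\"namespace\":\"example\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"long\"},"
        + "{\"name\":\"customer\",\"type\":\"string\"},"
        + "{\"name\":\"total\",\"type\":\"double\"}]}");

    // Subset schema: same record with "customer" removed.
    static final Schema SUBSET = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Order\",\"namespace\":\"example\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"long\"},"
        + "{\"name\":\"total\",\"type\":\"double\"}]}");

    // ReflectData that represents every "Order" record as an OrderSummary.
    static final ReflectData SUBSCRIBER_DATA = new ReflectData() {
        @Override
        public Object newRecord(Object old, Schema schema) {
            if ("Order".equals(schema.getName())) {
                return (old instanceof OrderSummary) ? old : new OrderSummary();
            }
            return super.newRecord(old, schema);
        }
    };

    // Publisher side, written generically here just to produce some bytes.
    static byte[] publish(long id, String customer, double total) throws IOException {
        GenericRecord record = new GenericData.Record(WRITER);
        record.put("id", id);
        record.put("customer", customer);
        record.put("total", total);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(WRITER).write(record, encoder);
        encoder.flush();
        return out.toByteArray();
    }

    // Subscriber side: fields are set on OrderSummary by name via reflection.
    static OrderSummary readSubset(byte[] bytes) throws IOException {
        ReflectDatumReader<OrderSummary> reader =
            new ReflectDatumReader<OrderSummary>(WRITER, SUBSET, SUBSCRIBER_DATA);
        return reader.read(null, DecoderFactory.get().binaryDecoder(bytes, null));
    }
}
```

The subset schema keeps the original record name so that schema resolution has nothing to reconcile; only the Java class chosen to hold the data differs.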
Doug