On 09/22/2011 11:23 AM, Matt Pouttu-Clarke wrote:
> On our project we use the schema evolution features of Avro, and we also
> need to select data without knowing the Schema(s) a priori. In other
> words we have a large number of Avro files with an evolving schema, and
> we may project with any historical schema, or not project any schema at
> all. In the case of not being able to specify a specific schema, we
> could not find a RecordReader which supports GenericDatumReader.

SpecificDatumReader extends GenericDatumReader. If no class is defined corresponding to a record, then SpecificDatumReader will produce a GenericRecord.

The only case where this might not be what you want is if you have a specific, generated class loaded but wish to force the use of a GenericRecord, e.g., if the schema of the data written includes fields that are not in the loaded class and you wish to read those fields.

In that case, perhaps we should add a feature permitting one to force the use of GenericDatumReader in AvroInputFormat? If so, please file an issue in Jira.

Doug
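
P.S. For reading a data file outside of MapReduce without knowing the schema up front, something like the following should work. It is only a minimal sketch, assuming the Avro Java library is on the classpath. The class name GenericReadSketch, the "Event" schema, and the temporary file are made up for illustration. The reader is given no schema at all, so it uses the writer's schema stored in the file header and returns GenericRecord instances.

```java
import java.io.File;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class GenericReadSketch {
    // Hypothetical schema, used only to produce a sample file to read back.
    static final String SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"Event\",\"namespace\":\"example\","
        + "\"fields\":[{\"name\":\"id\",\"type\":\"long\"},"
        + "{\"name\":\"note\",\"type\":\"string\"}]}";

    // Writes a one-record Avro data file to a temporary location.
    static File writeSample() throws IOException {
        Schema schema = new Schema.Parser().parse(SCHEMA_JSON);
        File file = File.createTempFile("events", ".avro");
        file.deleteOnExit();

        GenericRecord record = new GenericData.Record(schema);
        record.put("id", 1L);
        record.put("note", "hello");

        try (DataFileWriter<GenericRecord> writer =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
            writer.create(schema, file);
            writer.append(record);
        }
        return file;
    }

    // Reads the first record without supplying any schema: the
    // GenericDatumReader picks up the writer's schema from the file header.
    static GenericRecord readFirst(File file) throws IOException {
        try (DataFileReader<GenericRecord> reader =
                 new DataFileReader<>(file, new GenericDatumReader<GenericRecord>())) {
            return reader.next();
        }
    }

    public static void main(String[] args) throws IOException {
        GenericRecord first = readFirst(writeSample());
        // The schema travels with the record, so fields can be discovered at runtime.
        System.out.println(first.getSchema().getFullName());
        System.out.println(first);
    }
}
```

To project with a historical schema instead, the same reader can be constructed with that schema as the expected (reader) schema, and Avro's schema resolution is applied against the schema stored in the file.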
