I understand what you are describing and how it may not be consistent with the spec.
I don't have time to look at it at the moment, however. This is certainly a use case that makes a lot of sense, so even if we become stricter here there will be a way to achieve namespace migration. In the case of data files specifically, it makes sense to have weaker matching strictness by default.

On Oct 4, 2010, at 6:38 PM, Patrick Linehan wrote:

i'd be happy to create a fully-working code example if that would help. i have some firewall issues that prevent me from attaching the actual code i'm working with.

On Mon, Oct 4, 2010 at 2:06 PM, Patrick Linehan <[email protected]> wrote:

the "problem" i'm having is that i seem to be getting alias-like functionality without using aliases. i put "problem" in quotes because i actually like the behavior, i just don't see how it jibes with the spec.

maybe a code example is a better way to go about this. i create a data file as follows:

    Schema schemaA = ...
    Schema schemaB = ...

    GenericDatumWriter datumWriter = new GenericDatumWriter(schemaA);
    DataFileWriter fileWriter = new DataFileWriter(datumWriter);
    OutputStream out = new FileOutputStream("datafile.avro");
    fileWriter.create(schemaA, out);
    fileWriter.append(<RECORD>);
    fileWriter.close();

both schemaA and schemaB contain a single record definition, each with exactly the same primitive-type fields; same types, same names, same order. however, the record names and namespaces differ.

using "avro-tools getschema", i can see that the schema stored in the file is schemaA. also, if i create a GenericDatumReader and read the file, the returned GenericRecord values have a schema of schemaA.

however, i can also read the file using a SpecificDatumReader which is initialized to the specific type corresponding to schemaB (let's call that class RecordB), the schema which does _not_ match the schema of the file:

    SpecificDatumReader<RecordB> datumReader = new SpecificDatumReader<RecordB>(RecordB.class);
    DataFileReader<RecordB> fileReader = new DataFileReader<RecordB>(new File("datafile.avro"), datumReader);
    RecordB record = fileReader.next();
    fileReader.close();

examining the fields of "record" i see that the data has been parsed correctly, as if RecordB's schema (the "reader's schema") was correctly resolved with schemaA (the "writer's schema").

is this the expected behavior in this case? does this not seem to contradict the schema resolution portions of the spec? is this behavior specific to DataFileReader, since i "forced" the record type upon the reader?

also, thanks for taking the time to reply. i very much appreciate it.

sincerely,
Confused

On Mon, Oct 4, 2010 at 1:10 PM, Doug Cutting <[email protected]> wrote:

On 10/01/2010 05:45 PM, Patrick Linehan wrote:

am i misunderstanding the documentation? is the behavior i'm seeing expected? when does a record name/namespace conflict actually cause an error to be thrown?

The alias feature in Avro 1.4 will let you read records whose name or namespace differ:

http://avro.apache.org/docs/current/spec.html#Aliases

Does that help?

Doug
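
For anyone following the thread, here is a rough sketch of what a strict, alias-aware record name check could look like. It is a toy model in plain Java, not Avro's actual resolution code: the class RecordNameMatch, the method matchesStrict, and the names "a.RecordA" and "b.RecordB" are all made up for illustration, and it assumes every name and alias is already fully qualified.

```java
import java.util.Set;

// Toy model only: this is NOT how Avro implements schema resolution.
// All names here are hypothetical and assumed to be fully qualified.
public class RecordNameMatch {

    /**
     * Strict check: the writer's record name must equal the reader's record
     * name, or the reader must list the writer's name among its aliases.
     */
    public static boolean matchesStrict(String writerFullName,
                                        String readerFullName,
                                        Set<String> readerAliases) {
        return writerFullName.equals(readerFullName)
                || readerAliases.contains(writerFullName);
    }

    public static void main(String[] args) {
        // Names differ and the reader declares no alias: the strict check fails.
        System.out.println(matchesStrict("a.RecordA", "b.RecordB", Set.of()));

        // The reader declares the writer's name as an alias: the strict check passes.
        System.out.println(matchesStrict("a.RecordA", "b.RecordB", Set.of("a.RecordA")));
    }
}
```

Under a check like this, the RecordB read described above would only succeed if RecordB's schema declared schemaA's record name as an alias. The behavior reported in the thread is more lenient than that, which is the discrepancy being asked about.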
