> In the case of data files specifically, it makes sense to have weaker
> matching strictness by default.
yes, this seems quite reasonable. perhaps a simplified API for aliasing
schemas when records in a data file only mismatch the reader's specific type
in a few specific, "harmless" ways. as doug points out, though, with his
Date/Time example, it's hard to say what a "harmless" way might be. this is a
bigger fish than i'd care to fry right now, though :)

thanks for your help!

On Tue, Oct 5, 2010 at 9:26 AM, Scott Carey <[email protected]> wrote:

> I understand what you are describing and how it may not be consistent with
> the spec.
>
> I don't have any time to look at it at the moment, however.
>
> This is certainly a use case that makes a lot of sense, so even if we
> become more strict here there will be a way to achieve namespace migration.
> In the case of data files specifically, it makes sense to have weaker
> matching strictness by default.
>
> On Oct 4, 2010, at 6:38 PM, Patrick Linehan wrote:
>
> i'd be happy to create a fully-working code example if that would help. i
> have some firewall issues that prevent me from attaching the actual code
> i'm working with.
>
> On Mon, Oct 4, 2010 at 2:06 PM, Patrick Linehan <[email protected]> wrote:
>
>> the "problem" i'm having is that i seem to be getting alias-like
>> functionality without using aliases. i put "problem" in quotes because i
>> actually like the behavior, i just don't see how it jibes with the spec.
>> maybe a code example is a better way to go about this.
>>
>> i create a data file as follows:
>>
>> Schema schemaA = ...
>> Schema schemaB = ...
>> GenericDatumWriter datumWriter = new GenericDatumWriter(schemaA);
>> DataFileWriter fileWriter = new DataFileWriter(datumWriter);
>> OutputStream out = new FileOutputStream("datafile.avro");
>> fileWriter.create(schemaA, out);
>> fileWriter.append(<RECORD>);
>> fileWriter.close();
>>
>> both schemaA and schemaB contain a single record definition, each with
>> exactly the same primitive-type fields: same types, same names, same
>> order. however, the record names and namespaces differ.
>>
>> using "avro-tools getschema", i can see that the schema stored in the
>> file is schemaA. also, if i create a GenericDatumReader and read the
>> file, the returned GenericRecord values have a schema of schemaA.
>>
>> however, i can also read the file using a SpecificDatumReader which is
>> initialized to the specific type corresponding to schemaB (let's call
>> that class RecordB), the schema which does _not_ match the schema of
>> the file:
>>
>> SpecificDatumReader datumReader = new SpecificDatumReader(RecordB.class);
>> DataFileReader fileReader = new DataFileReader(new File("datafile.avro"),
>>     datumReader);
>> RecordB record = fileReader.next();
>> fileReader.close();
>>
>> examining the fields of "record", i see that the data has been parsed
>> correctly, as if RecordB's schema (the "reader's schema") was correctly
>> resolved with schemaA (the "writer's schema").
>>
>> is this the expected behavior in this case? does this not seem to
>> contradict the schema resolution portions of the spec? is this behavior
>> specific to DataFileReader, since i "forced" the record type upon the
>> reader?
>>
>> also, thanks for taking the time to reply. i very much appreciate it.
>>
>> sincerely,
>> Confused
>>
>> On Mon, Oct 4, 2010 at 1:10 PM, Doug Cutting <[email protected]> wrote:
>>
>>> On 10/01/2010 05:45 PM, Patrick Linehan wrote:
>>>
>>>> am i misunderstanding the documentation? is the behavior i'm seeing
>>>> expected?
>>>> when does a record name/namespace conflict actually cause an error to
>>>> be thrown?
>>>
>>> The alias feature in Avro 1.4 will let you read records whose name or
>>> namespace differ:
>>>
>>> http://avro.apache.org/docs/current/spec.html#Aliases
>>>
>>> Does that help?
>>>
>>> Doug
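
A minimal sketch of the aliases approach Doug points to, assuming a
reasonably recent Avro release (the Schema.Parser API shown here is newer
than the 1.4 release discussed in the thread). The record names, namespaces,
field names, and file name are hypothetical stand-ins for schemaA/schemaB
above; the relevant piece is the "aliases" entry on the reader's record,
which names the writer's full record name so resolution succeeds under the
spec's rules even though the names and namespaces differ.

  import java.io.File;

  import org.apache.avro.Schema;
  import org.apache.avro.file.DataFileReader;
  import org.apache.avro.generic.GenericDatumReader;
  import org.apache.avro.generic.GenericRecord;

  public class AliasReadSketch {
    public static void main(String[] args) throws Exception {
      // Writer's schema: what was stored in the data file (stands in for schemaA).
      Schema writerSchema = new Schema.Parser().parse(
          "{\"type\":\"record\",\"name\":\"RecordA\",\"namespace\":\"com.example.old\","
        + " \"fields\":[{\"name\":\"id\",\"type\":\"long\"},"
        + "             {\"name\":\"label\",\"type\":\"string\"}]}");

      // Reader's schema: a different name and namespace (stands in for schemaB),
      // but it declares the writer's full record name as an alias.
      Schema readerSchema = new Schema.Parser().parse(
          "{\"type\":\"record\",\"name\":\"RecordB\",\"namespace\":\"com.example.v2\","
        + " \"aliases\":[\"com.example.old.RecordA\"],"
        + " \"fields\":[{\"name\":\"id\",\"type\":\"long\"},"
        + "             {\"name\":\"label\",\"type\":\"string\"}]}");

      // Passing both schemas makes the writer/reader resolution explicit instead
      // of relying on the lenient name matching described in the messages above.
      GenericDatumReader<GenericRecord> datumReader =
          new GenericDatumReader<GenericRecord>(writerSchema, readerSchema);
      DataFileReader<GenericRecord> fileReader =
          new DataFileReader<GenericRecord>(new File("datafile.avro"), datumReader);
      try {
        while (fileReader.hasNext()) {
          GenericRecord record = fileReader.next();
          System.out.println(record.get("id") + " " + record.get("label"));
        }
      } finally {
        fileReader.close();
      }
    }
  }

The same "aliases" entry, added to the schema used to generate RecordB,
should let the SpecificDatumReader path from the messages above resolve by
the book as well, since the specific reader applies aliases from its reader's
schema in the same way.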
