Thank you both. Makes sense

2013/4/11 Scott Carey <[email protected]>

> Minor addition, the default value should be
>
> null
>
> not
>
> "null"
>
> -- the latter is a string, the former is null.
>
> http://avro.apache.org/docs/current/spec.html#schema_record
>
>
> On 4/9/13 8:42 PM, "Martin Kleppmann" <[email protected]> wrote:
>
> >With Avro, it is generally assumed that your reader is working with
> >the exact same schema as the data was written with. If you want to
> >change your schema, e.g. add a field to a record, you still need the
> >exact same schema as was used for writing (the "writer's schema"), but
> >you can also give the decoder a second schema (the "reader's schema"),
> >and Avro will map data from the writer's schema into the reader's
> >schema for you ("schema evolution").
> >
> >This requirement of having the exact same schema as the writer makes
> >more sense with Avro's binary encoding, because it allows Avro to omit
> >the field names, which makes the encoding very compact. The
> >requirement makes less sense if you're using the JSON encoding, where
> >field names are inevitably part of the JSON. I think this behaviour is
> >expected, but I agree that it's a bit surprising, so perhaps it's
> >worth discussing whether we should change it.
> >
> >To answer your question, your input data {} looks like it was written
> >with a writer schema of {"name":"hey", "type":"record", "fields":[]}
> >so try using that as your writer schema. Then if you specify
> >{"name":"hey", "type":"record",
> >"fields":[{"name":"a","type":["null","string"],"default":"null"}]} as
> >your reader schema, you should find that the resolving decoder fills
> >in the field "a" with the default null.
> >
> >Best,
> >Martin
> >
> >On 9 April 2013 02:44, Jonathan Coveney <[email protected]> wrote:
> >> Stepping through the code, it looks like the code only uses defaults
> >> for writing, not for reading. IE at read time it assumes that the
> >> defaults were already filled in. It seems like if the reader evolved
> >> the schema to include new fields, it would be desirable for the
> >> defaults to get filled in if not present? But stepping through, on
> >> reading the defaults are completely ignored.
> >>
> >>
> >> 2013/4/9 Jonathan Coveney <[email protected]>
> >>>
> >>> Please note: {"name":"hey", "type":"record",
> >>> "fields":[{"name":"a","type":["null","string"],"default":"null"}]} also
> >>> doesn't work
> >>>
> >>>
> >>> 2013/4/9 Jonathan Coveney <[email protected]>
> >>>>
> >>>> I have the following schema: {"name":"hey", "type":"record",
> >>>> "fields":[{"name":"a","type":["null","string"],"default":null}]}
> >>>>
> >>>> I am trying to deserialize the following against this schema using
> >>>> Java and the GenericDatumReader: {}
> >>>>
> >>>> I get the following error:
> >>>> Caused by: org.apache.avro.AvroTypeException: Expected start-union. Got END_OBJECT
> >>>> at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:697)
> >>>> at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
> >>>> at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
> >>>> at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
> >>>> at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
> >>>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:152)
> >>>> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:177)
> >>>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:148)
> >>>> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:139)
> >>>> at com.spotify.hadoop.JsonTester.main(JsonTester.java:40)
> >>>>
> >>>> I'm not seeing any immediate issues online around this...is this
> >>>> expected? I'm reading it in as such:
> >>>>
> >>>> Schema avroSchema = new Schema.Parser().parse(schemaLine);
> >>>> GenericDatumReader<Object> reader = new GenericDatumReader<Object>(avroSchema);
> >>>> Object datum = reader.read(null, DecoderFactory.get().jsonDecoder(avroSchema, dataLine));
> >>>>
> >>>> I'm going to see what's up and why it isn't picking up the default,
> >>>> but imagined you guys might know what's up?
> >>>>
> >>>> Thanks,
> >>>> Jon
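
For anyone finding this thread later, the two points above (defaults come from the reader's schema only for fields the writer's schema lacks, and null is not the same as "null") can be illustrated with a toy sketch in plain Java. This is not Avro code and not how Avro implements resolution: plain maps stand in for schemas and records, and the class name DefaultResolution, the method resolve, and its parameters are all hypothetical names made up for the illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model only: plain maps stand in for Avro schemas and records.
// Nothing in this class is part of the Avro API.
final class DefaultResolution {

    // writerFields:   names of the fields present in the writer's schema
    // writtenDatum:   the decoded record, keyed by field name
    // readerDefaults: reader's schema reduced to field name -> default value
    static Map<String, Object> resolve(Set<String> writerFields,
                                       Map<String, Object> writtenDatum,
                                       Map<String, Object> readerDefaults) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : readerDefaults.entrySet()) {
            String name = field.getKey();
            if (writerFields.contains(name)) {
                // The writer knew this field: take the written value.
                result.put(name, writtenDatum.get(name));
            } else {
                // The writer did not know this field: use the reader's default.
                result.put(name, field.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Writer schema with no fields, and the datum {} from the thread.
        Set<String> writerFields = Collections.emptySet();
        Map<String, Object> datum = new LinkedHashMap<>();

        // Reader default written as JSON null.
        Map<String, Object> nullDefault = new LinkedHashMap<>();
        nullDefault.put("a", null);

        // Reader default written as the JSON string "null".
        Map<String, Object> stringDefault = new LinkedHashMap<>();
        stringDefault.put("a", "null");

        Object fromNull = resolve(writerFields, datum, nullDefault).get("a");
        Object fromString = resolve(writerFields, datum, stringDefault).get("a");

        System.out.println("default null is a real null: " + (fromNull == null));
        System.out.println("default \"null\" is a String: " + (fromString instanceof String));
    }
}
```

In Avro itself, the piece that plays this role is the two-schema constructor GenericDatumReader(Schema writer, Schema reader), with the JSON decoder built from the writer's schema, as Martin describes.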
