Mario Eberhard created AVRO-2236:
------------------------------------

             Summary: Java Avro Default Value restrictions to first union type leaks to usage of record types
                 Key: AVRO-2236
                 URL: https://issues.apache.org/jira/browse/AVRO-2236
             Project: Avro
          Issue Type: Bug
          Components: java
    Affects Versions: 1.8.2
            Reporter: Mario Eberhard
         Attachments: avrodefaulttest.zip

The default value of a union-typed field must match the first type in the union.

{code:java}
{
  "type": "record",
  "name": "ComplexValue",
  "fields": [
    {
      "name": "value",
      "type": [ "null", "long" ],
      "default": null
    }
  ]
}
{code}

This works as documented. However, the restriction also applies to default values of fields that use this record type:

{code:java}
{
  "type": "record",
  "name": "ExampleRecord",
  "namespace": "com.example",
  "fields": [
    {
      "name": "value1",
      "type": {
        "type": "record",
        "name": "ComplexValue",
        "fields": [
          {
            "name": "value",
            "type": [ "null", "long" ],
            "default": null
          }
        ]
      }
    },
    {
      "name": "value2",
      "type": "ComplexValue",
      "default": { "value": 15 }
    }
  ]
}
{code}

In this case the record "ExampleRecord" has a field "value2" of type "ComplexValue". This field is not optional, but it has a default so that instances in which the field is absent can still be read. During deserialization the following error is thrown:

{code:java}
org.apache.avro.AvroTypeException: Non-null default value for null type: 15
    at org.apache.avro.io.parsing.ResolvingGrammarGenerator.encode(ResolvingGrammarGenerator.java:413)
    at org.apache.avro.io.parsing.ResolvingGrammarGenerator.encode(ResolvingGrammarGenerator.java:365)
    at org.apache.avro.io.parsing.ResolvingGrammarGenerator.encode(ResolvingGrammarGenerator.java:335)
    at org.apache.avro.io.parsing.ResolvingGrammarGenerator.getBinary(ResolvingGrammarGenerator.java:307)
    at org.apache.avro.io.parsing.ResolvingGrammarGenerator.resolveRecords(ResolvingGrammarGenerator.java:285)
    at org.apache.avro.io.parsing.ResolvingGrammarGenerator.generate(ResolvingGrammarGenerator.java:118)
    at org.apache.avro.io.parsing.ResolvingGrammarGenerator.generate(ResolvingGrammarGenerator.java:50)
    at org.apache.avro.io.ResolvingDecoder.resolve(ResolvingDecoder.java:85)
    at org.apache.avro.io.ResolvingDecoder.<init>(ResolvingDecoder.java:49)
    at org.apache.avro.io.DecoderFactory.resolvingDecoder(DecoderFactory.java:307)
    at org.apache.avro.generic.GenericDatumReader.getResolver(GenericDatumReader.java:128)
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:143)
    at com.example.SerializationTest.lambda$getDeserializer$1(SerializationTest.java:75)
    at com.example.SerializationTest.valueReadWithCorrectDefaultValue(SerializationTest.java:41)
{code}

I would argue that a concrete instance of "ComplexValue" with a specific value should be allowed in this case. The first-union-type restriction governs the default of the "value" field inside "ComplexValue"; I see no reason why it should also apply to a record-level default that supplies a complete "ComplexValue" instance. My guess is that this is an unintended consequence of code reuse in the Java client.

I have attached an example Gradle project. Run its tests to reproduce the example above.
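
To illustrate the suspected mechanism, here is a minimal stand-alone sketch. It is a toy model and not Avro's code: the class UnionDefaultSketch, its Schema type and the check method are hypothetical names invented for this example, and the only thing taken from the report is the error text in the trace. The assumption being modelled is that the same recursive check is used for every default, so a union nested inside a record default is still validated against its first branch only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model for illustration only; this is NOT Avro's implementation.
class UnionDefaultSketch {

    // Hypothetical minimal schema: kind is "null", "long", "union" or "record".
    static final class Schema {
        final String kind;
        final List<Schema> branches;      // used by "union" only
        final Map<String, Schema> fields; // used by "record" only

        Schema(String kind, List<Schema> branches, Map<String, Schema> fields) {
            this.kind = kind;
            this.branches = branches;
            this.fields = fields;
        }
    }

    static Schema primitive(String kind) {
        return new Schema(kind, Collections.<Schema>emptyList(),
                Collections.<String, Schema>emptyMap());
    }

    static Schema union(Schema... branches) {
        return new Schema("union", Arrays.asList(branches),
                Collections.<String, Schema>emptyMap());
    }

    static Schema record(Map<String, Schema> fields) {
        return new Schema("record", Collections.<Schema>emptyList(), fields);
    }

    // Models the ComplexValue record from the report: { value: ["null", "long"] }.
    static Schema complexValue() {
        Map<String, Schema> fields = new LinkedHashMap<>();
        fields.put("value", union(primitive("null"), primitive("long")));
        return record(fields);
    }

    // Returns null when the default is accepted, otherwise an error message.
    static String check(Schema schema, Object value) {
        switch (schema.kind) {
            case "null":
                return value == null ? null
                        : "Non-null default value for null type: " + value;
            case "long":
                return value instanceof Long ? null
                        : "Non-long default value: " + value;
            case "union":
                // Only the first branch is consulted, at any nesting depth.
                return check(schema.branches.get(0), value);
            case "record":
                if (!(value instanceof Map)) {
                    return "Non-record default value: " + value;
                }
                Map<?, ?> map = (Map<?, ?>) value;
                for (Map.Entry<String, Schema> field : schema.fields.entrySet()) {
                    // Simplification: a missing key is treated as null.
                    String error = check(field.getValue(), map.get(field.getKey()));
                    if (error != null) {
                        return error;
                    }
                }
                return null;
            default:
                throw new IllegalArgumentException("Unknown kind: " + schema.kind);
        }
    }

    public static void main(String[] args) {
        Schema complex = complexValue();
        // Default of value2 from the report: { "value": 15 } is rejected.
        System.out.println(check(complex, Collections.singletonMap("value", 15L)));
        // A default matching the first union branch is accepted (check returns null).
        System.out.println(check(complex, Collections.singletonMap("value", null)));
    }
}
```

In this model the record-level default { "value": 15 } fails for the same reason a field-level default of 15 would, which matches the behaviour described in the report. Whether the real ResolvingGrammarGenerator follows this exact path would need to be confirmed against the Avro source.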