Werner Daehn created AVRO-2899:
----------------------------------
Summary: JsonEncoder writes type information for not-null union
Key: AVRO-2899
URL: https://issues.apache.org/jira/browse/AVRO-2899
Project: Apache Avro
Issue Type: Bug
Components: java
Affects Versions: 1.9.2
Reporter: Werner Daehn
_Summary: A union of [null,string] should output the bare value when using
the JsonEncoder. To accomplish that, a single line needs to be changed in
JsonEncoder.java. I don't believe there are side effects, but I am not sure and
am looking for validation._
When the schema looks like
{{name: "text", type: ["null",\{"type":"string"}] }}
the JsonEncoder creates a JSON object explicitly stating the type. So the
resulting JSON is
{{text: \{ "string": "Hello World" }}}
instead of
{{text: "Hello World"}}
I have searched for this issue; people complain about it frequently, but there
is no real resolution.
I understand why this is done: the JsonDecoder needs to know which type to
pick in case of a union. But it does not make sense for a nullable union,
where there is either a string value or null.
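To illustrate the reasoning above, here is a minimal standalone sketch (not Avro code). It assumes the JSON mapping from the Avro specification, where for example string and bytes are both written as JSON strings, and it checks whether a bare value would be ambiguous for a given union. The class name, the method name and the coarse "number" kind for all numeric types are my own assumptions, and named types are reduced to their type keyword.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Standalone model for illustration only; this is not part of Avro. */
final class UnionAmbiguitySketch {

    // Avro type keyword -> kind of JSON value it is written as (coarse grouping).
    private static final Map<String, String> JSON_KIND = Map.ofEntries(
            Map.entry("null", "null"),
            Map.entry("boolean", "boolean"),
            Map.entry("int", "number"),
            Map.entry("long", "number"),
            Map.entry("float", "number"),
            Map.entry("double", "number"),
            Map.entry("string", "string"),
            Map.entry("bytes", "string"),
            Map.entry("enum", "string"),
            Map.entry("fixed", "string"),
            Map.entry("record", "object"),
            Map.entry("map", "object"),
            Map.entry("array", "array"));

    /** True if two branches share a JSON kind, so a bare value cannot identify its branch. */
    static boolean isAmbiguousWithoutTag(List<String> branches) {
        Set<String> seenKinds = new HashSet<>();
        for (String branch : branches) {
            String kind = JSON_KIND.get(branch);
            if (kind == null) {
                throw new IllegalArgumentException("unknown type: " + branch);
            }
            if (!seenKinds.add(kind)) {
                return true; // a second branch maps to an already seen JSON kind
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Nullable string: JSON null vs. JSON string, nothing to disambiguate.
        System.out.println(isAmbiguousWithoutTag(List.of("null", "string")));
        // string and bytes both become JSON strings, so the type tag is needed.
        System.out.println(isAmbiguousWithoutTag(List.of("null", "string", "bytes")));
    }
}
```

Under these assumptions, a two-branch union of null and one other type is never ambiguous, which is the case this issue is about.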
I would argue, in the simple case where there is a union of two elements and
the first is the NULL symbol, the extra text can be omitted. Hence I have
changed the writeIndex line in the JsonEncoder from
[https://github.com/apache/avro/blob/c903aa6d6fc42d3c347f95d469a8364ea44165e8/lang/java/avro/src/main/java/org/apache/avro/io/JsonEncoder.java#L297]
{{if (symbol != Symbol.NULL) {}}
to
{{if (symbol != Symbol.NULL && (top.symbols.length > 2 || top.getSymbol(0) != Symbol.NULL)) {}}
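To make the effect of the changed condition easier to check, here is a small standalone sketch of the same predicate. It is a model only: union branches are plain strings instead of Symbol objects, and the class and method names are hypothetical, not part of JsonEncoder.

```java
import java.util.List;

/** Standalone model of the proposed condition; not Avro's JsonEncoder. */
final class UnionWrapperSketch {

    /**
     * True when the value would still be wrapped as {"type": value}.
     * Mirrors: symbol != NULL && (length > 2 || first branch != NULL).
     */
    static boolean needsTypeWrapper(List<String> branches, int unionIndex) {
        boolean selectedIsNull = "null".equals(branches.get(unionIndex));
        boolean moreThanTwo = branches.size() > 2;
        boolean firstIsNull = "null".equals(branches.get(0));
        return !selectedIsNull && (moreThanTwo || !firstIsNull);
    }

    public static void main(String[] args) {
        // [null, string] with the string branch selected: bare value.
        System.out.println(needsTypeWrapper(List.of("null", "string"), 1));
        // [string, null] with the string branch selected: still wrapped.
        System.out.println(needsTypeWrapper(List.of("string", "null"), 0));
        // Three branches: still wrapped.
        System.out.println(needsTypeWrapper(List.of("null", "string", "int"), 1));
    }
}
```

Note that with this condition a union written as [string, null] keeps the wrapper, because only unions whose first branch is null are treated specially.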
It should not have a side effect on the JsonDecoder either, because in the case
of schema evolution (making a field nullable) this must be resolved anyhow. I
am not sure about that, however.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)