[ https://issues.apache.org/jira/browse/AVRO-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yubao Liu updated AVRO-2772:
----------------------------
Description:
{code:java}
protocol Test {
  record A {
    int amount;
  }
  record B {
    int amount;
  }
  record C {
    // old schema:
    //   union { A } c;
    // new schema:
    union { A, B } c;
  }
}
{code}
The old C schema declares "union {A} c;" and the new C schema declares "union {A, B} c;".
If we write a B object to field "c" using the new C schema and then read it back
using the old C schema, Avro happily returns an "A" object for field "c".
This is surprising, because B is not a branch of the old union.
The new and old schemas are mutually compatible according to the Avro schema validator.
I attached a Maven project that demonstrates the issue. Here is its output:
{code}
readerSchema and writerSchema are mutual compatible
writerSchema unionSchemas.size()=2
writerSchema unionSchemas.get(1)={"type":"record","name":"B","fields":[{"name":"amount","type":"int"}]}
r={"c": {"amount": 12}}
r.c.schema={"type":"record","name":"B","fields":[{"name":"amount","type":"int"}]}
r2={"c": {"amount": 12}}
r2.c.schema={"type":"record","name":"A","fields":[{"name":"amount","type":"int"}]}
{code}
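To make the surprise concrete, here is a minimal sketch of two possible rules for picking the reader-side union branch. It is a simplified model and does not use or reproduce Avro's code: the class UnionBranchSketch, the Rec type, and both methods are hypothetical names introduced for illustration. The "lenient" rule is only an assumption about what could produce the output above; the report itself does not confirm the cause.

```java
import java.util.Arrays;
import java.util.List;

public class UnionBranchSketch {

    /** Hypothetical stand-in for a record schema: a name plus ordered field signatures. */
    public static final class Rec {
        final String name;
        final List<String> fields;

        public Rec(String name, String... fields) {
            this.name = name;
            this.fields = Arrays.asList(fields);
        }
    }

    /** Strict rule: only a reader branch with the written record's name matches; -1 if none. */
    public static int strictBranch(List<Rec> readerUnion, Rec written) {
        for (int i = 0; i < readerUnion.size(); i++) {
            if (readerUnion.get(i).name.equals(written.name)) {
                return i;
            }
        }
        return -1;
    }

    /** Lenient rule (assumed): prefer a name match, else the first branch with identical fields. */
    public static int lenientBranch(List<Rec> readerUnion, Rec written) {
        int byName = strictBranch(readerUnion, written);
        if (byName >= 0) {
            return byName;
        }
        for (int i = 0; i < readerUnion.size(); i++) {
            if (readerUnion.get(i).fields.equals(written.fields)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Rec a = new Rec("A", "int amount");
        Rec b = new Rec("B", "int amount");
        List<Rec> oldUnion = Arrays.asList(a); // old C: union { A } c;

        // A B value was written; which branch of the old union would read it?
        System.out.println("strict:  " + strictBranch(oldUnion, b));
        System.out.println("lenient: " + lenientBranch(oldUnion, b));
    }
}
```

Under the strict rule the written B has no matching branch in the old union (index -1), which is the failure one might expect. Under the lenient rule it falls through to branch 0, i.e. A, which matches the r2.c.schema line in the output above.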
> wrong union schema forward compatibility
> -----------------------------------------
>
> Key: AVRO-2772
> URL: https://issues.apache.org/jira/browse/AVRO-2772
> Project: Apache Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.9.2
> Environment: JDK 13, Maven 3.6.3, avro 1.9.2.
> Reporter: Yubao Liu
> Priority: Major
> Attachments: avro-union.tar.gz