[
https://issues.apache.org/jira/browse/AVRO-3027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17585722#comment-17585722
]
Yevhen Cherkes commented on AVRO-3027:
--------------------------------------
[~kniemitalo],
The problem occurs in a heterogeneous environment: consumers written in
different languages do not behave consistently. So I think the best solution is
to remove the undocumented matching algorithm from the Java-based consumer. But
if that is impossible for backward-compatibility reasons, I think the algorithm
must be added to the specification and implemented in every language.
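To make the difference concrete, here is a minimal sketch in plain Java. It is
not the actual Avro code: RecordShape, matchByName and matchByNameThenStructure
are hypothetical names, only the record branches of the union are modeled (the
"null" branch is left out), and I am assuming the Java fallback is roughly
"same fields, different name".

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: a library-free model of union branch matching.
// None of these names exist in Apache Avro.
final class UnionMatchSketch {

    /** Hypothetical stand-in for a record schema: full name plus field signature. */
    static final class RecordShape {
        final String fullName;
        final List<String> fields; // entries like "value:string"

        RecordShape(String fullName, String... fields) {
            this.fullName = fullName;
            this.fields = Arrays.asList(fields);
        }
    }

    /** Name-only matching: index of the branch with the same full name, or -1. */
    static int matchByName(RecordShape writer, List<RecordShape> readerUnion) {
        for (int i = 0; i < readerUnion.size(); i++) {
            if (readerUnion.get(i).fullName.equals(writer.fullName)) {
                return i;
            }
        }
        return -1;
    }

    /** Name first, then an ASSUMED structural fallback (identical field list). */
    static int matchByNameThenStructure(RecordShape writer, List<RecordShape> readerUnion) {
        int byName = matchByName(writer, readerUnion);
        if (byName >= 0) {
            return byName;
        }
        for (int i = 0; i < readerUnion.size(); i++) {
            if (readerUnion.get(i).fields.equals(writer.fields)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        RecordShape writer =
            new RecordShape("com.mycorp.mynamespace.InnerType1", "value:string");
        List<RecordShape> readerUnion = Arrays.asList(
            new RecordShape("com.mycorp.mynamespace.InnerType2", "value:string"));

        System.out.println(matchByName(writer, readerUnion));              // -1
        System.out.println(matchByNameThenStructure(writer, readerUnion)); // 0
    }
}
```

With the schemas from this issue, the name-only variant finds no branch (which
is what the .NET consumer reports), while the variant with the fallback finds
one (which is why nothing fails on the Java side).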
I would just like to give some context on what happened:
We have N teams and a centralized schema repository.
Team #1 works with the Java stack. It creates Schema v1, adds it to the
repository, and produces messages to Kafka.
Then they change the Schema to v2 and check backward compatibility with the
built-in Java-based tools. Nothing indicates an incompatibility. They start
producing messages with Schema v2. Everything works fine for Java-based
consumers.
Team #2 works with the .NET stack. It consumes the messages with Schema v1,
but when a message with Schema v2 appears in Kafka, they run into runtime
trouble.
Of course, Team #2 can create an MR/PR to add the aliases to the existing
Schema, but why should they?
Team #1 is the creator and owner of the Schema.
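For reference, the alias workaround relies on the rule that a reader schema may
list alternate names for a named type. A rough sketch of that rule in plain
Java follows. NamedSchema and namesMatch are hypothetical names, not Avro API,
and whether each language binding honors aliases inside unions is something I
have not verified.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: alias-aware name matching, not taken from Apache Avro.
final class AliasMatchSketch {

    /** Hypothetical stand-in for a named schema with optional aliases. */
    static final class NamedSchema {
        final String fullName;
        final List<String> aliases;

        NamedSchema(String fullName, String... aliases) {
            this.fullName = fullName;
            this.aliases = Arrays.asList(aliases);
        }
    }

    /** True when the writer's name equals the reader's name or one of the reader's aliases. */
    static boolean namesMatch(NamedSchema writer, NamedSchema reader) {
        return reader.fullName.equals(writer.fullName)
            || reader.aliases.contains(writer.fullName);
    }

    public static void main(String[] args) {
        NamedSchema writerV2 = new NamedSchema("com.mycorp.mynamespace.InnerType2");

        // Reader still on Schema v1, without and with the alias added.
        NamedSchema readerV1 = new NamedSchema("com.mycorp.mynamespace.InnerType1");
        NamedSchema readerV1WithAlias = new NamedSchema(
            "com.mycorp.mynamespace.InnerType1",
            "com.mycorp.mynamespace.InnerType2");

        System.out.println(namesMatch(writerV2, readerV1));          // false
        System.out.println(namesMatch(writerV2, readerV1WithAlias)); // true
    }
}
```

The point of the sketch is that the fix has to land in the reader's copy of the
schema, so every consuming team would need to patch a schema it does not own.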
> .net consumer Backward compatibility issue.
> --------------------------------------------
>
> Key: AVRO-3027
> URL: https://issues.apache.org/jira/browse/AVRO-3027
> Project: Apache Avro
> Issue Type: Bug
> Components: csharp, java
> Affects Versions: 1.10.1
> Environment: .Net core 3.1
> Reporter: Yevhen Cherkes
> Priority: Critical
> Labels: pull-request-available
> Attachments: CompatibilityTests.zip
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> We are facing the following exception on the .net core consumer's side:
> (*Java*-based consumers work with no issues for this particular case.)
> {code:java}
> No matching schema for
> {"type":"record","name":"InnerType1","namespace":"com.mycorp.mynamespace","fields":[
> {"name":"value","type":"string"}
> ]} in
> {"type":["null",{"type":"record","name":"InnerType2","namespace":"com.mycorp.mynamespace","fields":[
> {"name":"value","type":"string"}
> ]}]}
> at Avro.Generic.DefaultReader.findBranch(UnionSchema us, Schema s)
> at Avro.Generic.DefaultReader.ReadUnion(Object reuse, UnionSchema
> writerSchema, Schema readerSchema, Decoder d)
> at Avro.Generic.DefaultReader.Read(Object reuse, Schema writerSchema, Schema
> readerSchema, Decoder d)
> at Avro.Specific.SpecificDefaultReader.ReadRecord(Object reuse, RecordSchema
> writerSchema, Schema readerSchema, Decoder dec)
> {code}
> We have 2 versions of the schema on a topic:
> Schema v1:
> {code:json}
> {
> "fields": [
> {
> "name": "InnerComplexValue",
> "type": [
> "null",
> {
> "fields": [
> {
> "name": "value",
> "type": "string"
> }
> ],
> "name": "InnerType1",
> "namespace": "com.mycorp.mynamespace",
> "type": "record"
> }
> ]
> }
> ],
> "name": "RootType",
> "namespace": "com.mycorp.mynamespace",
> "type": "record"
> }
> {code}
> Schema v2:
> {code:json}
> {
> "fields": [
> {
> "name": "InnerComplexValue",
> "type": [
> "null",
> {
> "fields": [
> {
> "name": "value",
> "type": "string"
> }
> ],
> "name": "InnerType2",
> "namespace": "com.mycorp.mynamespace",
> "type": "record"
> }
> ]
> }
> ],
> "name": "RootType",
> "namespace": "com.mycorp.mynamespace",
> "type": "record"
> }
> {code}
> InnerType1 -> InnerType2 is the only change.
> The schema for a topic is configured with *Backward* compatibility.
> An updated version of a Schema is saved with no errors.
> Then we generated a specific record RootType with the avrogen tool.
> Reproduction code:
>
> {code:c#}
> // Avro binary payload: union branch index 1, then the string "test value".
> var base64 = "AhR0ZXN0IHZhbHVl";
> using var stream = new MemoryStream(Convert.FromBase64String(base64));
> var schema_v1 = Schema.Parse(
>     "{\"type\":\"record\",\"name\":\"RootType\",\"namespace\":\"com.mycorp.mynamespace\",\"fields\":"
>     + "[{\"name\":\"InnerComplexValue\",\"type\":[\"null\",{\"type\":\"record\",\"name\":\"InnerType1\""
>     + ",\"fields\":[{\"name\":\"value\",\"type\":\"string\"}]}]}]}");
> var schema_v2 = Schema.Parse(
>     "{\"type\":\"record\",\"name\":\"RootType\",\"namespace\":\"com.mycorp.mynamespace\",\"fields\":"
>     + "[{\"name\":\"InnerComplexValue\",\"type\":[\"null\",{\"type\":\"record\",\"name\":\"InnerType2\""
>     + ",\"fields\":[{\"name\":\"value\",\"type\":\"string\"}]}]}]}");
> // The only difference between the two schemas is InnerType1 vs. InnerType2.
> var specificRecord = Deserialize<RootType>(stream, schema_v1, schema_v2);
> {code}
> See the attachments for the full sources.
> The Deserialize method was taken from
> [SpecificTests.cs#L408-L417|https://github.com/apache/avro/blob/master/lang/csharp/src/apache/test/Specific/SpecificTests.cs#L408-L417]
>
> The likely cause is that the two implementations resolve schemas differently:
> java:
> [https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/Resolver.java#L647-L659]
> .net:
> [https://github.com/apache/avro/blob/master/lang/csharp/src/apache/main/Schema/RecordSchema.cs#L310-L312]