Jurgis Pods created KAFKA-9744:
----------------------------------

             Summary: SchemaProjector fails to handle backwards-compatible 
schema change
                 Key: KAFKA-9744
                 URL: https://issues.apache.org/jira/browse/KAFKA-9744
             Project: Kafka
          Issue Type: Bug
          Components: KafkaConnect
    Affects Versions: 2.3.1
            Reporter: Jurgis Pods


_Note_: This bug report is for CP 5.3.1 / Kafka 2.3.1, but it most likely 
affects all versions.

We recently made a number of changes to our Avro schemas in the Confluent 
Schema Registry, all of which were accepted as backwards-compatible. However, 
when redeploying the Kafka S3 connectors consuming the affected topics, we 
noticed two separate failures in SchemaProjector.project(), causing the 
connectors to crash and stop producing data:

1) Changed namespace of record:
{code:java}
org.apache.kafka.connect.errors.SchemaProjectorException: Schema name mismatch. 
source name: my.example.Record and target name: my.example.sub.Record {code}
A change of a record's namespace is considered compatible by the Schema 
Registry, but not by the Connect API. I would argue that the namespace/package 
name should not affect compatibility, as it says nothing about the contained 
data or its schema.
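
For reference, this mismatch can be reproduced directly against the Connect 
API. Below is a minimal sketch: the record names are taken from the error 
message above, while the class name, field name and value are illustrative.
{code:java}
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.SchemaProjector;
import org.apache.kafka.connect.data.Struct;

public class NamespaceMismatchRepro {
    public static void main(String[] args) {
        // Two structurally identical schemas; only the namespace-qualified name differs
        Schema source = SchemaBuilder.struct().name("my.example.Record")
                .field("myfield", Schema.STRING_SCHEMA).build();
        Schema target = SchemaBuilder.struct().name("my.example.sub.Record")
                .field("myfield", Schema.STRING_SCHEMA).build();

        Struct record = new Struct(source).put("myfield", "some value");

        // Throws SchemaProjectorException:
        // "Schema name mismatch. source name: my.example.Record
        //  and target name: my.example.sub.Record"
        SchemaProjector.project(source, record, target);
    }
}
{code}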

2) Change of type from 1-element union to primitive field:
{code:java}
Schema type mismatch. source type: STRUCT and target type: STRING {code}
This happened when changing the corresponding field's Avro schema from
{code:java}
 name": "myfield", "type": ["string"] {code}
to
{code:java}
 name": "myfield", "type": "string"{code}
In this case, I am less convinced that those two schemas should be considered 
compatible (they are semantically identical, but a union is not a string). Even 
so, it is unfortunate that the Schema Registry sees the above change as 
compatible, while the Connect API does not.
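
Again a minimal sketch of the mismatch as Connect sees it, assuming the 
AvroConverter's usual mapping of an anonymous union to a synthetic STRUCT with 
one optional field per branch (the Union struct name and branch field name 
follow Confluent's AvroData convention; the class name and value are 
illustrative).
{code:java}
import org.apache.kafka.connect.data.Schema;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.apache.kafka.connect.data.SchemaProjector;
import org.apache.kafka.connect.data.Struct;

public class UnionToPrimitiveRepro {
    public static void main(String[] args) {
        // Connect-side representation of the old Avro schema ["string"]:
        // the converter wraps the union in a STRUCT with one optional field per branch
        Schema unionSchema = SchemaBuilder.struct()
                .name("io.confluent.connect.avro.Union")
                .field("string", Schema.OPTIONAL_STRING_SCHEMA)
                .build();
        Struct unionValue = new Struct(unionSchema).put("string", "some value");

        // Connect-side representation of the new Avro schema "string"
        Schema primitiveSchema = Schema.STRING_SCHEMA;

        // Throws SchemaProjectorException:
        // "Schema type mismatch. source type: STRUCT and target type: STRING"
        SchemaProjector.project(unionSchema, unionValue, primitiveSchema);
    }
}
{code}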

*Summary*:
We made two Avro schema changes that were accepted as compatible by the Schema 
Registry but were rejected at runtime by the Kafka S3 connectors. Would it be 
possible to have a more consistent (and less restrictive) check in the Connect 
API, so that a schema change on the producer side can be made confidently, 
without fear of breaking the consuming connectors?



