[ 
https://issues.apache.org/jira/browse/BEAM-10759?focusedWorklogId=473243&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-473243
 ]

ASF GitHub Bot logged work on BEAM-10759:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/Aug/20 07:57
            Start Date: 21/Aug/20 07:57
    Worklog Time Spent: 10m 
      Work Description: dennisylyung commented on pull request #12630:
URL: https://github.com/apache/beam/pull/12630#issuecomment-678101761


   @iemejia I have pushed a new commit implementing your suggestions.
   For the specific test, I edited the mock consumer in KafkaIOTest to generate
records with a different schema.
   To verify the new deserializer provider, I also ran the modified test against
the original implementation:
   ```java
     @Override
     public Deserializer<T> getDeserializer(Map<String, ?> configs, boolean isKey) {
       ImmutableMap<String, Object> csrConfig =
           ImmutableMap.<String, Object>builder()
               .putAll(configs)
               .put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, schemaRegistryUrl)
               .build();
       Deserializer<T> deserializer =
           (Deserializer<T>) new KafkaAvroDeserializer(getSchemaRegistryClient());
       //     new ConfluentSchemaRegistryDeserializer(getSchemaRegistryClient(), getAvroSchema());
       deserializer.configure(csrConfig, isKey);
       return deserializer;
     }
   ```
   As expected, this causes the test to fail.
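
   For context on why passing the reader schema matters: Avro resolves each record
against the consumer's schema at read time, filling newly added fields from their
defaults, so records written under an older schema version still decode. A minimal,
self-contained sketch of that resolution rule (plain Java maps standing in for Avro
schemas; all names here are illustrative, not Beam or Confluent API):
   ```java
   import java.util.HashMap;
   import java.util.Map;

   // Illustrative sketch (not Beam/Confluent code): Avro-style schema resolution.
   // A record written with an older "writer" schema is projected onto the
   // consumer's "reader" schema, with defaults filling fields the writer lacked.
   public class SchemaResolutionSketch {

     // A toy "schema": field name -> default value (null means no default).
     static Map<String, Object> resolve(
         Map<String, Object> writtenRecord, Map<String, Object> readerSchema) {
       Map<String, Object> resolved = new HashMap<>();
       for (Map.Entry<String, Object> field : readerSchema.entrySet()) {
         if (writtenRecord.containsKey(field.getKey())) {
           // Field present in the written record: take the writer's value.
           resolved.put(field.getKey(), writtenRecord.get(field.getKey()));
         } else if (field.getValue() != null) {
           // Field added in a newer schema version: fall back to its default.
           resolved.put(field.getKey(), field.getValue());
         } else {
           // Mirrors the kind of failure seen without resolution: no value, no default.
           throw new IllegalStateException(
               "Field " + field.getKey() + " has no value and no default");
         }
       }
       return resolved;
     }

     public static void main(String[] args) {
       // Record written with schema v1: only "id" and "name".
       Map<String, Object> v1Record = new HashMap<>();
       v1Record.put("id", 42);
       v1Record.put("name", "alice");

       // Reader schema v2 adds "email" with a default value.
       Map<String, Object> v2Schema = new HashMap<>();
       v2Schema.put("id", null);
       v2Schema.put("name", null);
       v2Schema.put("email", "unknown");

       Map<String, Object> resolved = resolve(v1Record, v2Schema);
       System.out.println(resolved.get("email")); // prints "unknown"
     }
   }
   ```
   Decoding with the writer's schema and projecting onto the reader's schema is
what the readerSchema argument on KafkaAvroDeserializer enables, in contrast to
decoding every record directly with a single fixed schema.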


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 473243)
    Time Spent: 40m  (was: 0.5h)

> KafkaIO with Avro deserializer fails with evolved schema
> --------------------------------------------------------
>
>                 Key: BEAM-10759
>                 URL: https://issues.apache.org/jira/browse/BEAM-10759
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-kafka
>    Affects Versions: 2.23.0
>            Reporter: Dennis Yung
>            Assignee: Dennis Yung
>            Priority: P2
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> When using KafkaIO with ConfluentSchemaRegistryDeserializerProvider, an 
> exception can be thrown when consuming a topic with an evolved schema.
> This is because when the DeserializerProvider is initialized, it creates an 
> AvroCoder instance using either the latest Avro schema by default, or a 
> specific version if one is provided.
> If the Kafka topic contains records with multiple schema versions, AvroCoder 
> will fail on records written with a different schema. The specific exception 
> differs depending on the schema change; for example, I have encountered a type 
> cast error and a null pointer error.
> To fix this issue, we can make use of the writer/reader schema arguments from 
> Avro to deserialize Kafka records to the same schema as the AvroCoder. The 
> method is available in io.confluent.kafka.serializers.KafkaAvroDeserializer:
> {code:java}
>     public Object deserialize(String s, byte[] bytes, Schema readerSchema) {
>         return this.deserialize(bytes, readerSchema);
>     }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
