[ 
https://issues.apache.org/jira/browse/AVRO-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451372#comment-17451372
 ] 

Uwe Eisele commented on AVRO-3235:
----------------------------------

Hello again,

I just saw that there is already a pull request that solves this problem and 
changes the check for enums from FullName to Name: 
[https://github.com/apache/avro/pull/1381/files]
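
For readers following along, the difference between the two checks can be sketched as below. This is a toy model only, not Avro's resolver code and not the code from the pull request: the `NamedType` class and both `matchesBy...` helpers are hypothetical names introduced for illustration. It relies on the fact that an enum declared without its own namespace inherits the namespace of the enclosing record, so the two `Status` enums end up with different full names.

```java
import java.util.Objects;

public class EnumNameMatchSketch {

    /** Hypothetical stand-in for a named schema type: a namespace plus an unqualified name. */
    static final class NamedType {
        final String namespace;
        final String name;

        NamedType(String namespace, String name) {
            this.namespace = namespace;
            this.name = name;
        }

        /** Full name is "namespace.name", or just the name when there is no namespace. */
        String fullName() {
            return (namespace == null || namespace.isEmpty()) ? name : namespace + "." + name;
        }
    }

    /** Strict rule: writer and reader types must agree on namespace and name. */
    static boolean matchesByFullName(NamedType writer, NamedType reader) {
        return Objects.equals(writer.fullName(), reader.fullName());
    }

    /** Relaxed rule: only the unqualified name has to agree. */
    static boolean matchesByName(NamedType writer, NamedType reader) {
        return Objects.equals(writer.name, reader.name);
    }

    public static void main(String[] args) {
        // Both enums are called "Status" and inherit the namespace of their record.
        NamedType writerEnum = new NamedType("test.simple.v1", "Status");
        NamedType readerEnum = new NamedType("test.simple.v2", "Status");

        System.out.println("full name match: " + matchesByFullName(writerEnum, readerEnum)); // false
        System.out.println("name match:      " + matchesByName(writerEnum, readerEnum));     // true
    }
}
```

Under the strict rule the schemas from this issue do not match, which corresponds to the reported exception; under the relaxed rule they do.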

 

Regards,

Uwe

> Avro Schema Evolution with Enum – Deserialization Crashes
> ---------------------------------------------------------
>
>                 Key: AVRO-3235
>                 URL: https://issues.apache.org/jira/browse/AVRO-3235
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.10.2
>            Reporter: Bertram Beyer
>            Priority: Major
>
> Originally posted on Stack Overflow in June 2020
> https://stackoverflow.com/questions/62596990/avro-schema-evolution-with-enum-deserialization-crashes/
>  
> I defined two versions of a record in two separate AVSC schema files. I used 
> the namespace to distinguish the versions.
> *SimpleV1.avsc*
>   
> {code:json}
> {
>   "type" : "record",
>   "name" : "Simple",
>   "namespace" : "test.simple.v1",
>   "fields" : [ 
>       {
>         "name" : "name",
>         "type" : "string"
>       }, 
>       {
>         "name" : "status",
>         "type" : {
>           "type" : "enum",
>           "name" : "Status",
>           "symbols" : [ "ON", "OFF" ]
>         },
>         "default" : "ON"
>       }
>    ]
> }
> {code}
>  
> *Example JSON*
>   
> {code:json}
> {"name":"A","status":"ON"}
> {code}
> Version 2 only adds a description field with a default value.
> *SimpleV2.avsc*
>   
> {code:json}
> {
>   "type" : "record",
>   "name" : "Simple",
>   "namespace" : "test.simple.v2",
>   "fields" : [ 
>       {
>         "name" : "name",
>         "type" : "string"
>       }, 
>       {
>         "name" : "description",
>         "type" : "string",
>         "default" : ""
>       }, 
>       {
>         "name" : "status",
>         "type" : {
>           "type" : "enum",
>           "name" : "Status",
>           "symbols" : [ "ON", "OFF" ]
>         },
>         "default" : "ON"
>       }
>    ]
> }
> {code}
> *Example JSON*
>   
> {code:json}
> {"name":"B","description":"b","status":"ON"}
> {code}
> Both schemas were compiled to Java classes. In my example I wanted to test 
> backward compatibility: a record written with V1 should be readable by a 
> reader using V2, with default values filled in for the missing fields. This 
> works as long as I do not use enums.
>   
> {code:java}
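> import java.io.ByteArrayOutputStream;
> import java.io.File;
> import java.io.IOException;
> 
> import org.apache.avro.Schema;
> import org.apache.avro.SchemaCompatibility;
> import org.apache.avro.SchemaCompatibility.SchemaCompatibilityType;
> import org.apache.avro.SchemaCompatibility.SchemaPairCompatibility;
> import org.apache.avro.io.DatumReader;
> import org.apache.avro.io.DatumWriter;
> import org.apache.avro.io.Decoder;
> import org.apache.avro.io.DecoderFactory;
> import org.apache.avro.io.Encoder;
> import org.apache.avro.io.EncoderFactory;
> import org.apache.avro.specific.SpecificDatumReader;
> import org.apache.avro.specific.SpecificDatumWriter;
> import org.apache.avro.specific.SpecificRecord;
> // Assumption: Assert is JUnit 4's org.junit.Assert (imports were not part of the original post).
> import org.junit.Assert;
> 
> public class EnumEvolutionExample {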
>     public static void main(String[] args) throws IOException {
>         Schema schemaV1 = new org.apache.avro.Schema.Parser()
>                 .parse(new File("./src/main/resources/SimpleV1.avsc"));
>         // This works as well:
>         // Schema schemaV1 = test.simple.v1.Simple.getClassSchema();
>         Schema schemaV2 = new org.apache.avro.Schema.Parser()
>                 .parse(new File("./src/main/resources/SimpleV2.avsc"));
>         test.simple.v1.Simple simpleV1 = test.simple.v1.Simple.newBuilder()
>                 .setName("A")
>                 .setStatus(test.simple.v1.Status.ON)
>                 .build();
> 
>         SchemaPairCompatibility schemaCompatibility =
>                 SchemaCompatibility.checkReaderWriterCompatibility(schemaV2, schemaV1);
>         // Checks that data written with V1 can be read with V2
>         Assert.assertEquals(SchemaCompatibilityType.COMPATIBLE,
>                 schemaCompatibility.getType());
> 
>         byte[] binaryV1 = serealizeBinary(simpleV1);
> 
>         // Crashes with:
>         // AvroTypeException: Found test.simple.v1.Status, expecting test.simple.v2.Status
>         test.simple.v2.Simple v2 = deSerealizeBinary(binaryV1,
>                 new test.simple.v2.Simple(), schemaV1);
>         
>     }
>     
>     public static byte[] serealizeBinary(SpecificRecord record) {
>         DatumWriter<SpecificRecord> writer =
>                 new SpecificDatumWriter<>(record.getSchema());
>         byte[] data = new byte[0];
>         ByteArrayOutputStream stream = new ByteArrayOutputStream();
>         Encoder binaryEncoder = EncoderFactory.get()
>             .binaryEncoder(stream, null);
>         try {
>             writer.write(record, binaryEncoder);
>             binaryEncoder.flush();
>             data = stream.toByteArray();
>         } catch (IOException e) {
>             System.out.println("Serialization error " + e.getMessage());
>         }
>         return data;
>     }
>     
>     public static <T extends SpecificRecord> T deSerealizeBinary(
>             byte[] data, T reuse, Schema writer) {
>         Decoder decoder = DecoderFactory.get().binaryDecoder(data, null);
>         DatumReader<T> datumReader =
>                 new SpecificDatumReader<>(writer, reuse.getSchema());
>         try {
>             T datum = datumReader.read(null, decoder);
>             return datum;
>         } catch (IOException e) {
>             System.out.println("Deserialization error " + e.getMessage());
>         }
>         return null;
>     }
> }
> {code}
> The checkReaderWriterCompatibility method confirms that the schemas are 
> compatible, but deserialization fails with the following exception:
>  
> {code:java}
> Exception in thread "main" org.apache.avro.AvroTypeException: Found test.simple.v1.Status, expecting test.simple.v2.Status
>     at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:309)
>     at org.apache.avro.io.parsing.Parser.advance(Parser.java:86)
>     at org.apache.avro.io.ResolvingDecoder.readEnum(ResolvingDecoder.java:260)
>     at org.apache.avro.generic.GenericDatumReader.readEnum(GenericDatumReader.java:267)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:181)
>     at org.apache.avro.specific.SpecificDatumReader.readField(SpecificDatumReader.java:136)
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:247)
>     at org.apache.avro.specific.SpecificDatumReader.readRecord(SpecificDatumReader.java:123)
>     at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:160)
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
>     at test.EnumEvolutionExample.deSerealizeBinary(EnumEvolutionExample.java:70)
>     at test.EnumEvolutionExample.main(EnumEvolutionExample.java:45)
> {code}
>  
> I don’t understand why Avro thinks it got a v1.Status. Namespaces are not 
> part of the encoding.
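
The last point can be illustrated with a small hand-rolled encoder. This is a minimal sketch written against the binary encoding rules in the Avro specification (strings as a zig-zag varint length followed by UTF-8 bytes, enums as the zig-zag varint index of the symbol, record fields concatenated in schema order); it deliberately does not use the Avro library, and the class and method names are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class AvroBinaryLayoutSketch {

    /** Writes a value as a zig-zag encoded variable-length integer. */
    static void writeZigZagVarint(ByteArrayOutputStream out, long value) {
        long zz = (value << 1) ^ (value >> 63); // zig-zag: small magnitudes become small unsigned values
        while ((zz & ~0x7FL) != 0) {
            out.write((int) ((zz & 0x7F) | 0x80)); // low 7 bits, continuation bit set
            zz >>>= 7;
        }
        out.write((int) zz);
    }

    /** Writes a string as its UTF-8 byte length followed by the bytes. */
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeZigZagVarint(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    /** Encodes a SimpleV1 record: field "name" (string), then field "status" (enum index). */
    static byte[] encodeSimpleV1(String name, int statusIndex) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, name);
        writeZigZagVarint(out, statusIndex);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // {"name":"A","status":"ON"}; "ON" is symbol index 0
        byte[] bytes = encodeSimpleV1("A", 0);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println(hex.toString().trim()); // prints "02 41 00"
    }
}
```

The three bytes contain no record name, enum name, or namespace. The names in the exception message therefore most likely come from comparing the writer schema passed to the reader with the reader schema, not from the serialized data.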



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
