[
https://issues.apache.org/jira/browse/AVRO-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14485583#comment-14485583
]
Ryan Blue commented on AVRO-1661:
---------------------------------
Those schemas look compatible to me, but there could still be a problem with
either the data file or with how you're reading it. To check a data file, you
can use {{avro-tools}} to cat its contents as JSON. But it looks like you're
embedding Avro records in Kafka messages, so that might not be an option, and
it makes another problem more likely: you're not setting the writer schema.
If the data is valid and not truncated, you need to set the new schema as the
reader schema and the old schema as the writer schema so that Avro will resolve
the two. Avro can't read bytes written with the old schema as though they were
written with the new one. Instead, Avro plans how to read the data: read string
field a, read string field b, fill in the default "na" for field c. If you were
using just the new schema, the plan would be: read string field a, read string
field b, read string field c. That last read would expect a third string that
isn't in the data, which causes the EOF you see.
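To make that concrete, here's a minimal sketch of the resolution path with the generic API (class name and field values are hypothetical): a record is serialized with the old schema, then deserialized with a {{GenericDatumReader}} constructed with *both* the writer schema and the reader schema, so field c comes back filled with its default.

```java
import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class SchemaResolutionSketch {

    static final String OLD_JSON =
        "{\"namespace\":\"com.hello.world\",\"type\":\"record\",\"name\":\"Toto\","
      + "\"fields\":[{\"name\":\"a\",\"type\":[\"string\",\"null\"]},"
      + "{\"name\":\"b\",\"type\":\"string\"}]}";

    static final String NEW_JSON =
        "{\"namespace\":\"com.hello.world\",\"type\":\"record\",\"name\":\"Toto\","
      + "\"fields\":[{\"name\":\"a\",\"type\":[\"string\",\"null\"]},"
      + "{\"name\":\"b\",\"type\":\"string\"},"
      + "{\"name\":\"c\",\"type\":\"string\",\"default\":\"na\"}]}";

    // Write a record with the OLD schema, then read it back with BOTH schemas
    // so Avro resolves old -> new and fills in the default for field c.
    static GenericRecord roundTrip() throws Exception {
        Schema writerSchema = new Schema.Parser().parse(OLD_JSON);
        Schema readerSchema = new Schema.Parser().parse(NEW_JSON);

        // Serialize with the writer (old) schema; values are made up.
        GenericRecord record = new GenericData.Record(writerSchema);
        record.put("a", "foo");
        record.put("b", "bar");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(writerSchema).write(record, encoder);
        encoder.flush();

        // Deserialize: the two-argument constructor is the crucial part.
        GenericDatumReader<GenericRecord> reader =
            new GenericDatumReader<GenericRecord>(writerSchema, readerSchema);
        Decoder decoder = DecoderFactory.get().binaryDecoder(out.toByteArray(), null);
        return reader.read(null, decoder);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("c = " + roundTrip().get("c"));
    }
}
```

With only the new schema passed to the reader (the one-argument constructor), the same bytes produce the EOFException above, because the decoder tries to read a length-prefixed string for c past the end of the buffer.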
> Schema Evolution not working
> -----------------------------
>
> Key: AVRO-1661
> URL: https://issues.apache.org/jira/browse/AVRO-1661
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.7.6, 1.7.7
> Environment: Ubuntu 14.10
> Reporter: Nicolas PHUNG
> Priority: Critical
> Labels: avsc, evolution, schema
>
> This is the Avro Schema (OLD) I was using to write Avro binary data before:
> {noformat}
> {
>   "namespace": "com.hello.world",
>   "type": "record",
>   "name": "Toto",
>   "fields": [
>     {
>       "name": "a",
>       "type": [
>         "string",
>         "null"
>       ]
>     },
>     {
>       "name": "b",
>       "type": "string"
>     }
>   ]
> }
> {noformat}
> This is the Avro Schema (NEW) I'm using to read the Avro binary data:
> {noformat}
> {
>   "namespace": "com.hello.world",
>   "type": "record",
>   "name": "Toto",
>   "fields": [
>     {
>       "name": "a",
>       "type": [
>         "string",
>         "null"
>       ]
>     },
>     {
>       "name": "b",
>       "type": "string"
>     },
>     {
>       "name": "c",
>       "type": "string",
>       "default": "na"
>     }
>   ]
> }
> {noformat}
> However, I can't read the old data with the new schema. I get the following
> errors:
> {noformat}
> 15/04/08 17:32:22 ERROR executor.Executor: Exception in task 0.0 in stage 3.0 (TID 3)
> java.io.EOFException
> at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473)
> at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128)
> at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:259)
> at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:272)
> at org.apache.avro.io.ValidatingDecoder.readString(ValidatingDecoder.java:113)
> at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:353)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:157)
> at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193)
> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
> at com.miguno.kafka.avro.AvroDecoder.fromBytes(AvroDecoder.scala:31)
> {noformat}
> From my understanding, I should be able to read the old data with the new
> schema, since it only adds a field with a default value. But it doesn't seem
> to work. Am I doing something wrong?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)