[ 
https://issues.apache.org/jira/browse/AVRO-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15491022#comment-15491022
 ] 

Ryan Blue commented on AVRO-1915:
---------------------------------

These schemas do look compatible to me. When reading fails in a case like 
this, the usual cause is that the datum reader was not given both the writer's 
schema and the reader's schema. The writer's schema tells the reader which 
fields are actually present in the data; fields that exist only in the 
reader's schema are then filled in from their defaults. When you construct 
your datum reader, pass both schemas, like this:

{code:lang=java}
DatumReader<GenericRecord> reader =
    GenericData.get().createDatumReader(writerSchema, readerSchema);
{code}
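
Since the report mentions JSON-encoded messages, here is a minimal sketch of 
the same idea with the JSON decoder. It assumes the decoder is created with 
the writer's schema and the datum reader with both schemas. The class name, 
the cut-down PayloadContext schemas, and the sample message are made up for 
illustration and are not taken from the reporter's code:

{code:lang=java}
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;

public class JsonSchemaResolution {
  // Hypothetical v1 (writer) schema, much smaller than the one in the report.
  static final Schema WRITER = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"PayloadContext\",\"fields\":["
      + "{\"name\":\"source\",\"type\":\"string\"},"
      + "{\"name\":\"requestId\",\"type\":\"string\"}]}");

  // Hypothetical v2 (reader) schema: adds newField with a default of "".
  static final Schema READER = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"PayloadContext\",\"fields\":["
      + "{\"name\":\"source\",\"type\":\"string\"},"
      + "{\"name\":\"requestId\",\"type\":\"string\"},"
      + "{\"name\":\"newField\",\"type\":\"string\",\"default\":\"\"}]}");

  static GenericRecord decode(String json) throws IOException {
    // Parse the JSON text with the schema it was written with (v1).
    Decoder decoder = DecoderFactory.get().jsonDecoder(WRITER, json);
    // Resolve writer -> reader; fields missing from v1 get their defaults.
    DatumReader<GenericRecord> reader =
        new GenericDatumReader<GenericRecord>(WRITER, READER);
    return reader.read(null, decoder);
  }

  public static void main(String[] args) throws IOException {
    // A v1 message: it has no newField.
    GenericRecord record =
        decode("{\"source\":\"serviceA\",\"requestId\":\"42\"}");
    System.out.println(record);
  }
}
{code}

If the JSON decoder is instead created with the reader's schema, it will look 
for newField in the JSON text itself, which would be consistent with the 
"Expected string. Got END_OBJECT" error in the report.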

Also, the 1.8.2 release is coming up and we've added some support for this use 
case. In the upcoming release, you'll be able to do this:

{code:lang=java}
MessageEncoder<Record> v1Encoder =
    new BinaryMessageEncoder<Record>(GenericData.get(), SCHEMA_V1);
BinaryMessageDecoder<Record> v2Decoder =
    new BinaryMessageDecoder<Record>(GenericData.get(), SCHEMA_V2);
// add the older version to the decoder so it can handle both v1 and v2
// messages (you can do this with as many as you need)
v2Decoder.addSchema(SCHEMA_V1);

// encode the v1 record (on the producer side)
ByteBuffer v1Buffer = v1Encoder.encode(v1record);

// decode the v1 record to the expected v2 schema (on the consumer side)
Record v2record = v2Decoder.decode(v1Buffer);
{code}

> AvroTypeException decoding from earlier schema version
> ------------------------------------------------------
>
>                 Key: AVRO-1915
>                 URL: https://issues.apache.org/jira/browse/AVRO-1915
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.7.7
>            Reporter: NPE
>
> We have two services which communicate with one another by sending 
> JSON-encoded Avro-based messages over Kafka.  We want to update the schema 
> for messages sent from service A to service B by adding an additional string 
> field with a default value of "" (empty string).  We have tested by initially 
> adding the updated schema to service B (the reader) and continuing to send 
> messages in the older format from service A (the writer).
> Simplified example of old schema (some fields omitted):
> {code}
> {
>       "type": "record",
>       "name": "Envelope",
>       "fields": [{
>               "name": "appId",
>               "type": "string"
>       }, {
>               "name": "time",
>               "type": "long"
>       }, {
>               "name": "type",
>               "type": "string"
>       }, {
>               "name": "payload",
>               "type": [{
>                       "type": "record",
>                       "name": "MessagePayload",
>                       "fields": [{
>                               "name": "context",
>                               "type": {
>                                       "type": "record",
>                                       "name": "PayloadContext",
>                                       "fields": [{
>                                               "name": "source",
>                                               "type": "string"
>                                       }, {
>                                               "name": "requestId",
>                                               "type": "string"
>                                       }]
>                               }
>                       }, {
>                               "name": "content",
>                               "type": "string"
>                       }, {
>                               "name": "contentType",
>                               "type": "string"
>                       }]
>               }]
>       }]
> }
> {code}
> Simplified example of new schema (some fields omitted):
> {code}
> {
>       "type": "record",
>       "name": "Envelope",
>       "fields": [{
>               "name": "appId",
>               "type": "string"
>       }, {
>               "name": "time",
>               "type": "long"
>       }, {
>               "name": "type",
>               "type": "string"
>       }, {
>               "name": "payload",
>               "type": [{
>                       "type": "record",
>                       "name": "MessagePayload",
>                       "fields": [{
>                               "name": "context",
>                               "type": {
>                                       "type": "record",
>                                       "name": "PayloadContext",
>                                       "fields": [{
>                                               "name": "source",
>                                               "type": "string"
>                                       }, {
>                                               "name": "requestId",
>                                               "type": "string"
>                                       }, {
>                                               "name": "newField",
>                                               "type": "string",
>                                               "default": ""
>                                       }]
>                               }
>                       }, {
>                               "name": "content",
>                               "type": "string"
>                       }, {
>                               "name": "contentType",
>                               "type": "string"
>                       }]
>               }]
>       }]
> }
> {code}
> Our understanding was that the reader, with the newer schema, should be able 
> to parse messages sent with the older given the default value for the missing 
> field; however, we are getting the following exception:
> {code}
> org.apache.avro.AvroTypeException: Expected string. Got END_OBJECT
> {code}
> Are we missing something here?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
