Hi Youcef,

Glad you found my old blog post on Avro schema evolution :)

I encourage you to try a simple example, which will make it clearer: 
https://gist.github.com/ept/5fd7c625969248b31e73 

In this example, the writer's schema has a union of null, string and long, 
whereas the reader's schema only has a union of null and string. The field has 
a default value of null. If the record has a null or string value, the reader 
parses it correctly. If the record has a long value, the reader throws an 
exception, because long is not one of the branches in the union it expects.
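
To make the behaviour concrete without running the gist, here is a minimal 
sketch in plain Python. It is a simplified model of the resolution rule 
described above, not the Avro library: the names read_union, branch_of and 
UnionResolutionError are made up for illustration, and type promotion is 
ignored.

```python
# Simplified model of union resolution; NOT the Avro library's API.
# Assumption: branches are matched by exact type name, promotion is ignored.

WRITER_UNION = ["null", "string", "long"]
READER_UNION = ["null", "string"]


class UnionResolutionError(Exception):
    """Raised when the writer's branch has no match in the reader's union."""


def branch_of(value):
    """Return the type name of the union branch a writer would use for value."""
    if value is None:
        return "null"
    if isinstance(value, str):
        return "string"
    if isinstance(value, int) and not isinstance(value, bool):
        return "long"
    raise TypeError(f"unsupported value: {value!r}")


def read_union(value, reader_union, default=None):
    """Resolve a written value against the reader's union.

    The default argument is accepted only to show that it is never consulted:
    an unknown branch is an error, not a trigger for the default.
    """
    branch = branch_of(value)
    if branch not in reader_union:
        raise UnionResolutionError(
            f"branch {branch!r} is not in reader union {reader_union}"
        )
    return value
```

With these definitions, read_union(None, READER_UNION) and 
read_union("hello", READER_UNION) return the value unchanged, while 
read_union(42, READER_UNION, default=None) raises UnionResolutionError.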

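So the default value unfortunately doesn't help here. Avro only uses a field's 
default when the field is missing from the writer's schema altogether, not 
when the writer used a union branch that the reader doesn't know. If you want 
to add a new branch to a union schema, you have to make sure that all the 
readers are updated with the new schema first, and only then should writers 
start generating data with the new schema.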
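
The rollout order can be expressed as a simple check to run before switching 
writers over. This is only a sketch under assumptions: the function names and 
the reader names ("billing", "search") are hypothetical, and unions are 
compared by branch name only.

```python
# Hypothetical pre-deployment check; not part of Avro or of any real tool.

def missing_branches(writer_union, reader_union):
    """Branches the writer may emit that the reader cannot resolve."""
    return [branch for branch in writer_union if branch not in reader_union]


def safe_to_deploy_writer(writer_union, reader_unions):
    """True only if every reader already knows every branch of the writer."""
    return all(
        not missing_branches(writer_union, reader_union)
        for reader_union in reader_unions.values()
    )


new_writer_union = ["null", "string", "long"]

# Hypothetical readers: one still on the old schema, one already updated.
readers = {
    "billing": ["null", "string"],
    "search": ["null", "string", "long"],
}

# Writers must wait while any reader is still missing the new branch.
writers_may_switch_before = safe_to_deploy_writer(new_writer_union, readers)

# After the last reader has been updated, writers may switch.
readers["billing"] = ["null", "string", "long"]
writers_may_switch_after = safe_to_deploy_writer(new_writer_union, readers)
```

Here writers_may_switch_before is False and writers_may_switch_after is True, 
which matches the order described above: readers first, then writers.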

Hope that helps.
Martin


> On 7 Dec 2015, at 22:15, HILEM Youcef <[email protected]> wrote:
> 
> Hi,
>  
> At La Poste Pôle Colis we use Avro in our new reactive architecture (kafka, 
> spark streaming, Cassandra, elasticsearch, play framework).
>  
> In our modeling we used a union type for the body attribute, to bring 
> together in one schema all trace events of a package (arrival, departure, 
> transportation, ...).
>  
> Example :
> {
>   "namespace" : "fr.laposte.colis.schema.pivot.message",
>   "name" : "Message",
>   "type" : "record",
>   "doc" : "Cette structure défini les caractéristiques de base d'un message. Peut(doit) être spécialisée pour un usage particulier",
>   "fields" : [
>     {
>       "name" : "header",
>       "type" : "fr.laposte.colis.schema.pivot.common.message.MessageHeader",
>       "doc" : "Entête du message"
>     },
>     {
>       "name" : "body",
>       "type" : [
>         "fr.laposte.colis.schema.pivot.announcement.AnnouncementEventMessageBody",
>         "fr.laposte.colis.schema.pivot.delivery.DeliveryEventMessageBody",
>         "fr.laposte.colis.schema.pivot.handling.HandlingEventMessageBody",
>         "fr.laposte.colis.schema.pivot.crm.CrmEventMessageBody",
>         "fr.laposte.colis.schema.pivot.customs.transport.CustomsTransportMessageBody",
>         "fr.laposte.colis.schema.pivot.customs.consignment.CustomsContainerEventMessageBody",
>         "fr.laposte.colis.schema.pivot.customs.consignment.CustomsParcelEventMessageBody",
>         "fr.laposte.colis.schema.pivot.rest.common.Rest",
>         "fr.laposte.colis.schema.pivot.reject.RejectMessageBody",
>         "fr.laposte.colis.schema.pivot.dpmo.defectrequest.DefectRequestEventMessageBody",
>         "fr.laposte.colis.schema.pivot.dpmo.defectresult.DefectResultEventMessageBody",
>         "fr.laposte.colis.schema.timeout.TimeoutMessageBody",
>         "fr.laposte.colis.schema.notification.Notification"
>       ],
>       "doc" : "Abstraction du corps de message. Peut-être substitué par tout type dérivé du type MessageBody"
>     }
>   ]
> }
>  
> However, as is well explained at 
> https://martin.kleppmann.com/2012/12/05/schema-evolution-in-avro-protocol-buffers-thrift.html :
> “Union types are powerful, but you must take care when changing them. If 
> you want to add a type to a union, you first need to update all readers with 
> the new schema, so that they know what to expect. Only once all readers are 
> updated, the writers may start putting this new type in the records they 
> generate”
>  
> My question: is a default value for the field “body” sufficient, so that if 
> the reader encounters a union branch it does not know about, it can 
> substitute the default value (see 
> http://grokbase.com/t/avro/user/11b3bn6r6z/does-extending-union-break-compatibility)?
>  
> Thank you in advance for your help.
