One approach you could use: instead of a single union, define a separate field
for each possible message type, and make each field a union with null (with a
default value of null). Then fill in only the field for the corresponding
message type. If you do this, a reader using an old version of the schema will
simply see all fields as null (rather than throwing an exception) when it
encounters an unknown message type.
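
As a rough illustration of that layout, here is a small Python sketch that builds such a schema as plain JSON. The short type names, the field names and the helper functions are made up for the example; the real schema would use the fully-qualified names, and the referenced record types are assumed to be defined elsewhere.

```python
import json

# Hypothetical short names, for illustration only. The real schema would use
# fully-qualified names such as
# fr.laposte.colis.schema.pivot.delivery.DeliveryEventMessageBody.
BODY_TYPES = {
    "announcement": "AnnouncementEventMessageBody",
    "delivery": "DeliveryEventMessageBody",
    "handling": "HandlingEventMessageBody",
}


def optional_field(name, type_name):
    """One nullable field per message type: a union with null, default null."""
    # "null" is listed first: the Avro spec has traditionally required the
    # default value to match the first branch of the union.
    return {"name": name, "type": ["null", type_name], "default": None}


def build_message_schema(body_types):
    """Assemble the Message record with a header plus one field per type."""
    fields = [{"name": "header", "type": "MessageHeader"}]
    fields += [optional_field(n, t) for n, t in body_types.items()]
    return {"type": "record", "name": "Message", "fields": fields}


schema = build_message_schema(BODY_TYPES)
print(json.dumps(schema, indent=2))
```

Adding a new message type then means adding one more nullable field, which an old reader simply ignores.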

Another possibility: you can always use the writer schema to decode the data, 
and use the "generic" (dynamically typed) interface for accessing the data. In 
that case, schema evolution is handled by the application code.
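
To make "handled by the application code" a bit more concrete, here is a minimal Python sketch. It assumes the generic decoder hands each record to the application as a plain dict; the field names, `EXPECTED_DEFAULTS` and `adapt` are hypothetical and not part of any Avro API.

```python
# Fields this version of the application understands, with the value to
# assume when the writer's schema did not contain them (hypothetical names).
EXPECTED_DEFAULTS = {"header": None, "delivery": None, "handling": None}


def adapt(record):
    """Map a generically decoded record onto the fields this code knows."""
    # Known fields: take the written value, or fall back to our own default.
    known = {name: record.get(name, default)
             for name, default in EXPECTED_DEFAULTS.items()}
    # Fields from a newer writer schema are set aside instead of failing.
    unknown = {k: v for k, v in record.items() if k not in EXPECTED_DEFAULTS}
    return known, unknown


known, unknown = adapt({
    "header": {"id": 1},
    "delivery": {"status": "ok"},
    "customs": {"code": "X"},  # written by a newer schema version
})
```

The point is only that the decision about missing or unexpected fields moves out of Avro's schema resolution and into code like this.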

Putting binary Avro blobs in the database is absolutely fine, as long as you 
attach a schema version to every blob (so that you know the writer schema with 
which it was encoded). You can keep the schemas in a separate database table.
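
One possible way to attach the version, sketched in Python under some assumptions: the version is a 4-byte big-endian prefix on the blob, and the `SCHEMAS` dict stands in for the separate schemas table. The names `wrap` and `unwrap` are invented for the example, and the payload is treated as opaque bytes.

```python
import struct

# Stand-in for the separate schemas table: version -> writer schema (JSON).
SCHEMAS = {
    1: '{"type": "record", "name": "Message", "fields": []}',
}

# Assumed framing: a 4-byte big-endian schema version before the Avro bytes.
HEADER = struct.Struct(">I")


def wrap(version, avro_bytes):
    """Prefix an encoded record with the version of its writer schema."""
    if version not in SCHEMAS:
        raise KeyError(f"unknown schema version {version}")
    return HEADER.pack(version) + avro_bytes


def unwrap(blob):
    """Return (writer schema JSON, Avro payload) for a stored blob."""
    (version,) = HEADER.unpack_from(blob)
    return SCHEMAS[version], blob[HEADER.size:]


blob = wrap(1, b"\x02\x06foo")
writer_schema, payload = unwrap(blob)
```

Storing the version in its own column next to the blob would work just as well; what matters is that every row can be traced back to its writer schema.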

Martin

> On 15 Dec 2015, at 16:38, HILEM Youcef <youcef.hi...@laposte.fr> wrote:
> 
> Hi Martin,
>  
> Thank you for your clear answer.
> I will test the example you provide.
> In that case, storing binary Avro as a blob in a database is strongly
> discouraged.
> It is very difficult, if not impossible, to deserialize all rows with a
> single reader.
> Best regards.
> Youcef.
>  
> From: Martin Kleppmann [mailto:mar...@kleppmann.com]
> Sent: Monday, 14 December 2015 22:46
> To: <user@avro.apache.org>
> Subject: Re: add a type to a union
>  
> Hi Youcef,
>  
> Glad you found my old blog post on Avro schema evolution :)
>  
> I encourage you to try a simple example, which will make it clearer: 
> https://gist.github.com/ept/5fd7c625969248b31e73 
>  
> In this example, the writer has a union of null, string and long, whereas the 
> reader only has a union of null and string. A default value of null is set. 
> If the record has a null or string value, it is correctly parsed by the 
> reader. If the record has a long value, the reader throws an exception, 
> because it is not one of the union datatypes it is expecting.
>  
> So the default value unfortunately doesn't help here. If you want to add a 
> new branch to a union schema, you have to make sure that all the readers are 
> updated with the new schema first, and only then should writers start 
> generating data with the new schema.
>  
> Hope that helps.
> Martin
>  
>  
> On 7 Dec 2015, at 22:15, HILEM Youcef <youcef.hi...@laposte.fr> wrote:
>  
> Hi,
>  
> At La Poste Pôle Colis we use Avro in our new reactive architecture (Kafka,
> Spark Streaming, Cassandra, Elasticsearch, Play Framework).
>  
> In our model we use a union type to bring together, in a single schema, all
> the tracking events of a parcel (arrival, departure, transportation, ...) in
> the body attribute.
>  
> Example:
> {
>   "namespace": "fr.laposte.colis.schema.pivot.message",
>   "name": "Message",
>   "type": "record",
>   "doc": "This structure defines the basic characteristics of a message. It can (must) be specialized for a particular use",
>   "fields": [
>     {
>       "name": "header",
>       "type": "fr.laposte.colis.schema.pivot.common.message.MessageHeader",
>       "doc": "Message header"
>     },
>     {
>       "name": "body",
>       "type": [
>         "fr.laposte.colis.schema.pivot.announcement.AnnouncementEventMessageBody",
>         "fr.laposte.colis.schema.pivot.delivery.DeliveryEventMessageBody",
>         "fr.laposte.colis.schema.pivot.handling.HandlingEventMessageBody",
>         "fr.laposte.colis.schema.pivot.crm.CrmEventMessageBody",
>         "fr.laposte.colis.schema.pivot.customs.transport.CustomsTransportMessageBody",
>         "fr.laposte.colis.schema.pivot.customs.consignment.CustomsContainerEventMessageBody",
>         "fr.laposte.colis.schema.pivot.customs.consignment.CustomsParcelEventMessageBody",
>         "fr.laposte.colis.schema.pivot.rest.common.Rest",
>         "fr.laposte.colis.schema.pivot.reject.RejectMessageBody",
>         "fr.laposte.colis.schema.pivot.dpmo.defectrequest.DefectRequestEventMessageBody",
>         "fr.laposte.colis.schema.pivot.dpmo.defectresult.DefectResultEventMessageBody",
>         "fr.laposte.colis.schema.timeout.TimeoutMessageBody",
>         "fr.laposte.colis.schema.notification.Notification"
>       ],
>       "doc": "Abstraction of the message body. Can be substituted by any type derived from the MessageBody type"
>     }
>   ]
> }
>  
> However, as is well explained at
> https://martin.kleppmann.com/2012/12/05/schema-evolution-in-avro-protocol-buffers-thrift.html:
> “Union types are powerful, but you must take care when changing them. If
> you want to add a type to a union, you first need to update all readers with
> the new schema, so that they know what to expect. Only once all readers are
> updated, the writers may start putting this new type in the records they
> generate.”
>  
> My question: is a default value for the field “body” sufficient, so that if
> the reader encounters a union branch it does not know about, it can
> substitute the default value (see
> http://grokbase.com/t/avro/user/11b3bn6r6z/does-extending-union-break-compatibility)?
>  
> Thank you in advance for your help.
>  
