Just realized I forgot to change the `id` fields to long in the first union (all IDs should be longs). Apologies for the confusion; the ID types don’t matter at all in the example.
> On Jun 11, 2016, at 7:43 PM, Matthieu Monsch <mon...@alum.mit.edu> wrote:
>
> Happy to provide an example. Let’s assume that we have a Kafka producer
> emitting the following values:
>
> union {
>   record Vehicle {
>     int id;
>   },
>   record Car {
>     int id;
>     boolean selfDriving;
>   }
> }
>
> At a later point in time, a new vehicle becomes supported by the system and
> must be added to the schema:
>
> union {
>   record Vehicle {
>     long id;
>   },
>   record Car {
>     long id;
>     boolean selfDriving;
>   },
>   @aliases(["Vehicle"]) // Ignored when on the producer's schema.
>   record Bus {
>     long id;
>     int capacity;
>   }
> }
>
> We would like to be able to deploy the change to the producer without having
> to migrate all the consumers: existing consumers would treat each Bus as a
> Vehicle until they upgrade.
>
> However we can't do so under the current evolution rules since the alias is
> ignored (it would work if we added the alias to each consumer's schema but
> this isn't practical since it would also require a global migration). Note
> also that we can't preemptively add aliases on the consumers since the names
> of the records aren't known beforehand.
>
> Allowing the consumers (readers) to use the producer's (writer’s) aliases
> would fix this. If we make sure that writer aliases are used last (for
> example only falling back to them if neither the names nor the consumers'
> aliases match), this doesn't change any of the current allowed evolution
> rules and expands them to support additional cases (without introducing any
> new syntax).
>
> Does this make sense?
>
> -Matthieu
>
> Ps: In case it’s more readable, this example can also be read here:
> https://gist.github.com/mtth/527318445e5b52bfd491c0483ff5f9d3
>
>> On Jun 10, 2016, at 2:00 PM, Doug Cutting <cutt...@gmail.com> wrote:
>>
>> Matthieu,
>>
>> Can you please provide an example of how this would work?
>>
>> Thanks,
>>
>> Doug
>>
>> On Thu, Jun 9, 2016 at 6:47 PM, Matthieu Monsch <mon...@alum.mit.edu> wrote:
>>
>>> Thinking about this a bit more (and a couple months later…), maybe there
>>> is a simpler alternative.
>>>
>>> Currently, a reason why writer evolution is hard (the union issue
>>> described below is a special case of this) is that aliases are only used on
>>> the reader side. Why not also allow readers to use the writer’s aliases?
>>>
>>> Resolution would first be done on names, then fall back to reader aliases,
>>> and finally fall back to writer aliases. In the example below, it would be
>>> enough to add an alias to the base record inside any new records to have
>>> evolution work.
>>>
>>> -Matthieu
>>>
>>>> On Apr 22, 2016, at 8:42 AM, Matthieu Monsch <mon...@alum.mit.edu> wrote:
>>>>
>>>> The second solution sounds like a great alternative.
>>>>
>>>> Branch aliases are more straightforward than an implicit order-sensitive
>>>> policy. They also have the additional benefit of giving users a bit more
>>>> flexibility: since defaults are specified on the branches’ types, it is
>>>> possible to have different branches have different defaults inside the same
>>>> union. There are probably a few edge cases (e.g. allowing multiple such
>>>> aliases would be useful) but they should be simple to address.
>>>>
>>>> What would be a good attribute name for this? `baseTypes`?
>>>>
>>>> -Matthieu
>>>>
>>>>> On Apr 21, 2016, at 10:52 AM, Doug Cutting <cutt...@gmail.com> wrote:
>>>>>
>>>>> On Wed, Apr 20, 2016 at 9:09 PM, Ryan Blue <rb...@netflix.com.invalid> wrote:
>>>>>> Making the default a property of an
>>>>>> inner schema makes me think that we will have to deal with multiple schemas
>>>>>> with such a label at some point.
>>>>>
>>>>> On Thu, Apr 21, 2016 at 6:54 AM, Matthieu Monsch <mon...@alum.mit.edu> wrote:
>>>>>> Delegating default selection to the branches themselves is a great idea but it
>>>>>> will be tricky to handle reference branches smoothly. More minor but it also
>>>>>> doesn’t feel intuitive to not have the union “own” its default attribute.
>>>>>
>>>>> If I understand your concerns correctly, I attempted to address this above:
>>>>>
>>>>> "Note however that, when using a record as the default branch, one
>>>>> could not then use that same record as a non-default branch in another
>>>>> union. To ameliorate that, we might permit multiple default branches in
>>>>> a union to be specified as default with the convention that the first
>>>>> such is used."
>>>>>
>>>>> Does that make sense?
>>>>>
>>>>> This isn't ideal syntax, but it's not terrible, and it doesn't change
>>>>> schema syntax incompatibly, which seems important, especially when it's
>>>>> unlikely that all implementations would implement such a syntax change
>>>>> in a synchronized manner.
>>>>>
>>>>> Alternately, one might annotate each derived record with the name of
>>>>> its base record, then one wouldn't need to alter union definitions.
>>>>> This would work like an alias. If a record doesn't exist in the
>>>>> reader's schema, then an alias to the missing record would be added in
>>>>> the reader's schema to the base record it names in the writer's
>>>>> schema. Aliases work by rewriting the writer's schema at read-time,
>>>>> updating names, including those in unions. Might that work? It seems
>>>>> like perhaps a more elegant approach. It has compatible syntax and
>>>>> only alters behavior of a case that fails today.
>>>>>
>>>>> Doug
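
For readers following the thread, here is a minimal Python sketch of the resolution order proposed above: match on names first, then on reader aliases, and only then on writer aliases. The `Record` class and `resolve_branch` function are hypothetical names made up for illustration and are not part of any Avro library. The sketch only picks a union branch by name; it ignores namespaces and the field-level resolution a real implementation would still have to perform.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a named record schema (not an Avro class)."""
    name: str
    aliases: tuple = ()


def resolve_branch(writer: Record, reader_branches: Sequence[Record]) -> Optional[Record]:
    """Pick the reader union branch used to decode a value written as `writer`."""
    # 1. Match on names.
    for reader in reader_branches:
        if reader.name == writer.name:
            return reader
    # 2. Fall back to the reader's aliases.
    for reader in reader_branches:
        if writer.name in reader.aliases:
            return reader
    # 3. Proposed addition: fall back to the writer's aliases, in declaration order.
    for alias in writer.aliases:
        for reader in reader_branches:
            if reader.name == alias:
                return reader
    # No branch matches; resolution fails.
    return None


# Producer adds Bus, aliased to Vehicle; the consumer still has the old union.
old_consumer = [Record("Vehicle"), Record("Car")]
bus = Record("Bus", aliases=("Vehicle",))
print(resolve_branch(bus, old_consumer).name)  # prints "Vehicle"
```

Because writer aliases are consulted last, a consumer that later adds its own `Bus` branch matches it by name in step 1, so upgrading a consumer takes precedence over the fallback.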