Matthieu,

Thanks for the example.

First, is this really an alias, or is it something else?  In other
words, would a reader ever map a written Vehicle to a Bus?  If the use
cases are exclusive, perhaps we should call it something different
rather than overload the alias concept?

Second, would the existing alias implementation, which rewrites the
writer's schema, work here?  It would result in a union with two
different Vehicle records.  That could probably be made to work, but any other
references to the Vehicle schema might become ambiguous.  I suspect
the implementation may end up being quite different.
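
To make the concern concrete, here is a minimal Python sketch; it is not
Avro's actual resolution code. It assumes a toy model in which a union is a
list of record dicts, and a hypothetical helper, rewrite_with_writer_aliases,
that renames any writer branch the reader does not know to the first of its
aliases the reader does know. The schemas follow the example quoted below.

```python
from collections import Counter

# Toy model (an assumption for illustration): a union is a list of record dicts.
WRITER_UNION = [
    {"type": "record", "name": "Vehicle",
     "fields": [{"name": "id", "type": "long"}]},
    {"type": "record", "name": "Car",
     "fields": [{"name": "id", "type": "long"},
                {"name": "selfDriving", "type": "boolean"}]},
    {"type": "record", "name": "Bus", "aliases": ["Vehicle"],
     "fields": [{"name": "id", "type": "long"},
                {"name": "capacity", "type": "int"}]},
]


def rewrite_with_writer_aliases(writer_union, reader_names):
    """Hypothetical rewrite: rename writer branches unknown to the reader
    to the first of their own aliases that the reader knows."""
    rewritten = []
    for branch in writer_union:
        name = branch["name"]
        if name not in reader_names:
            for alias in branch.get("aliases", []):
                if alias in reader_names:
                    name = alias
                    break
        # Copy the branch so the original writer schema is left untouched.
        rewritten.append({**branch, "name": name})
    return rewritten


# An old reader only knows Vehicle and Car.
rewritten = rewrite_with_writer_aliases(WRITER_UNION, {"Vehicle", "Car"})
names = [branch["name"] for branch in rewritten]
duplicates = [n for n, count in Counter(names).items() if count > 1]

print(names)       # ['Vehicle', 'Car', 'Vehicle']
print(duplicates)  # ['Vehicle']
```

After the rewrite the union holds two branches named Vehicle with different
fields, so a reference to "Vehicle" elsewhere in the schema no longer
identifies a single record.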

Aliases currently mean "formerly known as"; this feature seems more
like "a kind of".

Doug

On Sat, Jun 11, 2016 at 7:43 PM, Matthieu Monsch <mon...@alum.mit.edu> wrote:
> Happy to provide an example. Let’s assume that we have a Kafka producer 
> emitting the following values:
> union {
>   record Vehicle {
>     int id;
>   },
>   record Car {
>     int id;
>     boolean selfDriving;
>   }
> }
> At a later point in time, a new vehicle becomes supported by the system and 
> must be added to the schema:
>
> union {
>   record Vehicle {
>     long id;
>   },
>   record Car {
>     long id;
>     boolean selfDriving;
>   },
>   @aliases(["Vehicle"]) // Ignored when on the producer's schema.
>   record Bus {
>     long id;
>     int capacity;
>   }
> }
> We would like to be able to deploy the change to the producer without having 
> to migrate all the consumers: existing consumers would treat each Bus as a 
> Vehicle until they upgrade.
>
> However we can't do so under the current evolution rules since the alias is 
> ignored (it would work if we added the alias to each consumer's schema but 
> this isn't practical since it would also require a global migration). Note 
> also that we can't preemptively add aliases on the consumers since the names 
> of the records aren't known beforehand.
>
> Allowing the consumers (readers) to use the producer's (writer’s) aliases 
> would fix this. If we make sure that writer aliases are used last (for 
> example only falling back to them if neither the names nor the consumers' 
> aliases match), this doesn't change any of the current allowed evolution 
> rules and expands them to support additional cases (without introducing any 
> new syntax).
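
The lookup order described above (names, then reader aliases, then writer
aliases) could be sketched roughly as follows. This is a minimal Python
illustration under stated assumptions, not Avro's implementation: records are
plain dicts, and resolve_branch is a hypothetical helper that returns the name
of the reader record a written record would resolve to.

```python
def resolve_branch(writer_record, reader_records):
    """Return the name of the matching reader record, or None.

    Hypothetical helper illustrating the proposed order:
    1. exact name match, 2. reader aliases, 3. writer aliases (new).
    """
    writer_name = writer_record["name"]

    # 1. Names: the reader has a record with the same name.
    for reader in reader_records:
        if reader["name"] == writer_name:
            return reader["name"]

    # 2. Reader aliases: a reader record lists the writer's name as an alias.
    for reader in reader_records:
        if writer_name in reader.get("aliases", []):
            return reader["name"]

    # 3. Writer aliases (the proposed fallback): the writer's record lists
    #    the name of a record that the reader knows.
    for alias in writer_record.get("aliases", []):
        for reader in reader_records:
            if reader["name"] == alias:
                return reader["name"]

    return None


old_reader = [{"name": "Vehicle"}, {"name": "Car"}]
new_reader = [{"name": "Vehicle"}, {"name": "Car"}, {"name": "Bus"}]
bus = {"name": "Bus", "aliases": ["Vehicle"]}

print(resolve_branch(bus, old_reader))  # Vehicle
print(resolve_branch(bus, new_reader))  # Bus
```

Because the writer's aliases are consulted last, a reader that already knows
Bus still resolves it by name, and only readers that have not yet upgraded
fall back to Vehicle.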
>
> Does this make sense?
>
> -Matthieu
>
> PS: In case it’s more readable, this example can also be read here:
> https://gist.github.com/mtth/527318445e5b52bfd491c0483ff5f9d3
>
>
>
>> On Jun 10, 2016, at 2:00 PM, Doug Cutting <cutt...@gmail.com> wrote:
>>
>> Matthieu,
>>
>> Can you please provide an example of how this would work?
>>
>> Thanks,
>>
>> Doug
>>
>> On Thu, Jun 9, 2016 at 6:47 PM, Matthieu Monsch <mon...@alum.mit.edu> wrote:
>>
>>> Thinking about this a bit more (and a couple months later…), maybe there
>>> is a simpler alternative.
>>>
>>> Currently, a reason why writer evolution is hard (the union issue
>>> described below is a special case of this) is that aliases are only used on
>>> the reader side. Why not also allow readers to use the writer’s aliases?
>>>
>>> Resolution would first be done on names, then fall back to reader aliases,
>>> and finally fall back to writer aliases. In the example below, it would be
>>> enough to add an alias to the base record inside any new records to have
>>> evolution work.
>>>
>>> -Matthieu
>>>
>>>
>>>
>>>> On Apr 22, 2016, at 8:42 AM, Matthieu Monsch <mon...@alum.mit.edu> wrote:
>>>>
>>>> The second solution sounds like a great alternative.
>>>>
>>>> Branch aliases are more straightforward than an implicit order-sensitive
>>>> policy. They also have the additional benefit of giving users a bit more
>>>> flexibility: since defaults are specified on the branches’ types, it is
>>>> possible to have different branches have different defaults inside the same
>>>> union. There are probably a few edge cases (e.g. allowing multiple such
>>>> aliases would be useful) but they should be simple to address.
>>>>
>>>> What would be a good attribute name for this? `baseTypes`?
>>>>
>>>> -Matthieu
>>>>
>>>>
>>>>
>>>>> On Apr 21, 2016, at 10:52 AM, Doug Cutting <cutt...@gmail.com> wrote:
>>>>>
>>>>> On Wed, Apr 20, 2016 at 9:09 PM, Ryan Blue <rb...@netflix.com.invalid> wrote:
>>>>>> Making the default a property of an inner schema makes me think that we
>>>>>> will have to deal with multiple schemas with such a label at some point.
>>>>>
>>>>> On Thu, Apr 21, 2016 at 6:54 AM, Matthieu Monsch <mon...@alum.mit.edu> wrote:
>>>>>> Delegating default selection to the branches themselves is a great idea
>>>>>> but it will be tricky to handle reference branches smoothly. More minor
>>>>>> but it also doesn’t feel intuitive to not have the union “own” its
>>>>>> default attribute.
>>>>>
>>>>> If I understand your concerns correctly, I attempted to address this
>>>>> above:
>>>>>
>>>>> "Note however that, when using a record as the default branch, one
>>>>> could not then use that same record as a non-default branch in
>>>>> another union.  To ameliorate that, we might permit multiple default
>>>>> branches in a union to be specified as default with the convention
>>>>> that the first such is used."
>>>>>
>>>>> Does that make sense?
>>>>>
>>>>> This isn't ideal syntax, but it's not terrible, and it doesn't change
>>>>> schema syntax incompatibly, which seems important, especially when it's
>>>>> unlikely that all implementations would implement such a syntax change
>>>>> in a synchronized manner.
>>>>>
>>>>> Alternately, one might annotate each derived record with the name of
>>>>> its base record, then one wouldn't need to alter union definitions.
>>>>> This would work like an alias.  If a record doesn't exist in the
>>>>> reader's schema, then an alias to the missing record would be added in
>>>>> the reader's schema to the base record it names in the writer's
>>>>> schema.  Aliases work by rewriting the writer's schema at read-time,
>>>>> updating names, including those in unions.  Might that work?  It seems
>>>>> like perhaps a more elegant approach.  It has compatible syntax and
>>>>> only alters behavior of a case that fails today.
>>>>>
>>>>> Doug
>>>>
>>>
>>>
>
