Re: [capnproto] Safety of retroactive unionization

Yaron Minsky Sun, 19 May 2019 13:38:29 -0700

On Sun, May 19, 2019 at 3:41 PM Kenton Varda <[email protected]> wrote:
>
> Hi Yaron,
>
> There actually is a compatibility validator library in C++. If you
> load the old and new schemas into the same SchemaLoader object, it
> will throw an exception if they aren't compatible, according to the
> documented compatibility rules.


Yeah, I've heard that mentioned before. I'm curious what notion of
compatibility it checks.  Forwards, backwards, or both?  e.g., does it
flag the lack of forwards compatibility for retroactive unionization?

> However, I don't think this is as useful as people imagine. It
> wouldn't solve the case you describe, because the validator has no
> way of understanding the implications of the "occupation" field
> being ignored. I think this is an AI-complete problem: you need an
> actual understanding of the semantics of each field to fully analyze
> its backwards-compatibility properties.
>
> In practice, application developers still need to think about all of
> the combinations of new and old servers and clients, and how the
> introduction of a new field will affect them. You'll need to design
> an upgrade strategy on a case-by-case basis. What Cap'n Proto (like
> Protobuf, JSON, etc.) provides is some tools so that you can
> potentially design strategies that don't require upfront
> version/schema negotiation, but can instead be handled with a few
> lines of code in the changed method's implementation or caller. But
> like any tool, the developer has to consider how to use it properly
> in each situation.
>
> Now, maybe there is room for a more powerful framework for detecting
> higher-level incompatibilities. For example, maybe in the use case
> you describe, we could imagine an annotation:
>
>     occupation @2: Text = "any" $nonDefaultMustBeKnown;
>
> Then you could develop some sort of protocol, on top of Cap'n Proto,
> that does an upfront exchange of schemas, and detects that the
> "occupation" field is missing from the server's schema. It could
> then check any message sent to the server to see if `occupation` is
> not set to the default, and if so, generate an error. This could all
> be built on top of Cap'n Proto. I'm not aware of any other
> serialization system building something like this, though. It seems
> complex and I'm not sure if it's really worth it.
>
> I prefer instead to do this sort of thing at the application
> layer. For example, you could have a boolean field in the response
> that indicates if the server recognized the `occupation` field, and
> the client could then discard results that it knows to be bad
> because this field is missing. Or you could define a simple
> application-level version number exchange that happens before making
> any calls, and use the version number in the specific places where
> needed to detect these problems. Or, you can make sure to update
> your server before your client. I find the best answer varies from
> case to case.

Makes sense.  It's exactly this kind of design choice that I'm curious
about.
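
For concreteness, here is a rough Python sketch of the annotation-based
guard quoted above. It is not Cap'n Proto code. The schema table, the
advertised field sets and check_request are all hypothetical names, and
the upfront schema exchange is assumed to have already happened.

```python
# Hypothetical sketch: none of these names are part of Cap'n Proto.
# Field name -> (default value, must_be_known flag), as the client's schema sees it.
CLIENT_SCHEMA = {
    "age": ("", False),
    "emailDomain": ("", False),
    "occupation": ("any", True),  # stands in for $nonDefaultMustBeKnown
}

# Field names each server advertised during the (assumed) upfront schema exchange.
OLD_SERVER_FIELDS = {"age", "emailDomain"}
NEW_SERVER_FIELDS = {"age", "emailDomain", "occupation"}


class UnknownFieldError(Exception):
    """Raised when a request sets a must-be-known field the server lacks."""


def check_request(request, server_fields, schema=CLIENT_SCHEMA):
    """Reject the request if it relies on a field the server would ignore."""
    for name, (default, must_be_known) in schema.items():
        value = request.get(name, default)
        if must_be_known and name not in server_fields and value != default:
            raise UnknownFieldError(
                f"server does not know field {name!r}, but it is set to {value!r}"
            )
    return request
```

A client would call check_request just before sending, so a request that
leaves occupation at "any" still goes through to an old server.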

For what it's worth, much of my own experience is using a messaging
system with no built-in cross-version compatibility.  In this system,
you build compatibility between different versions by writing explicit
upgrade and downgrade functions, along with protocols for negotiating
to the best shared version.  Such a system requires the user to be
very explicit about the semantics of interactions across versions,
thus bypassing the AI-complete problem of figuring out how to approach
version changes.  It's not the most efficient thing, and it requires a
decent amount of boilerplate for the conversions.  But the chief
virtue is that the resulting behavior is pretty easy to reason about.
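
To give a flavor of what I mean, here is a minimal Python sketch of the
explicit style, using the ListMatchingPeople request from earlier in the
thread. This is not our actual system. The version numbers, function
names and dict-based messages are all made up for illustration.

```python
# Hypothetical sketch of explicit version conversion; not a real library.

# Version 1 requests have no occupation filter; version 2 adds one.
def upgrade_v1_to_v2(req):
    """A v1 request means "any occupation" when viewed as v2."""
    return {**req, "occupation": "any"}


def downgrade_v2_to_v1(req):
    """Refuse to drop a filter that v1 cannot express."""
    if req["occupation"] != "any":
        raise ValueError("v1 cannot express an occupation filter")
    return {k: v for k, v in req.items() if k != "occupation"}


UPGRADES = {1: upgrade_v1_to_v2}      # step from version n to n + 1
DOWNGRADES = {2: downgrade_v2_to_v1}  # step from version n to n - 1


def negotiate(ours, theirs):
    """Pick the highest version both sides support."""
    shared = set(ours) & set(theirs)
    if not shared:
        raise ValueError("no shared protocol version")
    return max(shared)


def convert(req, from_version, to_version):
    """Walk the request one version at a time until it reaches to_version."""
    while from_version < to_version:
        req = UPGRADES[from_version](req)
        from_version += 1
    while from_version > to_version:
        req = DOWNGRADES[from_version](req)
        from_version -= 1
    return req
```

The point is that the downgrade function is where the author has to
decide, in code, what happens to a filter the old version cannot express.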

But, the boilerplate issues are enough to make us want to support
capnp-style versioning, which is why we're thinking about all this.
And the application-layer approaches you're describing are similar to
things we're considering, which is comforting.
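
As an example of the kind of thing we have in mind, here is a small
Python sketch of the response-flag approach. It only simulates the two
server versions with plain dicts. The field name occupationRecognized and
the sample data are assumptions for illustration, not anything from a
real schema.

```python
# Hypothetical sketch; the field name `occupationRecognized` is an assumption.
PEOPLE = [
    {"name": "Ann", "emailDomain": "example.com", "occupation": "plumber"},
    {"name": "Bob", "emailDomain": "example.com", "occupation": "baker"},
]


def old_server(request):
    """Predates the occupation field, so it silently ignores it."""
    matches = [p["name"] for p in PEOPLE
               if p["emailDomain"] == request["emailDomain"]]
    return {"people": matches}


def new_server(request):
    """Applies the occupation filter and says so in the response."""
    wanted = request.get("occupation", "any")
    matches = [p["name"] for p in PEOPLE
               if p["emailDomain"] == request["emailDomain"]
               and wanted in ("any", p["occupation"])]
    return {"people": matches, "occupationRecognized": True}


def list_matching_people(server, request):
    """Client wrapper: discard results if the filter was ignored."""
    response = server(request)
    # A field missing from an old server's response reads as its default, False.
    recognized = response.get("occupationRecognized", False)
    if request.get("occupation", "any") != "any" and not recognized:
        raise RuntimeError("server ignored the occupation filter")
    return response["people"]
```

A new client talking to an old server then fails loudly only when it
actually asked for a specific occupation.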

y

> -Kenton
>
> On Sun, May 19, 2019 at 12:14 PM Yaron Minsky <[email protected]> wrote:
>>
>> Thanks to both of you. That all makes a ton of sense.
>>
>> I'm thinking about the use of capnp in an environment where the
>> systems producing and consuming messages are at least sometimes under
>> enough control to make this kind of thing possible/desirable.
>>
>> There are some cases where the native capnp versioning behavior seems
>> highly congenial, and others where I'm less certain.  If it's not too
>> much of a bore, here's another case I've been thinking of that I'm not
>> sure how to handle.
>>
>> Imagine I have an RPC protocol where the request has this form:
>>
>>   struct ListMatchingPeople {
>>       age @0: Text;
>>       emailDomain @1: Text;
>>   }
>>
>> Here, the implied semantics of the RPC is that it should return a list
>> of all people who match the listed criteria.  Now, let's say I decide
>> that I want to extend the RPC to allow people to also filter by
>> occupation, so I add a new field.
>>
>>   struct ListMatchingPeople {
>>       age @0: Text;
>>       emailDomain @1: Text;
>>       occupation @2: Text = "any";
>>   }
>>
>> Note that this has the nice property that the default value of the
>> field has the same semantics as just omitting the field, so if an old
>> client sends a message to a new RPC server, it will get the behavior
>> that would be expected.
>>
>> The reverse versioning story doesn't work out that well, though. If I
>> send a message from a new client to an old server, then any occupation
>> specified by the new client will be unceremoniously ignored.  You
>> might prefer the behavior of having the new message be rejected when a
>> non-default value for occupation was sent, but I think there's no way
>> to implement that within capnp.
>>
>> Again, I'm curious in practice how people deal with this kind of
>> issue.  Maybe the approach is simply, as before, to be aware of this
>> kind of problem, and roll the server before you roll clients.
>>
>> You could also imagine some kind of dynamic exchange and validation of
>> schema that could detect this problem in advance, but since there's no
>> schema compatibility validator at present, I imagine no one is doing
>> that...
>>
>> y
>>
>> On Tue, May 14, 2019 at 5:12 PM Kenton Varda <[email protected]> wrote:
>> >
>> > Hi Yaron,
>> >
>> > Ian already answered the question, but I thought I'd add:
>> >
>> > For protocols that are published publicly and used by arbitrary parties 
>> > that you don't control, retroactive unionization may indeed be too unsafe 
>> > to really use.
>> >
>> > Many protocols, though, are used privately between components of a system. 
>> > In this case, forwards- and backwards-compatibility may be important in 
>> > order to allow components to be updated independently, but compatibility 
>> > only needs to extend between all components that are currently in 
>> > production. In that case, it's quite common to do something like:
>> >
>> > 1) Retroactively unionize a field, but don't actually use the new variant 
>> > yet.
>> > 2) Update each component that receives messages of the modified type, so 
>> > that they are aware of the union.
>> > 3) Now, start setting the new variant where desired.
>> >
>> > -Kenton
>> >
>> > On Tue, May 14, 2019 at 1:06 PM Yaron Minsky <[email protected]> 
>> > wrote:
>> >>
>> >> Retroactive unionization is only backwards compatible, not forward
>> >> compatible, right?  So, if I start with this struct:
>> >>
>> >>     struct Person {
>> >>       name @0 :Text;
>> >>       email @1 :Text;
>> >>     }
>> >>
>> >> And decide that I want to evolve it to this one:
>> >>
>> >>     struct Person {
>> >>       name @0 :Text;
>> >>       union {
>> >>         email @1 :Text;
>> >>         age @2 :Float64;
>> >>       }
>> >>     }
>> >>
>> >> (I know it's not a very meaningful example).
>> >>
>> >> If I write something in the new spec that uses the age branch of the
>> >> union, the old struct can try to read it, and get very confused.  In
>> >> particular, if someone tries to read the email for a struct that
>> >> actually populates age, they'll end up reading a Float64 as if it were
>> >> a pointer to a text block.
>> >>
>> >> Am I understanding the issue correctly? If so, how do people handle
>> >> these kinds of protocol changes?  Do people use retroactive
>> >> unionization in practice?  Do people use schema validation of some
>> >> kind to detect when someone makes a potentially unsafe change like
>> >> this one?
>> >>
>> >> y
>> >>
>> >> --
>> >> You received this message because you are subscribed to the Google Groups 
>> >> "Cap'n Proto" group.
>> >> To unsubscribe from this group and stop receiving emails from it, send an 
>> >> email to [email protected].
>> >> Visit this group at https://groups.google.com/group/capnproto.
>> >> To view this discussion on the web visit 
>> >> https://groups.google.com/d/msgid/capnproto/CACLX4jRucxh5%2BmrvkkbcvTgmEbxeAcY%2BEJa4XKw5y_-DZGHorQ%40mail.gmail.com.

