Hi Jens,
Thanks for the detailed reply. You bring up some good points. On the blog
posting difficulty, I'm talking to someone else on our team about whether we
can make commenting on the blog easier. Sorry for the speedbump there. If we
can fix it and you're willing to try reposting the comment there, I'll let you
know if / when that's easier to do.
Re: the idea of upgrading the consumer (the civil defense system) before the
producer (the ocean sensor), I agree that's a workable approach (and one of the
alternatives stated in the original post). But it does require detecting as
part of your deployment process that this kind of backward-incompatible change
has been made and choosing a more restricted deployment process than otherwise
(if you're making backwards-compatible changes, you retain more flexibility in
your deployment.) And that kind of detection of an incompatible change --
particularly to enums -- is not easy to do in an automated way (and we're big
believers in CI and automated code pushes).
Re: the server having more intelligent handling, please let me know what you
think of the detailed explanation below for why this may not be straightforward
(tl;dr: the decoding of interest is happening in generated code over which a
developer has little control).
I think soft versioning is a great feature, and the alternative you suggested
of using a struct with optional fields is another alternative to the ones in
the blog post. I'd propose it's similar to the idea of constants in a namespace
mentioned in the original post -- you're indicating that there are at least
these options but that more may be added later. It might also be more closely
modeled as a union of booleans than a struct of integers -- the intention is to
actually represent the type of a single sighting and not the number of
occurrences of a monster. (These enumerated types would likely be embedded in a
message that includes information about the count of occurrences.) The problem
specifically with enums is that part of the semantics of an enum is that the
/only/ values that are allowed are those in specified in the enum /and/ that
one and only one of them is set for any (required) instance of an enum (again,
closer to a union of boolean fields than a struct of optional fields in which
all could be set simultaneously). You can give up this semantic guarantee, but
you'd be asking for something different than an enum, and enums actually do
have a lot of benefits when it comes to client programming (precisely because
you can rely on those guarantees of at most one value being set and it being
drawn from the small and finite set of valid values that you know about.)
So I definitely agree that there are workarounds if you're willing to give up
some of the very specific semantic guarantees of an enum. But the discussion
I'm hoping to encourage is that /if/ you decide to use an enum for its
convenient semantics at a given point in time, that may have unanticipated
consequences when it comes to versioning and deployment and your team should at
least consider those potential consequences.
I'm pretty sure you can see this when looking at generated code for a struct
containing a required enum field, for example:
/**
* V1
*/
enum MyRecordType {
TYPE_1 = 0,
TYPE_2 = 1
}
struct MyRecord {
1: required MyRecordType recordType,
2: optional i32 occurrences
}
Very specifically using the Java 0.9 compiler, the generated IScheme for
MyRecord will have a read() method that looks in part like this:
case 1: // TYPE
if (schemeField.type == org.apache.thrift.protocol.TType.I32) {
struct.recordType = MyRecordType.findByValue(iprot.readI32());
struct.setRecordTypeIsSet(true);
} else {
org.apache.thrift.protocol.TProtocolUtil.skip(iprot,
schemeField.type);
}
break;
If you look at the generated code for MyRecordType.findByValue, you'll find a
switch with a bunch of cases and then a default clause that returns null.
So if MyRecordType evolves:
/**
* V2
*/
enum MyRecordType {
TYPE_1 = 0,
TYPE_2 = 1,
TYPE_3 = 2
}
And a newer producer writes a record with that new enum value (TYPE_3) read by
an older client, then the older client will invoke its default clause in its
findByValue method and return null. Popping back up to the read() method in the
generated IScheme, I'm pretty sure that will result in a null value being
assigned to struct.recordType while /at the same time/ setting
struct.setRecordTypeIsSet to true. In other words, you'd wind up with a
required field that is set according to isSetRecordType is set but whose actual
value is null. I'm not saying this is a wrong choice necessarily (though I'm
pretty sure the similar situation with protobufs will throw an exception
instead). I'm saying that there is no well-defined way to decide what that
value should be in the older client. This is a backwards-incompatible change as
a result. There's no way for thrift generically to decide what to do in this
case. (Should it really coerce the GODZILLA type to the YETI type behind the
scenes?) And because the decoding is happening in generated code, developers
have limited if any control, so I think the suggestion you make about the
server handling it better is a good one that may not be available without
(shudder) modifying generated code.
-----
From: "Jens Geyer" <[email protected]>
To: <[email protected]>
References: <[email protected]>
In-Reply-To: <[email protected]>
Subject: Re: Blog post on schema evolution and enums
Date: Wed, 13 Nov 2013 21:18:05 +0100
Hi Josh,
Sorry that I do not reply at your post, but the page asked me to log in
before I could comment, which I refused to do.
I'm not sure if I agree with you. Soft versioning is a feature, IMHO. The
picture you draw there could be drawn in a similar way, essentially to the
same effect. Let's assume the following struct, which could transport
multiple events at once (notice the
struct DangerousCreaturesEvent {
1: optional i32 FRANKENSTEINS_MONSTERs
2: optional i32 YETIs
3: optional i32 LOCH_NESS_MONSTERs
}
Now I release a new, upgraded version of my ocean sensors, which can also
report Godzillas
struct DangerousCreaturesEvent {
1: optional i32 FRANKENSTEINS_MONSTERs
2: optional i32 YETIs
3: optional i32 LOCH_NESS_MONSTERs
4: optional i32 GODZILLAs
}
We could also do something similar with functions in the service interface.
So we would have the same problem here and there, right?
Yes, and no. The point is NOT, that we have a different type now. Because
that's fully expected, due to the ability to do soft versioning.
The conclusion would be however, if your Tokyo civil defense system
absolutely must be able to detect Godzillas at the time, the new version
becomes available (the quicker the better, because these Godzillas tend to
hit you when you expect them at least) then you should ship your upgrade to
server in Tokyo first, before you put the updates out to the clients at the
coast and out there in the ocean.
FInally, we have to consider that before you finished the update and shipped
it, nobody was there to report any passing Godzillas either, regardless of
enum, struct or whatever else vehicle we used to transport that information.
So it's not really a technical issue, neither seems it specific to enums. It
is merely an organizational thing. In your final notes you wrote, that the
server should drop messages that he could not handle, or thrown an exception
to notify the client about the incompatibility, I absolutely agree. The
important point is, that one should be aware of the perils of soft
versioning and enable the implementation to deal with it in a well-defined
manner.
$0,02
JensG
On Nov 13, 2013, at 10:15 AM, Josh Rosenblum <[email protected]> wrote:
> Hello Thrift-Users,
>
> I've benefited from a lot of third-part documentation on schema design in
> Thrift and from the work that's been put into supporting forward and backward
> compatibility. One edge case that we at Conductor have run into in a couple
> of places involves enums. Unlike a lot of other incompatibilities, it's
> pretty easy to make incompatible schema changes to enums without any of your
> usual tools helping you; in particular the generated code before and after a
> change to allowable enum values will typically not raise compilation errors.
> I've tried to gather up some of our experiences and possible workarounds that
> reduce the chance of unexpected incompatibilities in a blog post. Would
> appreciate any review and comments / corrections. Please shout out with any
> errors you see.
>
> http://bit.ly/1boyizy
>
> Thanks!
>
> -Josh
>