Re: [swift-evolution] Enums and Source Compatibility

David Sweeris via swift-evolution Wed, 09 Aug 2017 11:55:32 -0700

> On Aug 9, 2017, at 11:04, Matthew Johnson <[email protected]> wrote:
> 
>> On Aug 9, 2017, at 12:15 PM, Tony Allevato via swift-evolution 
>> <[email protected]> wrote:
>> 
>>> On Wed, Aug 9, 2017 at 9:40 AM David Sweeris via swift-evolution 
>>> <[email protected]> wrote:
>>> (Now with more mailing lists in the "to" field!)
>>>> On Aug 8, 2017, at 3:27 PM, Jordan Rose via swift-evolution 
>>>> <[email protected]> wrote:
>>>> 
>>>> Hi, everyone. Now that Swift 5 is starting up, I'd like to circle back to 
>>>> an issue that's been around for a while: the source compatibility of 
>>>> enums. Today, it's an error to switch over an enum without handling all 
>>>> the cases, but this breaks down in a number of ways:
>>>> 
>>>> - A C enum may have "private cases" that aren't defined inside the 
>>>> original enum declaration, and there's no way to detect these in a switch 
>>>> without dropping down to the rawValue.
>>>> - For the same reason, the compiler-synthesized 'init(rawValue:)' on an 
>>>> imported enum never produces 'nil', because who knows how anyone's using C 
>>>> enums anyway?
>>>> - Adding a new case to a Swift enum in a library breaks any client code 
>>>> that was trying to switch over it.
>>>> 
>>>> (This list might sound familiar, and that's because it's from a message of 
>>>> mine on a thread started by Matthew Johnson back in February called 
>>>> "[Pitch] consistent public access modifiers". Most of the rest of this 
>>>> email is going to go the same way, because we still need to make progress 
>>>> here.)
>>>> 
>>>> At the same time, we really like our exhaustive switches, especially over 
>>>> enums we define ourselves. And there's a performance side to this whole 
>>>> thing too; if all cases of an enum are known, it can be passed around much 
>>>> more efficiently than if it might suddenly grow a new case containing a 
>>>> struct with 5000 Strings in it.
>>>> 
>>>> 
>>>> Behavior
>>>> 
>>>> I think there's certain behavior that is probably not terribly 
>>>> controversial:
>>>> 
>>>> - When enums are imported from Apple frameworks, they should always 
>>>> require a default case, except for a few exceptions like NSRectEdge. (It's 
>>>> Apple's job to handle this and get it right, but if we get it wrong with 
>>>> an imported enum there's still the workaround of dropping down to the raw 
>>>> value.)
>>>> - When I define Swift enums in the current framework, there's obviously no 
>>>> compatibility issues; we should allow exhaustive switches.
>>>> 
>>>> Everything else falls somewhere in the middle, both for enums defined in 
>>>> Objective-C:
>>>> 
>>>> - If I define an Objective-C enum in the current framework, should it 
>>>> allow exhaustive switching, because there are no compatibility issues, or 
>>>> not, because there could still be private cases defined in a .m file?
>>>> - If there's an Objective-C enum in another framework (that I built 
>>>> locally with Xcode, Carthage, CocoaPods, SwiftPM, etc.), should it allow 
>>>> exhaustive switching, because there are no binary compatibility issues, or 
>>>> not, because there may be source compatibility issues? We'd really like 
>>>> adding a new enum case to not be a breaking change even at the source 
>>>> level.
>>>> - If there's an Objective-C enum coming in through a bridging header, 
>>>> should it allow exhaustive switching, because I might have defined it 
>>>> myself, or not, because it might be non-modular content I've used the 
>>>> bridging header to import?
>>>> 
>>>> And in Swift:
>>>> 
>>>> - If there's a Swift enum in another framework I built locally, should it 
>>>> allow exhaustive switching, because there are no binary compatibility 
>>>> issues, or not, because there may be source compatibility issues? Again, 
>>>> we'd really like adding a new enum case to not be a breaking change even 
>>>> at the source level.
>>>> 
>>>> Let's now flip this to the other side of the equation. I've been talking 
>>>> about us disallowing exhaustive switching, i.e. "if the enum might grow 
>>>> new cases you must have a 'default' in a switch". In previous (in-person) 
>>>> discussions about this feature, it's been pointed out that the code in an 
>>>> otherwise-fully-covered switch is, by definition, unreachable, and 
>>>> therefore untestable. This also isn't a desirable situation to be in, but 
>>>> it's mitigated somewhat by the fact that there probably aren't many 
>>>> framework enums you should exhaustively switch over anyway. (Think about 
>>>> Apple's frameworks again.) I don't have a great answer, though.
>>>> 
>>>> For people who like exhaustive switches, we thought about adding a new 
>>>> kind of 'default'—let's call it 'unknownCase' just to be able to talk 
>>>> about it. This lets you get warnings when you update to a new SDK, but is 
>>>> even more likely to be untested code. We didn't think this was worth the 
>>>> complexity.
>>>> 
>>>> 
>>>> Terminology
>>>> 
>>>> The "Library Evolution" doc (mostly written by me) originally called these 
>>>> "open" and "closed" enums ("requires a default" and "allows exhaustive 
>>>> switching", respectively), but this predated the use of 'open' to describe 
>>>> classes and class members. Matthew's original thread did suggest using 
>>>> 'open' for enums as well, but I argued against that, for a few reasons:
>>>> 
>>>> - For classes, "open" and "non-open" restrict what the client can do. For 
>>>> enums, it's more about providing the client with additional guarantees—and 
>>>> "non-open" is the one with more guarantees.
>>>> - The "safe" default is backwards: a merely-public class can be made 
>>>> 'open', while an 'open' class cannot be made non-open. Conversely, an 
>>>> "open" enum can be made "closed" (making default cases unnecessary), but a 
>>>> "closed" enum cannot be made "open".
>>>> 
>>>> That said, Clang now has an 'enum_extensibility' attribute that does take 
>>>> 'open' or 'closed' as an argument.
>>>> 
>>>> On Matthew's thread, a few other possible names came up, though mostly 
>>>> only for the "closed" case:
>>>> 
>>>> - 'final': has the right meaning abstractly, but again it behaves 
>>>> differently than 'final' on a class, which is a restriction on code 
>>>> elsewhere in the same module.
>>>> - 'locked': reasonable, but not a standard term, and could get confused 
>>>> with the concurrency concept
>>>> - 'exhaustive': matches how we've been explaining it (with an "exhaustive 
>>>> switch"), but it's not exactly the enum that's exhaustive, and it's a long 
>>>> keyword to actually write in source.
>>>> 
>>>> - 'extensible': matches the Clang attribute, but also long
>>>> 
>>>> 
>>>> I don't have better names than "open" and "closed", so I'll continue using 
>>>> them below even though I avoided them above. But I would really like to 
>>>> find some.
>>>> 
>>>> 
>>>> Proposal
>>>> 
>>>> Just to have something to work off of, I propose the following:
>>>> 
>>>> 1. All enums (NS_ENUMs) imported from Objective-C are "open" unless they 
>>>> are declared "non-open" in some way (likely using the enum_extensibility 
>>>> attribute mentioned above).
>>>> 2. All public Swift enums in modules compiled "with resilience" (still to 
>>>> be designed) have the option to be either "open" or "closed". This only 
>>>> applies to libraries not distributed with an app, where binary 
>>>> compatibility is a concern.
>>>> 3. All public Swift enums in modules compiled from source have the option 
>>>> to be either "open" or "closed".
>>>> 4. In Swift 5 mode, a public enum should be required to declare if it is 
>>>> "open" or "closed", so that it's a conscious decision on the part of the 
>>>> library author. (I'm assuming we'll have a "Swift 4 compatibility mode" 
>>>> next year that would leave unannotated enums as "closed".)
>>>> 5. None of this affects non-public enums.
>>>> 
>>>> (4) is the controversial one, I expect. "Open" enums are by far the common 
>>>> case in Apple's frameworks, but that may be less true in Swift.
>>>> 
>>>> 
>>>> Why now?
>>>> 
>>>> Source compatibility was a big issue in Swift 4, and will continue to be 
>>>> an important requirement going into Swift 5. But this also has an impact 
>>>> on the ABI: if an enum is "closed", it can be accessed more efficiently by 
>>>> a client. We don't have to do this before ABI stability—we could access 
>>>> all enums the slow way if the library cares about binary compatibility, 
>>>> and add another attribute for this distinction later—but it would be nice™ 
>>>> (an easy model for developers to understand) if "open" vs. "closed" was 
>>>> also the primary distinction between "indirect access" vs. "direct access".
>>>> 
>>>> I've written quite enough at this point. Looking forward to feedback!
>>> 
>>> How does this compare with the other idea (I can't remember who posted it) 
>>> of allowing enum "subtyping"?
>>> enum Foo {
>>>   case one
>>>   case two
>>> }
>>> enum Bar : Foo {
>>>   // implicitly has Foo's cases, too
>>>   case three
>>> }
>>> 
>>> That way, if you switch over a `Foo`, you'll only ever have two cases to 
>>> worry about. Code that needs to handle all three cases would need to switch 
>>> over a `Bar`, but could also switch over a `Foo` since its cases are a 
>>> subset of Bar's cases.
>> 
>> It's worth noting here that Foo is a subtype of Bar, not the other way 
>> around (which is implied by the syntax), because while it is the case that 
>> every instance of Foo is also a Bar, not every instance of Bar is also a Foo.
>> 
>> So, the interesting thing about enums is that if you allow this kind of 
>> syntax, it means they can retroactively gain *supertypes*; I don't know 
>> enough about type theory to know whether that would be a problem or not. 
>> (Maybe it's not much different than retroactive protocol conformance?)
>> 
>> Something like this definitely feels useful for cleanly migrating users away 
>> from an old enum to a new one, but we may still struggle with some of the 
>> classic covariance problems:
>> 
>> enum Foo {
>>   case one
>>   case two
>> }
>> // I'm not recommending this syntax, just writing it differently to avoid 
>> // the subtyping confusion stemming from overloading the colon
>> enum NewFoo including Foo {
>>   case three
>> }
> 
> I agree with your observations regarding syntax that matches class 
> inheritance or protocol conformance.  The syntax I have played with in the 
> past looks like this:
> 
> enum NewFoo {
>   cases Foo
>   case three
> }
> 
> This syntax has the advantage of placing all case declarations side by side, 
> including the embedded cases.


Yeah, that's probably better... Should there be a colon, "cases: Foo", to make 
it a bit harder to typo your way into adding a ton of cases?


> It is also very similar to the closest workaround we have today (although 
> without a formal subtype relationship):
> 
> enum NewFoo {
>   case foo(Foo)
>   case three
> 
>   // also a static var or func for each case of Foo used to create values
> }

Yeah, I was thinking about this earlier. "Extending" enums could actually be 
implemented like this behind the scenes (with tons of sugar to flatten out 
switches), but I think it'd be better to go ahead and synthesize the whole 
thing, so that we go into IRGen with one giant switch instead of a bunch of 
nested switches (I know compilers sometimes split switches up as an 
optimization, but the way they break one up might not be remotely close to 
how the source code breaks it up).
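
For concreteness, here's roughly what the wrapping workaround looks like in 
today's Swift, including the nested patterns that the sugar would have to 
flatten. This is only a sketch: `describe` and the static members are names 
I made up for illustration, and nothing here is proposed syntax.

```swift
// The original enum, as in the examples above.
enum Foo {
    case one
    case two
}

// The workaround: wrap Foo in a payload case instead of "including" it.
enum NewFoo {
    case foo(Foo)
    case three

    // Hypothetical convenience members, so call sites can write `.one`
    // and `.two` as if those cases were declared directly on NewFoo.
    static let one = NewFoo.foo(.one)
    static let two = NewFoo.foo(.two)
}

// Switching over the wrapper needs a nested pattern for every wrapped case.
// A flattened version would let these read `case .one:` and `case .two:`.
func describe(_ value: NewFoo) -> String {
    switch value {
    case .foo(.one): return "one"
    case .foo(.two): return "two"
    case .three: return "three"
    }
}
```

With that in place, `describe(.one)` and `describe(.foo(.one))` go through 
the same branch, which is the "no formal subtype relationship, but close" 
part: a `Foo` still has to be wrapped by hand before it can be passed where 
a `NewFoo` is expected.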

- Dave Sweeris

_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
