Re: [swift-evolution] [Review] SE-0155: Normalize Enum Case Representation

Daniel Duan via swift-evolution Sun, 02 Apr 2017 16:31:27 -0700


Daniel Duan


> On Apr 1, 2017, at 11:49 PM, Xiaodi Wu <[email protected]> wrote:
> 
>> On Sun, Apr 2, 2017 at 1:03 AM, Daniel Duan <[email protected]> wrote:
>> 
>>> On Apr 1, 2017, at 2:54 PM, Xiaodi Wu via swift-evolution 
>>> <[email protected]> wrote:
>>> 
>>>> On Sat, Apr 1, 2017 at 3:38 PM, Daniel Duan <[email protected]> wrote:
>>>> Thanks again for a detailed review. I have a few comments inline.
>>>> 
>>>>>> On Apr 1, 2017, at 9:50 AM, Xiaodi Wu via swift-evolution 
>>>>>> <[email protected]> wrote:
>>>>>> 
>>>>>>  • Does this proposal fit well with the feel and direction of Swift?
>>>>> 
>>>>> The "Pattern consistency" section does not align well with the feel and 
>>>>> direction of Swift. Specifically, it does not explore some of the 
>>>>> difficulties that arise from the proposed rules, adopts some of the same 
>>>>> shortcomings that required revision for SE-0111, and deviates from some 
>>>>> of the anticipated fixes for those shortcomings outlined in the core 
>>>>> team's "update and commentary" to SE-0111.
>>>>> 
>>>>> It is not the case that the design proposed is "a consequence of no 
>>>>> longer relying on tuple patterns," in that it is not the inevitable 
>>>>> result that falls out of that decision. 
>>>> 
>>>> The text in this revision may be poorly phrased. The connection, as I 
>>>> pointed out in a previous thread, is that we need to define syntax for 
>>>> enum pattern matching because the one we’ve been using in Swift 3 is the 
>>>> tuple pattern’s syntax, which is now distinct and separate.
>>> 
>>> What I'm saying here is that, although _some_ change becomes necessary, the 
>>> particular changes proposed here are not themselves "a consequence of no 
>>> longer relying on tuple patterns."
>>> 
>>> Put another way, given `enum E { case foo(bar: Int, baz: Int) }`, not being 
>>> allowed to write `switch e { case foo(let a, let b): break }` is *not* an 
>>> inevitable consequence of moving away from tuple patterns. Since the 
>>> particular proposed changes break more existing source code than is 
>>> strictly necessary for moving away from tuple-based pattern matching, those 
>>> choices require stringent justification.
>>> 
>>>>> I will detail the alternative design that requires the fewest deviations 
>>>>> or special rules, and breaks the least code extant today, later on. 
>>>>> First, the shortcomings:
>>>>> 
>>>>> 1.
>>>>> The proposed rules for pattern matching are a source-breaking change, and 
>>>>> are *not* the most minimal such change given the abandoning of tuples 
>>>>> (see alternative below). However, the proposal does not engage with the 
>>>>> core team's Swift 4 criteria for source-breaking changes with respect to 
>>>>> the proposed "stricter rules" for pattern matching. There is no text at 
>>>>> all about why specifically having the compiler encourage local _variable_ 
>>>>> names to match argument labels resolves an active harm that outweighs the 
>>>>> goal of preserving the greatest possible source compatibility.
>>>> 
>>>> With this proposal, users can still use local variable names. It is true 
>>>> that if there are many ways to achieve the same thing, the compiler would 
>>>> be encouraging users to do that thing. But that puts a cost on the 
>>>> compiler, new users and experienced readers in unfamiliar codebases. This 
>>>> is (albeit not to a satisfactory degree, it seems) pointed out in the 
>>>> motivation section. 
>>>> 
>>>> As for source compatibility, Swift 3 code should continue to work with 
>>>> warnings. Swift 4 mode would issue errors along with fix-its, which the 
>>>> migrator can leverage. Depending on the core team/community’s implementor 
>>>> resources, there’s even a chance that this change would roll out one 
>>>> version later (warning in 4.X, error in 5.Y). In theory, the migration 
>>>> hurdle can be minimized.
>>> 
>>> Many syntactic changes can be migrated in this way, but for Swift 4, that 
>>> would only be justified when the existing syntax meets a high bar for being 
>>> harmful. Again, the overarching theme of my response is that I don't think 
>>> the proposed "stricter rules" offer much more harm mitigation than 
>>> significantly less source-breaking designs for pattern matching, and I 
>>> don't see anything in the proposal text that discusses the issue or 
>>> justifies the particular design over less source-breaking alternatives.
>>> 
>>>>> 
>>>>> OTOH, the proposal does outline a major use case for a local variable 
>>>>> name that does not match the argument label: `param` vs `parameter`. 
>>>>> Widely-respected style guides in various languages encourage 
>>>>> unabbreviated and descriptive API names but much more concise local 
>>>>> variable names. This is a legitimate and good practice being actively 
>>>>> discouraged by the sugared rules.
>>>> 
>>>> This is not a counterpoint, but I personally think using shortened names is 
>>>> not something to be encouraged. An (admittedly quirky) practice some of us 
>>>> inherited from the Cocoa style guideline is to use real, complete words 
>>>> for variable names. I’d like to think that the Swift API Design Guidelines 
>>>> are aligned in spirit on this matter - “clarity is more important than 
>>>> brevity”. (Incidentally, the guidelines’ code samples don’t contain 
>>>> partial-word variables anywhere.)
>>> 
>>> We're talking _local_ variables: local variables aren't API. There are 
>>> many, many examples of single-letter variables in the design guidelines. 
>>> For example, `x = y.union(z)` has three of them.
>>> 
>>>> 
>>>>> 
>>>>> This would be merely annoying and not harmful if we could guarantee that 
>>>>> it only means the API user will have to use longer local names, but the 
>>>>> natural impulse on the part of thoughtful API authors would be to limit 
>>>>> the expressiveness of their labels to help out their users.
>>>>> 
>>>>> This puts API authors in an impossible bind: they need to choose labels 
>>>>> that are not too short lest it collide frequently with existing local 
>>>>> variable names (`x` and `y` would be suboptimal, for example, but there 
>>>>> are good reasons why an associated value might have arguments labeled `x` 
>>>>> and `y`), 
>>>> 
>>>> API authors are already in this impossible bind: whenever they export a 
>>>> type name, a method signature in an open class or a protocol, the risk of 
>>>> collision comes up.
>>> 
>>> Again, local variables aren't API. API authors have never been in this bind 
>>> with respect to local variables. Nothing in the language has ever caused 
>>> API to restrict the consumer's choice of local variable names. I think this 
>>> is a highly, highly unusual rule.
>>>  
>> 
>> The local variable would be the same as the argument label, which is API, 
>> correct? I’m saying users’ local variables and types can collide with 
>> symbols from APIs. To illustrate, imagine implementing a protocol (yes, as 
>> simple as that):
>> 
>> protocol A {
>>     var answer: Int { get }
>>     func ask(_ question: String)
>> }
>> 
>> if blah() {
>>     let answer = 0
>>     let question = "huh?"
>> 
>>     class B: A {
>>         let answer = 42
>>         func ask(_ question: String) {
>>             // what's question and answer here?
>>             // what if you want to define a new type here with the name “A”?
>>         }
>>     }
>> }
>> 
>> `A` forced the user to shadow their local variable (a collision!), so it’s 
>> wise for the user to pick some other variable name here. Why does this seem 
>> so trivial and natural? Because it’s how APIs work: someone defines some 
>> symbols, you take them into your local scope and use them. The pattern 
>> matching rule proposed here is no different.
>> 
>>>> When a local variable does collide with a payload label, it would be bad 
>>>> if the user accidentally used the variable _instead of_ the actual 
>>>> payload value. Forcing users to proactively rebind the variable would make 
>>>> them more mindful of this type of mistake.
>>> 
>>> What mistake do you have in mind? Currently, labels have nothing to do with 
>>> variable names. How does a user accidentally use a label name instead of a 
>>> variable name?
>> 
>> Looking at the definition of an enum, the user sees something like
>> 
>> enum SomeEnum {
>>     // substitute “veryMundaneName” with a common label, like “value” or
>>     // “account”
>>     case aCase(veryMundaneName: Type)
>> }
>> 
>> What’s the value of `veryMundaneName` in a pattern-matched block for 
>> `aCase`? The answer in Swift 3 is: no one knows! The user may use this 
>> variable expecting it to be bound to the associated value, because that’s 
>> natural given the context, and later find out that they’ve been using a 
>> variable from outside because the associated value is bound to something 
>> completely unrelated. 
>> 
>> 
>> Example: 
>> 
>> switch enumValue {
>> case aCase….:
>>   // many lines of code later…
>>   doThings(with: veryMundaneName) // bug!
>> }
>> 
>> Turns out, the bug is due to
>> 
>> let veryMundaneName: AType = getAMundaneValue()
>> // many lines of code later
>> switch enumValue {
>> case aCase(let randomLabelFreedomYay):
>>   // many lines of code later
>>   doThings(with: veryMundaneName) // bug!
>> }
>> 
>> This mistake seems silly, and is still a problem in the case of rebinding. 
>> But we can make it happen less.
> 
> I am not convinced this is an illustration of a bug related to enum cases in 
> any real sense. You are invoking a function with one variable when you meant 
> to invoke it with another. This can happen with any two variables in any 
> scenario. I see no evidence that argument labels are any more prone to be 
> confused for variable names than are case names, function names, or any other 
> name. It is that Swift is strongly typed that makes confusion happen less, 
> given that `veryMundaneName` is of type `AType` and `randomLabelFreedomYay` 
> is of type `Type`.
>  
>>>> 
>>>>> but they also need to choose labels that are not too verbose. The safest 
>>>>> bet in this case would be not to label at all, but then they lose the 
>>>>> communicative aspect of argument labels (see point 2 below).
>>>> 
>>>> A more realistic version of the story: API authors choose labels that make 
>>>> the most sense for the declaration and users accept the risk of collision 
>>>> as they use the API. Most of those who choose to skip labels would not 
>>>> have given this much thought to their effect at all.
>>>> 
>>>>> 
>>>>> 2.
>>>>> In the "update and commentary" revising SE-0111, it was acknowledged that 
>>>>> "cosmetic" labels have a significant use case. Thus, the rules were 
>>>>> changed to allow `(_ foo: Int, _ bar: Int) -> ()` to communicate to the 
>>>>> reader of code that the first argument serves some purpose "foo" without 
>>>>> forcing that name to be part of the API, pending further revisions.
>>>>> 
>>>>> Because enum cases are currently tuples, labels can be dropped freely, 
>>>>> and therefore these labels are effectively "optional" parts of the API 
>>>>> that can be seen by the user but, at their discretion, not used. That 
>>>>> fulfills the use case of "cosmetic" labels. In this revised proposal, by 
>>>>> requiring the argument label to be actually _written_ somewhere by the 
>>>>> API user, it puts a dent into the legitimate use case of "cosmetic" 
>>>>> labels.
>>>>> 
>>>>> That is to say, an API author who wishes to communicate something about a 
>>>>> parameter by using a label must now also consider if that label is also 
>>>>> appropriate as a variable name and must forgo its use if the label is not 
>>>>> so appropriate. This is a very different decision-making process and it 
>>>>> is being applied retroactively to previously designed APIs whose labels 
>>>>> would have been (hopefully thoughtfully) chosen under very different 
>>>>> circumstances.
>>>> 
>>>> This is something we never agreed on: SE-0111 is about functions. In some 
>>>> languages, patterns do resemble constructor functions, but that’s as 
>>>> much similarity as one can get anywhere. I still think applying every 
>>>> decision we made about functions to pattern matching is weird.
>>> 
>>> I have to admit, I still don't understand your reticence. The first part of 
>>> your proposal aligns enum cases with functions. If we are to look for 
>>> patterns in something that is spelled like a function, then it is natural 
>>> for the pattern itself to be spelled like a function, no? Currently, in 
>>> Swift 3, since we're trying to use pattern matching for a tuple, the 
>>> pattern is spelled like a tuple. In my simplistic mind, if we're trying to 
>>> use pattern matching for a $foo, the pattern should be spelled like a $foo. 
>>> Far from being weird, to me that is the only possible intuitive syntax.
>>> 
>>>> But here’s my analysis anyways: the “cosmetic label” comment is about 
>>>> paving a way to restore expressivity of closures. It talks about the 
>>>> *interaction* between a function/closure’s declaration and use site — if 
>>>> parameter names are provided in a closure’s declaration, they should be 
>>>> required at invocation, similar to pre-SE-0111. IMO this proposal makes 
>>>> enum case and patterns closer to this goal.
>>> 
>>> I agree that your proposal does indeed get us closer to SE-0111. By 
>>> requiring argument labels chosen by the API author to be written out by the 
>>> user, we get closer to the goals of SE-0111. But SE-0111 also had a large 
>>> drawback that required post-approval modification, which was that there 
>>> ended up being no way to write "cosmetic labels," which both the community 
>>> and core team agreed was an important use case.
>>> 
>>> With functions, that role can be filled with internal parameter names. This 
>>> is what the "update and commentary" restored to SE-0111. With tuples, that 
>>> role is filled by the labels themselves, because they can be ergonomically 
>>> erased. With enum cases, you have not provided a parallel facility for 
>>> cosmetic labels, because in your proposal labels can no longer be easily 
>>> erased, but nor are there internal parameter names or some other 
>>> substitute. I'm saying that we should learn from the problems discovered 
>>> after SE-0111 was approved and fix that shortcoming for enum cases before 
>>> this proposal is adopted. 
>> 
>> (Link to what we are talking about for the benefit of those reading along: 
>> https://lists.swift.org/pipermail/swift-evolution-announce/2016-July/000233.html)
>> 
>> The key distinction we need to decide here is whether case labels are 
>> “cosmetic”. We don’t allow declaration of separate parameter name and 
>> internal name for associated values. I interpret that as we are enforcing 
>> the syntax sugar in function declaration where user can use one symbol to 
>> represent both:
>> 
>> func f(x: Int) // is the same as func f(x x: Int)
>> 
>> It’s tempting to treat matching an enum value against a pattern as assigning 
>> a function value to a variable.
> 
> Sorry, I am not sure I understand this sentence.

Aka viewing the case pattern as simply a compound variable assignment, as 
envisioned in the SE-0111 commentary. This way the labels would be 
"cosmetic".

>  
>> If that’s what we are doing, it makes perfect sense to say we get “ultimate 
>> glory” here with patterns. Meaning, as you suggested, we consider the case 
>> labels “cosmetic”. It’s really just the parameter name in a function (the 
>> first of the two “x” in the code comment above).
>> 
>> But that’s kind of a stretch, isn’t it? An enum value is very different 
>> from a function value. Yes, there happens to be a function that constructs 
>> this enum value, declared when the user declares a case; that function gets 
>> as much resemblance as any other. But the enum value itself deserves more 
>> consideration. Telling the user “do these things that you do with a 
>> function value” just makes pattern matching harder to explain, because we 
>> are *not* assigning nor invoking function values.
> 
> Ah, I see. You think of the associated value as something distinct from the 
> declaration used to initialize it. However, there is no spelling for an 
> associated value other than what is used to initialize it. Given `case 
> foo(bar: Int, baz: Int, boo: Int)`, previously, the full name of the case was 
> `foo` and the associated value was `(bar: Int, baz: Int, boo: Int)`. Your 
> proposal causes the full name of the case instead to be `foo(bar:baz:boo:)` 
> and the associated value to be `(Int, Int, Int)`. Is that not your 
> understanding of it?

Yes
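
For those reading along, here is a minimal sketch of that reading in code. 
The enum `E` and the names `makeFoo` and `total` are made up for illustration, 
and the snippet assumes a compiler that already treats the labels as part of 
the case name; it is not taken from the proposal text.

```swift
// Hypothetical enum mirroring the case discussed above.
enum E {
    case foo(bar: Int, baz: Int, boo: Int)
}

// Referring to the case by its compound name yields an ordinary function
// whose parameters are the unlabeled payload types.
let makeFoo: (Int, Int, Int) -> E = E.foo(bar:baz:boo:)

// Hypothetical helper: sums the payload, writing the labels in the pattern.
func total(_ e: E) -> Int {
    switch e {
    case .foo(bar: let a, baz: let b, boo: let c):
        return a + b + c
    }
}

let value = makeFoo(1, 2, 3)
print(total(value)) // prints 6
```

The labels show up only in the name `foo(bar:baz:boo:)`, while the function 
type carries just `(Int, Int, Int)`.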

> Pattern matching is just a matter of (a) indicating what case you want to 
> identify with the pattern; and (b) what parts of the associated value you 
> wish to match or to bind to variables. Part (a) is done by writing the name, 
> either the base name or in full (i.e. either `foo` or `foo(bar:baz:boo:)`). 
> Part (b) is done by writing `let myVariableName` in the intended positions.

What I left out is that the internal/parameter names of a function are a 
non-optional part of its signature (one must use exact parameter names to 
implement a method in a protocol, for example). I prefer treating labels in 
case pattern matching the same way we treat parameter names in protocol method 
implementation (due to the symmetry between constructing/deconstructing body 
mentioned in my previous comments).
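
A rough sketch of the style this preference leads to, reusing `SomeEnum` from 
my earlier example with `Int` standing in for `Type` (an assumption), and with 
`payload(of:)` and `doubled(_:)` as hypothetical helpers. Both spellings are 
accepted by compilers today, so this only illustrates the shape of the 
patterns, not the enforcement of the stricter rule.

```swift
// Hypothetical enum from the earlier example; `Int` stands in for `Type`.
enum SomeEnum {
    case aCase(veryMundaneName: Int)
}

// Binding under the label's own name, the way a function body refers to a
// parameter by the name given in its signature.
func payload(of value: SomeEnum) -> Int {
    switch value {
    case .aCase(veryMundaneName: let veryMundaneName):
        return veryMundaneName
    }
}

// Rebinding to a different local name is spelled out next to the label.
func doubled(_ value: SomeEnum) -> Int {
    switch value {
    case .aCase(veryMundaneName: let n):
        return n * 2
    }
}

print(payload(of: .aCase(veryMundaneName: 21))) // prints 21
print(doubled(.aCase(veryMundaneName: 21)))     // prints 42
```

Either way the label is visible at the point of binding, so a reader can tell 
which payload a local name refers to.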

>> That’s not to say we need totally distinct syntax. Deconstructing a value 
>> should visually relate to constructing it. So here’s how I think these two 
>> relate: a constructor is a function. A function signature has arguments 
>> that the function refers to in its body. Pattern matching is the starting 
>> point of deconstructing a value. The scope created following it is the 
>> equivalent of a “body”, in which the associated values are used as 
>> “arguments”. Therefore it makes sense to say that these labels are more 
>> like internal names (the 2nd “x” in the comment of the above sample).
>> 
>>>>> 3.
>>>>> The first part of the proposal aligns enum case syntax with functions. 
>>>>> Functions often take prepositions as argument labels, and indeed 
>>>>> previous SE proposals have extended the rules to allow most words. 
>>>>> However, `case foo(index: Int, in: T)` would have a disastrous label, as 
>>>>> `in` would be a very annoying variable name whose use would be actively 
>>>>> encouraged by the proposed sugared pattern matching rules.
>>>>> 
>>>>> The proposed rules for the sugared pattern would also require (well, 
>>>>> greatly encourage) unique labels for each argument. This again is 
>>>>> inconsistent with the naming conventions encouraged by the first part of 
>>>>> the proposal aligning enum case syntax with functions, which have no such 
>>>>> restrictions. If a user names something `case foo(point: T, point: T)`, 
>>>>> then the matching rules would actively encourage an invalid redefinition 
>>>>> of a variable named `point`.
>>>>> 
>>>>> (On the other hand, the API author does not have the luxury of naming the 
>>>>> same case `foo(from point: T, to point: T)`, and even if they did, 
>>>>> prepositions can make lousy local variable names--see first paragraph.)
>>>> 
>>>> I don’t see this as a problem for enum case authors. It just means the 
>>>> poor pattern writer needs to provide the positional information to 
>>>> disambiguate.
>>> 
>>>  What do you mean by "positional information" here?
>>> 
>>>>> 4.
>>>>> The proposal does not explore what happens when the proposed prohibition 
>>>>> on "mixing and matching" the proposed sugared and unsugared pattern 
>>>>> matching runs up against associated values that have a mix of labeled and 
>>>>> unlabeled parameters, and pattern matching use cases where the user does 
>>>>> not wish to bind all of the arguments.
>>>>> 
>>>>> Given `case foo(a: Int, String, b: Int, String)`, the only sensible 
>>>>> interpretation of the rules for sugared syntax would allow the user to 
>>>>> choose any name for some but not all of the labels. If the user wishes to 
>>>>> bind only `b`, however, he or she will need to navigate a puzzling set of 
>>>>> rules that are not spelled out in the proposal:
>>>>> 
>>>>> ```
>>>>> case foo(a: _, _, b: let b, _)
>>>>> // this is definitely allowed
>>>>> 
>>>>> case foo(a: _, _, b: let myVar, _)
>>>>> // this is also definitely allowed
>>>>> 
>>>>> // but...
>>>>> case foo(_, _, b: let myVar, _)
>>>>> // is this allowed, or must the user explicitly state and not bind `a`?
>>>>> 
>>>>> // ...and with respect to the sugared version...
>>>>> case foo(_, _, let b, _)
>>>>> // is this allowed, or must the user explicitly state and not bind `a`?
>>>>> ```
>>>>> 
>>>> 
>>>> Good point. To make up for this: `_` can substitute for any sub-pattern, 
>>>> which is something that this proposal doesn’t change but is definitely 
>>>> worth spelling out.  
>>>> 
>>>>> 5.
>>>>> In the "update and commentary" revising SE-0111, the core team outlined a 
>>>>> preferred path to restoring the full use of argument labels for functions 
>>>>> without giving them type system significance. They gave a non-sugared 
>>>>> form and a sugared form, both of which have met with approval from the 
>>>>> community.
>>>>> 
>>>>> Briefly, the non-sugared form allows compound names to be used in 
>>>>> variable names: `func foo(opToUse op(lhs:rhs:) : (Int, Int) -> Int)`. The 
>>>>> first part of this proposal is consistent in that it removes the type 
>>>>> system significance of argument labels from the associated values of enum 
>>>>> cases, and considers them as part of the enum case name. It also stands 
>>>>> to reason that, if a user were to match a case _without_ trying to bind 
>>>>> any variables, the same syntax would have to be used if the base name is 
>>>>> ambiguous: `case elet(locals:body:): break`.
>>>>> 
>>>>> However, the proposal makes no provision for using that same compound 
>>>>> name in pattern matching. There appears to be no particular reason for 
>>>>> its isolated omission here, as `case elet(locals:body:)(let a, let b): 
>>>>> return a * b` is readable and presents no syntactic difficulties. 
>>>>> (Moreover, it is consistent with the syntax permitted in this proposal 
>>>>> for initializing a variable: `let foo = Expr.elet(locals:body:)([], 
>>>>> anExpr)`.)
>>>> 
>>>> Another good point. We can handle this in the purely additional proposal 
>>>> for compound variable names. I consider this not the 5th item in the list, 
>>>> but a separate suggestion, however :P
>>>> 
>>>>> 
>>>>> --- 
>>>>> 
>>>>> In light of these shortcomings, I would argue that the following 
>>>>> alternative scheme is the most intuitive and consistent for pattern 
>>>>> matching given the general agreement that enum case representation should 
>>>>> be "normalized":
>>>>> 
>>>>> Given:
>>>>> 
>>>>> ```
>>>>> enum S {
>>>>>   case foo(bar: Int, baz: Int)
>>>>>   case foo(boo: String)
>>>>>   case bar(boo: String)
>>>>> }
>>>>> ```
>>>>> 
>>>>> a. As in functions after SE-0111, enum cases can be identified 
>>>>> unambiguously, regardless of whether one is initializing a variable or 
>>>>> matching a case, by their compound name, e.g. `bar(boo:)`. Where a case 
>>>>> can be unambiguously identified with only the base name, that is an 
>>>>> alternative spelling, e.g. `bar`. Where a case cannot be identified 
>>>>> uniquely with the base name, then it is an error to try to use the base 
>>>>> name alone: `case foo: break // error: ambiguous`.
>>>>> 
>>>>> b. As in functions after SE-0111, arguments can be passed in either a 
>>>>> sugared form or an unsugared form, and they can be bound in a pattern 
>>>>> matching statement in the same way. That is, `case foo(bar: let a, baz: 
>>>>> let b): break` and `case foo(bar:baz:)(let a, let b): break` are 
>>>>> equivalent.
>>>>> 
>>>>> c. As in functions, one cannot supply different or incorrect argument 
>>>>> labels. That is, `case foo(baz: let a, bar: let b)` and `case 
>>>>> foo(baz:bar:)(let a, let b)` are both forbidden. _This recovers the vast 
>>>>> majority of the additional syntactic safety that is outlined in the 
>>>>> revised proposal, but without the use of any special rules for pattern 
>>>>> matching._
>>>>> 
>>>>> d. By composing rules (a) and (b), `case bar(let a)` is allowed as it is 
>>>>> today, preserving source compatibility. However `case foo(let b, let c)` 
>>>>> is not allowed, and _not_ because different local variable names are 
>>>>> chosen, but because the enum has two cases named foo.
>>>> 
>>>> From a user’s point of view, there’s enough positional information in this 
>>>> pattern for the compiler to figure out which case it should match. This 
>>>> would be very unintuitive IMO.
>>> 
>>> Wait, the key point of your proposal, with its "stricter rules," is that 
>>> labels shouldn't be optional even with sufficient positional information! 
>>> That's also the whole thing above about getting us closer to aligning with 
>>> SE-0111, etc.
>> 
>> Fair enough. The argument I invoked leads us to a dark path :P
>> 
>> 
>>> _______________________________________________
>>> swift-evolution mailing list
>>> [email protected]
>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>> 
>

