Re: [swift-evolution] [Pitch] Synthesized static enum property to iterate over cases

Matthew Johnson via swift-evolution Fri, 08 Sep 2017 14:09:11 -0700

> On Sep 8, 2017, at 12:05 PM, Tony Allevato wrote:
> 
> 
> 
> On Fri, Sep 8, 2017 at 9:44 AM Matthew Johnson wrote:
>> On Sep 8, 2017, at 11:32 AM, Tony Allevato wrote:
>> 
>> 
>> 
>> On Fri, Sep 8, 2017 at 8:35 AM Matthew Johnson wrote:
>>> On Sep 8, 2017, at 9:53 AM, Tony Allevato via swift-evolution wrote:
>>> 
>>> Thanks for bringing this up, Logan! It's something I've been thinking about 
>>> a lot lately after a conversation with some colleagues outside of this 
>>> community. Some of my thoughts:
>>> 
>>> AFAIK, there are two major use cases here: (1) you need the whole 
>>> collection of cases, like in your example, and (2) you just need the number 
>>> of cases. The latter seems to occur somewhat commonly when people want to 
>>> use an enum to define the sections of, say, a UITableView. They just return 
>>> the count from numberOfSections(in:) and then switch over the cases in 
>>> their cell-providing methods.
>>> 
>>> Because of #2, it would be nice to avoid instantiating the collection 
>>> eagerly. (Also because of examples like Jonathan's, where the enum is 
>>> large.) If all the user is ever really doing is iterating over them, 
>>> there's no need to keep the entire collection in memory. This leads us to 
>>> look at Sequence; we could use something like AnySequence to keep the 
>>> current case as our state and a transition function to advance to the next 
>>> one. If a user needs to instantiate the full array from that sequence they 
>>> can do so, but they have to do it explicitly.
>>> 
>>> The catch is that Sequence only provides `underestimatedCount`, rather than 
>>> `count`. Calling the former would be an awkward API (why is it 
>>> underestimated? we know how many cases there are). I suppose we could 
>>> create a concrete wrapper for Sequence (PrecountedSequence?) that provides 
>>> a `count` property to make that cleaner, and then have 
>>> `underestimatedCount` return the same thing if users passed this thing into 
>>> a generic operation constrained over Sequence. (The standard library 
>>> already has support wrappers like EnumeratedSequence, so maybe this is 
>>> appropriate.)
>>> 
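As a rough sketch of the wrapper idea: `PrecountedSequence` is only the name floated above, and its initializer and stored `count` are assumptions made for illustration, not an existing standard library type.

```swift
// Hypothetical wrapper: forwards iteration to a base sequence and
// carries an exact count supplied by whoever constructs it.
struct PrecountedSequence<Base: Sequence>: Sequence {
    private let base: Base

    // Exact number of elements, trusted from the caller.
    let count: Int

    init(_ base: Base, count: Int) {
        self.base = base
        self.count = count
    }

    func makeIterator() -> Base.Iterator {
        return base.makeIterator()
    }

    // Generic code constrained over Sequence sees the same number.
    var underestimatedCount: Int {
        return count
    }
}

let precounted = PrecountedSequence(["a", "b", "c"], count: 3)
```

The wrapper does not verify that `count` matches the base sequence; in a synthesized implementation the compiler would know the number of cases statically.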
>>> Another question that would need to be answered is, how should the cases be 
>>> ordered? Declaration order seems obvious and straightforward, but if you 
>>> have a raw-value enum (say, integers), you could have the declaration order 
>>> and the numeric order differ. Maybe that's not a problem. Tying the 
>>> iteration order to declaration order also means that the behavior of a 
>>> program could change simply by reördering the cases. Maybe that's not a big 
>>> problem either, but it's something to call out.
>>> 
>>> If I were designing this, I'd start with the following approach. First, add 
>>> a new protocol to the standard library:
>>> 
>>> ```
>>> public protocol ValueEnumerable {
>>>   associatedtype AllValuesSequence: Sequence
>>>     where AllValuesSequence.Iterator.Element == Self
>>> 
>>>   static var allValues: AllValuesSequence { get }
>>> }
>>> ```
>>> 
>>> Then, for enums that declare conformance to that protocol, synthesize the 
>>> body of `allValues` to return an appropriate sequence. If we imagine a 
>>> model like AnySequence, then the "state" can be the current case, and the 
>>> transition function can be a switch/case that returns it and advances to 
>>> the next one (finally returning nil).
>>> 
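A hand-written approximation of what such synthesis might produce, assuming the `ValueEnumerable` protocol above (spelled here with the shorter `Element`) and a hypothetical `Section` enum. The shape the compiler would actually emit is an open question; this only illustrates the "current case as state, switch as transition function" model.

```swift
protocol ValueEnumerable {
    associatedtype AllValuesSequence: Sequence
        where AllValuesSequence.Element == Self
    static var allValues: AllValuesSequence { get }
}

// `Section` is a hypothetical enum used only for illustration.
enum Section: ValueEnumerable {
    case header, content, footer

    // Stand-in for a synthesized body: nothing is stored up front.
    static var allValues: AnySequence<Section> {
        return AnySequence { () -> AnyIterator<Section> in
            // The state is the case that will be returned next.
            var state: Section? = .header
            return AnyIterator<Section> {
                guard let current = state else { return nil }
                // Transition function: advance in declaration order.
                switch current {
                case .header:  state = .content
                case .content: state = .footer
                case .footer:  state = nil
                }
                return current
            }
        }
    }
}
```

A caller who needs the whole array asks for it explicitly with `Array(Section.allValues)`.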
>>> There's an opportunity for optimization that may or may not be worth it: if 
>>> the enum is RawRepresentable with RawValue == Int, AND all the raw values 
>>> are in a contiguous range, AND declaration order is numerical order 
>>> (assuming we kept that constraint), then the synthesized state machine can 
>>> just be a simple integer incrementation and call to `init?(rawValue:)`. 
>>> When all the cases have been generated, that will return nil on its own.
>>> 
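A sketch of that optimization, assuming a hypothetical `Priority` enum whose `Int` raw values are contiguous, start at 0, and follow declaration order. The state is a single integer, and `init?(rawValue:)` ends the iteration by returning nil.

```swift
// Hypothetical enum: implicit raw values are 0, 1, 2.
enum Priority: Int {
    case low, medium, high

    static var allValues: AnySequence<Priority> {
        return AnySequence { () -> AnyIterator<Priority> in
            var rawValue = 0
            return AnyIterator<Priority> {
                // Advance after producing the current value.
                defer { rawValue += 1 }
                // Returns nil once rawValue passes the last case.
                return Priority(rawValue: rawValue)
            }
        }
    }
}
```

This only works when the three conditions above hold; a gap in the raw values would end the sequence early.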
>>> So that covers enums without associated values. What about those with 
>>> associated values? I would argue that the "number of cases" isn't something 
>>> that's very useful here—if we consider that enum cases are really factory 
>>> functions for concrete values of the type, then we shouldn't think about 
>>> "what are all the cases of this enum" but "what are all the values of this 
>>> type". (For enums without associated values, those are synonymous.)
>>> 
>>> An enum with associated values can potentially have an infinite number of 
>>> values. Here's one:
>>> 
>>> ```
>>> indirect enum BinaryTree {
>>>   case subtree(left: BinaryTree, right: BinaryTree)
>>>   case leaf
>>>   case empty
>>> }
>>> ```
>>> 
>>> Even without introducing an Element type in the leaf nodes, there are a 
>>> countably infinite number of binary trees. So first off, we wouldn't be 
>>> able to generate a meaningful `count` property for that. Since they're 
>>> countably infinite, we *could* theoretically lazily generate a sequence of 
>>> them! It would be a true statement to say "an enum with associated values 
>>> can have all of its values enumerated if all of its associated values are 
>>> also ValueEnumerable". But I don't think that's something we could have the 
>>> compiler synthesize generally: the logic to tie the sequences together 
>>> would be quite complex in the absence of a construct like coroutines/yield, 
>>> and what's worse, the compiler would have to do some deeper analysis to 
>>> avoid infinite recursion. For example, if it used the naïve approach of 
>>> generating the elements in declaration order, it would keep drilling down 
>>> into the `subtree` case above over and over; it really needs to hit the 
>>> base cases first, and requiring the user to order the cases in a certain 
>>> way for it to just work at all is a non-starter.
>>> 
>>> So, enums with associated values are probably left unsynthesized. But the 
>>> interesting thing about having this be a standard protocol is that there 
>>> would be nothing stopping a user from conforming to it and implementing it 
>>> manually, not only for enums but for other types as well. The potential may 
>>> exist for some interesting algorithms by doing that, but I haven't thought 
>>> that far ahead.
>>> 
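For instance, a manual conformance for a non-enum type could look like the following. The protocol is the one sketched above; conforming `Bool` and the `valueCount(of:)` helper are assumptions chosen purely for illustration.

```swift
protocol ValueEnumerable {
    associatedtype AllValuesSequence: Sequence
        where AllValuesSequence.Element == Self
    static var allValues: AllValuesSequence { get }
}

// Manual conformance: Bool has exactly two values.
extension Bool: ValueEnumerable {
    static var allValues: [Bool] {
        return [false, true]
    }
}

// Hypothetical generic helper that works for any conforming type.
func valueCount<T: ValueEnumerable>(of type: T.Type) -> Int {
    return Array(T.allValues).count
}
```

Here the associated type is inferred as `[Bool]`, so this particular conformance also gets `count` and `Int` indexing for free.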
>>> There are probably some things I'm missing here, but I'd love to hear other 
>>> people's thoughts on it.
>> 
>> There are some things I really like about this approach, but it doesn’t 
>> quite align with a lot of the usage I have seen for the manually declared 
>> `allValues` pattern.  
>> 
>> One of the most common ways I have seen `allValues` used is as a 
>> representation of static sections or rows backing table or collection views. 
>>  Code written like this will take the section or item index provided by a 
>> data source or delegate method and index into an `allValues` array to access 
>> the corresponding value.  These methods usually access one or more members 
>> of the value or pass it along to something else (often a cell) which does 
>> so.  
>> 
>> If we introduce synthesis that doesn’t support this use case, I think a lot 
>> of people will be frustrated, so my opinion is that we need to support it.  This 
>> means users need a way to request synthesis of a `Collection` with an `Int` 
>> index.  Obviously doing this solves the `count` problem.  The collection 
>> would not need to be eager.  It could be implemented to produce values on 
>> demand rather than storing them.  
>> 
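One possible shape for such a collection, as a sketch only: `SectionValues` and `Section` are hypothetical names, and a synthesized version could look quite different. The point is that an `Int`-indexed collection can compute each value in its subscript rather than storing an array.

```swift
// Hypothetical enum with no raw values.
enum Section {
    case header, content, footer
}

// Int-indexed collection that produces each case on demand.
struct SectionValues: RandomAccessCollection {
    let startIndex = 0
    let endIndex = 3

    subscript(position: Int) -> Section {
        switch position {
        case 0: return .header
        case 1: return .content
        case 2: return .footer
        default: preconditionFailure("Index out of range")
        }
    }
}

extension Section {
    static var allValues: SectionValues {
        return SectionValues()
    }
}
```

A data source method could then write `Section.allValues[indexPath.section]` and return `Section.allValues.count` from `numberOfSections(in:)`, just as with a hand-written static array.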
>> Great points! I was only considering the table view/section case where the 
>> enum had raw values 0..<count, but I do imagine it's possible that someone 
>> could just define `enum Section { case header, content, footer }` and then 
>> want to turn an IndexPath value into the appropriate Section.
>> 
>> On the other hand, though, isn't that what raw value enums are for? If the 
>> user needs to do what you're saying—map specific integers to enum 
>> values—shouldn't they do so by giving those cases raw values and calling 
>> init?(rawValue:), not by indexing into a collection? Especially since they 
>> can already do that today, and the only thing they're missing is being able 
>> to retrieve the count, which a "PrecountedSequence" mentioned above, or 
>> something like it, could also provide.
> 
> First, I’m making observations about what people are doing, not what they 
> could do.  
> 
> Second, the raw value may not correspond to 0-based indices.  It might not 
> even be an Int.  There is no reason to couple this common use case of 
> `allValues` to `Int` raw values with 0-based indices.
> 
> Do we know of any examples where a user is both (1) defining an enum with 
> integer raw values that are noncontiguous or non-zero-based and (2) needing 
> declaration-ordinal-based indexing into those cases for other reasons, like a 
> table/collection view? I can't think of why someone would do that, but I'm 
> happy to consider something that I'm missing.


I don’t off-hand, but I don’t think the lack of an example is a good motivation 
for a solution that doesn’t directly address the most commonly known use case 
for this feature.

>  
> 
> Third, `init?(rawValue:)` is a failable initializer and would require a force 
> unwrap.  If the raw values *are* 0-based integers this is similar to the 
> collection bounds check that would be necessary, but it moves it into user 
> code.  People don’t like writing force unwraps.
> 
> Yeah, this is a really good point that I wasn't fully considering. If other 
> invariants in the application hold—such as table view cell functions never 
> receiving a section index outside 0..<count—then unwrapping it just forces 
> users to address a situation that will never actually occur unless UIKit is 
> fundamentally broken.

Right, but the most crucial point is that it forces the *user* to address this.  
They are not required to today.  It is handled by the bounds check in Array.  
This might sound like splitting hairs but I think there are a lot of people who 
wouldn't view it that way.
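
The difference at the call site can be sketched like this, assuming a hypothetical `Section` enum that has both zero-based `Int` raw values and a hand-written `allValues` array. Both functions trap on an out-of-range index; what changes is who writes the check.

```swift
// Hypothetical enum with zero-based Int raw values.
enum Section: Int {
    case header, content, footer

    // The manually maintained array people write today.
    static let allValues: [Section] = [.header, .content, .footer]
}

// Raw-value route: the force unwrap appears in user code.
func sectionViaRawValue(_ index: Int) -> Section {
    return Section(rawValue: index)!
}

// Collection route: the bounds check lives inside Array's subscript.
func sectionViaArray(_ index: Int) -> Section {
    return Section.allValues[index]
}
```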

> 
>  
> 
>> 
>> My main concern with providing a Collection with Int indices is that, at 
>> some fundamental/theoretical level, it feels like it only makes sense for 
>> enums with contiguous numeric raw values. For other kinds of enums, 
>> including those where the enum is just a "bag of things" without raw values, 
>> it feels artificial.
> 
> Sure, that’s why I proposed a couple of options for addressing both use 
> cases.  I think both have merit.  I also think we need to recognize that most 
> people are asking for a replacement for manually writing a static array and 
> won’t be satisfied unless we provide a solution where the synthesized 
> property behaves similarly.
> 
> Agreed—I just wanted to point out the distinction because an important part 
> of fleshing this out will be to partition the various "classes" of enums into 
> those that would receive an indexable Collection vs. those that would receive 
> just a Sequence.

I agree that it’s an important distinction.  To be honest, I’m not sure there 
is a good way to solve both usages without introducing more complexity than 
would be acceptable for something like this.  It might be a problem better 
solved by macros or some other metaprogramming feature.  It would be 
unfortunate to have to wait until we have those to solve this.  However, I 
don’t think it's an important enough problem to deserve a solution with a lot 
of knobs and associated complexity.

> 
> 
> 
>> 
>>  
>> Of course there might be some cases where a manual implementation is 
>> necessary but implementing `Collection` is not desirable for one reason or 
>> another.  One way to solve both of these use cases would be to have a 
>> protocol hierarchy but that seems like it might be excessively complex for a 
>> feature like this.  Another way might be to take advantage of the fact that 
>> in the use case mentioned above people are usually working with the concrete 
>> type.  We could allow the compiler to synthesize an implementation that 
>> *exceeds* the requirement of the protocol such that the synthesized 
>> `AllValuesSequence` is actually a `Collection where Index == Int`.  I’m not 
>> sure which option is better.
>> 
>> I would also like to discuss enums with associated values.  It would 
>> certainly be reasonable to disallow synthesis for these types in an initial 
>> implementation.  I don’t know of any use cases off the top of my head 
>> (although I expect some good ones do exist).  That said, I don’t think 
>> synthesis would be prohibitive for enums with associated values so long as 
>> the type of all associated values conforms to `ValueEnumerable`.  We should 
>> probably support synthesis for these types eventually, possibly in the 
>> initial implementation if there are no significant implementation barriers.
>> 
>> I mentioned some of those barriers above. One issue is that synthesizing the 
>> code to lazily (i.e., reëntrantly) generate a sequence whose elements are 
>> the Cartesian products of other sequences is non-trivial. (Coroutines/yield 
>> would make this a piece of cake.)
> 
> The good news is that we might be in luck on this front in the Swift 5 
> timeframe.  :)
> 
> Fingers crossed! I'm not a concurrency expert by any means, so the most 
> exciting part of those new proposals to me is the side-effect that we might 
> get something like C# enumerators :)
> 
>  
> 
>> 
>> The other is the issue with recursive enums, like the BinaryTree example, 
>> where the compiler has to know to synthesize them in a particular order or 
>> else it will recurse indefinitely before even producing its first value. 
>> However, this could be addressed by simply forbidding automatic synthesis of 
>> enums that have an indirect case, which is probably a reasonable limitation.
> 
> Yeah, that seems like a reasonable limitation.
> 
>> 
>>  
>> 
>> That’s my two cents.
>> 
>> - Matthew
>> 
>>> 
>>> 
>>> On Fri, Sep 8, 2017 at 3:40 AM Jonathan Hull via swift-evolution wrote:
>>> +1000
>>> 
>>> I once made a country code enum, and creating that array was simple, but 
>>> took forever, and was prone to mistakes.
>>> 
>>> Thanks,
>>> Jon
>>> 
>>> > On Sep 8, 2017, at 2:56 AM, Logan Shire via swift-evolution wrote:
>>> >
>>> > Googling ‘swift iterate over enum cases’ yields many results of various 
>>> > levels of hackery.
>>> > Obviously it’s trivial to write a computed property that returns an 
>>> > enum’s cases as an
>>> > array, but maintaining that is prone to error. If you add another case, 
>>> > you need to make sure
>>> > you update the array property. For enums without associated types,
>>> > I propose adding a synthesized static var, ‘cases', to the enum’s type. 
>>> > E.g.
>>> >
>>> > enum Suit: String {
>>> >    case spades = "♠"
>>> >    case hearts = "♥"
>>> >    case diamonds = "♦"
>>> >    case clubs = "♣"
>>> > }
>>> >
>>> > let values = (1...13).map { value -> String in
>>> >    switch value {
>>> >    case 1: return "A"
>>> >    case 11: return "J"
>>> >    case 12: return "Q"
>>> >    case 13: return "K"
>>> >    default: return String(value)
>>> >    }
>>> > }
>>> >
>>> > // `Suit.cases` is the proposed synthesized property; rawValue gives the symbol.
>>> > let cards = values.flatMap { value in
>>> >    Suit.cases.map { "\($0.rawValue)\(value)" }
>>> > }
>>> >
>>> > Yields ["♠A", "♥A", …, "♣K"]
>>> > Thoughts?
>>> >
>>> >
>>> > Thanks!
>>> > - Logan Shire
>>> > _______________________________________________
>>> > swift-evolution mailing list
>>> > [email protected] <mailto:[email protected]>
>>> > https://lists.swift.org/mailman/listinfo/swift-evolution 
>>> > <https://lists.swift.org/mailman/listinfo/swift-evolution>
>>> 

