Re: [swift-evolution] [Pitch] Synthesized static enum property to iterate over cases

Matthew Johnson via swift-evolution Fri, 08 Sep 2017 09:44:53 -0700

> On Sep 8, 2017, at 11:32 AM, Tony Allevato <tony.allev...@gmail.com> wrote:
> 
> 
> 
> On Fri, Sep 8, 2017 at 8:35 AM Matthew Johnson <matt...@anandabits.com 
> <mailto:matt...@anandabits.com>> wrote:
>> On Sep 8, 2017, at 9:53 AM, Tony Allevato via swift-evolution 
>> <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
>> 
>> Thanks for bringing this up, Logan! It's something I've been thinking about 
>> a lot lately after a conversation with some colleagues outside of this 
>> community. Some of my thoughts:
>> 
>> AFAIK, there are two major use cases here: (1) you need the whole collection 
>> of cases, like in your example, and (2) you just need the number of cases. 
>> The latter seems to occur somewhat commonly when people want to use an enum 
>> to define the sections of, say, a UITableView. They just return the count 
>> from numberOfSections(in:) and then switch over the cases in their 
>> cell-providing methods.
>> 
>> Because of #2, it would be nice to avoid instantiating the collection 
>> eagerly. (Also because of examples like Jonathan's, where the enum is 
>> large.) If all the user is ever really doing is iterating over them, there's 
>> no need to keep the entire collection in memory. This leads us to look at 
>> Sequence; we could use something like AnySequence to keep the current case 
>> as our state and a transition function to advance to the next one. If a user 
>> needs to instantiate the full array from that sequence they can do so, but 
>> they have to do it explicitly.
>> 
>> The catch is that Sequence only provides `underestimatedCount`, rather than 
>> `count`. Calling the former would be an awkward API (why is it 
>> underestimated? we know how many cases there are). I suppose we could create 
>> a concrete wrapper for Sequence (PrecountedSequence?) that provides a 
>> `count` property to make that cleaner, and then have `underestimatedCount` 
>> return the same thing if users passed this thing into a generic operation 
>> constrained over Sequence. (The standard library already has support 
>> wrappers like EnumeratedSequence, so maybe this is appropriate.)
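
A rough sketch of what such a wrapper might look like. `PrecountedSequence` is only the hypothetical name floated above, not an existing standard library type, and the initializer shape and the `Section` enum are assumptions made for illustration:

```swift
// Hypothetical wrapper: a Sequence that also knows its exact element count.
struct PrecountedSequence<Base: Sequence>: Sequence {
    private let base: Base

    /// The exact number of elements; the caller is trusted to supply it correctly.
    let count: Int

    init(_ base: Base, count: Int) {
        self.base = base
        self.count = count
    }

    func makeIterator() -> Base.Iterator {
        return base.makeIterator()
    }

    // Generic code constrained to Sequence sees the exact count as its estimate.
    var underestimatedCount: Int {
        return count
    }
}

// Illustrative enum, not taken from any real project.
enum Section { case header, content, footer }

// The current case is the state; the closure is the transition function.
// No array of cases is stored anywhere.
let lazyCases = sequence(first: Section.header) { current -> Section? in
    switch current {
    case .header: return .content
    case .content: return .footer
    case .footer: return nil
    }
}

let sections = PrecountedSequence(lazyCases, count: 3)
```

With something like this, `sections.count` would cover the `numberOfSections(in:)` case without materializing the cases.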
>> 
>> Another question that would need to be answered is, how should the cases be 
>> ordered? Declaration order seems obvious and straightforward, but if you 
>> have a raw-value enum (say, integers), you could have the declaration order 
>> and the numeric order differ. Maybe that's not a problem. Tying the 
>> iteration order to declaration order also means that the behavior of a 
>> program could change simply by reördering the cases. Maybe that's not a big 
>> problem either, but it's something to call out.
>> 
>> If I were designing this, I'd start with the following approach. First, add 
>> a new protocol to the standard library:
>> 
>> ```
>> public protocol ValueEnumerable {
>>   associatedtype AllValuesSequence: Sequence
>>     where AllValuesSequence.Iterator.Element == Self
>> 
>>   static var allValues: AllValuesSequence { get }
>> }
>> ```
>> 
>> Then, for enums that declare conformance to that protocol, synthesize the 
>> body of `allValues` to return an appropriate sequence. If we imagine a model 
>> like AnySequence, then the "state" can be the current case, and the 
>> transition function can be a switch/case that returns it and advances to the 
>> next one (finally returning nil).
>> 
>> There's an opportunity for optimization that may or may not be worth it: if 
>> the enum is RawRepresentable with RawValue == Int, AND all the raw values 
>> are in a contiguous range, AND declaration order is numerical order 
>> (assuming we kept that constraint), then the synthesized state machine can 
>> just be a simple integer incrementation and call to `init?(rawValue:)`. When 
>> all the cases have been generated, that will return nil on its own.
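
To make the raw-value shortcut concrete, here is a minimal sketch of what the synthesized body might reduce to. The protocol is a stand-in for the one proposed above, `Planet` is a made-up enum, and the use of `AnySequence`/`AnyIterator` is an assumption about the shape, not a claim about what the compiler would emit:

```swift
// Stand-in for the protocol proposed above; it is not part of the standard library.
protocol ValueEnumerable {
    associatedtype AllValuesSequence: Sequence
        where AllValuesSequence.Element == Self
    static var allValues: AllValuesSequence { get }
}

// Hypothetical enum with contiguous Int raw values in declaration order.
enum Planet: Int, ValueEnumerable {
    case mercury = 0, venus, earth, mars

    // Count upward from the first raw value and let init?(rawValue:)
    // end the iteration by returning nil once the cases run out.
    static var allValues: AnySequence<Planet> {
        return AnySequence { () -> AnyIterator<Planet> in
            var next = 0
            return AnyIterator {
                defer { next += 1 }
                return Planet(rawValue: next)
            }
        }
    }
}
```

Each call to `allValues` builds a fresh iterator, so the only state kept per iteration is a single integer.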
>> 
>> So that covers enums without associated values. What about those with 
>> associated values? I would argue that the "number of cases" isn't something 
>> that's very useful here—if we consider that enum cases are really factory 
>> functions for concrete values of the type, then we shouldn't think about 
>> "what are all the cases of this enum" but "what are all the values of this 
>> type". (For enums without associated values, those are synonymous.)
>> 
>> An enum with associated values can potentially have an infinite number of 
>> values. Here's one:
>> 
>> ```
>> enum BinaryTree {
>>   indirect case subtree(left: BinaryTree, right: BinaryTree)
>>   case leaf
>>   case empty
>> }
>> ```
>> 
>> Even without introducing an Element type in the leaf nodes, there are a 
>> countably infinite number of binary trees. So first off, we wouldn't be able 
>> to generate a meaningful `count` property for that. Since they're countably 
>> infinite, we *could* theoretically lazily generate a sequence of them! It 
>> would be a true statement to say "an enum with associated values can have 
>> all of its values enumerated if all of its associated values are also 
>> ValueEnumerable". But I don't think that's something we could have the 
>> compiler synthesize generally: the logic to tie the sequences together would 
>> be quite complex in the absence of a construct like coroutines/yield, and 
>> what's worse, the compiler would have to do some deeper analysis to avoid 
>> infinite recursion. For example, if it used the naïve approach of generating 
>> the elements in declaration order, it would keep drilling down into the 
>> `subtree` case above over and over; it really needs to hit the base cases 
>> first, and requiring the user to order the cases in a certain way for it to 
>> just work at all is a non-starter.
>> 
>> So, enums with associated values are probably left unsynthesized. But the 
>> interesting thing about having this be a standard protocol is that there 
>> would be nothing stopping a user from conforming to it and implementing it 
>> manually, not only for enums but for other types as well. The potential may 
>> exist for some interesting algorithms by doing that, but I haven't thought 
>> that far ahead.
>> 
>> There are probably some things I'm missing here, but I'd love to hear other 
>> people's thoughts on it.
> 
> There are some things I really like about this approach, but it doesn’t quite 
> align with a lot of the usage I have seen for manually declared `allValues` 
> pattern.  
> 
> One of the most common ways I have seen `allValues` used is as a 
> representation of static sections or rows backing table or collection views.  
> Code written like this will take the section or item index provided by a data 
> source or delegate method and index into an `allValues` array to access the 
> corresponding value.  These methods usually access one or more members of the 
> value or pass it along to something else (often a cell) which does so.  
> 
> If we introduce synthesis that doesn’t support this use case I think a lot of 
> people will be frustrated, so my opinion is that we need to support it.  This 
> means users need a way to request synthesis of a `Collection` with an `Int` 
> index.  Obviously doing this solves the `count` problem.  The collection 
> would not need to be eager.  It could be implemented to produce values on 
> demand rather than storing them.  
> 
> Great points! I was only considering the table view/section case where the 
> enum had raw values 0..<count, but I do imagine it's possible that someone 
> could just define `enum Section { case header, content, footer }` and then 
> want to turn an IndexPath value into the appropriate Section.
> 
> On the other hand, though, isn't that what raw value enums are for? If the 
> user needs to do what you're saying—map specific integers to enum 
> values—shouldn't they do so by giving those cases raw values and calling 
> init?(rawValue:), not by indexing into a collection? Especially since they 
> can already do that today, and the only thing they're missing is being able 
> to retrieve the count, which a "PrecountedSequence" mentioned above, or 
> something like it, could also provide.


First, I’m making observations about what people are doing, not what they could 
do.  

Second, the raw value may not correspond to 0-based indices.  It might not even 
be an Int.  There is no reason to couple this common use case of `allValues` to 
`Int` raw values with 0-based indices.

Third, `init?(rawValue:)` is a failable initializer and would require a force 
unwrap.  If the raw values *are* 0-based integers this is similar to the 
collection bounds check that would be necessary, but it moves it into user 
code.  People don’t like writing force unwraps.
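
To illustrate the second and third points, a small sketch of the two styles side by side. The enums and function names are hypothetical, and a plain `Int` stands in for the section index a data source method would receive:

```swift
// Hypothetical section enum; its raw values are neither Int nor 0-based.
enum Section: String {
    case header = "Header"
    case content = "Content"
    case footer = "Footer"

    // The hand-maintained array that synthesis would replace.
    static let allValues: [Section] = [.header, .content, .footer]
}

// Stand-in for a data source callback that receives a section index.
func title(forSection index: Int) -> String {
    // The bounds check lives in Array's subscript, not in user code.
    return Section.allValues[index].rawValue
}

// The raw-value alternative needs Int raw values starting at 0...
enum IndexedSection: Int {
    case header, content, footer
}

// ...and a force unwrap at every call site.
func indexedSection(at index: Int) -> IndexedSection {
    return IndexedSection(rawValue: index)!
}
```

Both versions trap on an out-of-range index; the difference is only where the check is written.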

> 
> My main concern with providing a Collection with Int indices is that, at some 
> fundamental/theoretical level, it feels like it only makes sense for enums 
> with contiguous numeric raw values. For other kinds of enums, including those 
> where the enum is just a "bag of things" without raw values, it feels 
> artificial.

Sure, that’s why I proposed a couple of options for addressing both use cases.  
I think both have merit.  I also think we need to recognize that most people 
are asking for a replacement for manually writing a static array and won’t be 
satisfied unless we provide a solution where the synthesized property behaves 
similarly.
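
As a rough sketch of what "behaves similarly" could mean without storing an array: an `Int`-indexed collection that computes each value on demand. `AllSuits` is a hypothetical hand-written stand-in for whatever the compiler might synthesize, and the `Suit` enum here is only for illustration:

```swift
enum Suit {
    case spades, hearts, diamonds, clubs
}

// Hypothetical stand-in for a synthesized collection: Int-indexed, but each
// element is computed on demand instead of being stored in an array.
struct AllSuits: RandomAccessCollection {
    var startIndex: Int { return 0 }
    var endIndex: Int { return 4 }

    subscript(position: Int) -> Suit {
        switch position {
        case 0: return .spades
        case 1: return .hearts
        case 2: return .diamonds
        case 3: return .clubs
        default: fatalError("Index out of range")
        }
    }
}

extension Suit {
    // Looks like the familiar static array to callers working with the concrete type.
    static var allValues: AllSuits { return AllSuits() }
}
```

Callers can write `Suit.allValues.count` and `Suit.allValues[section]` exactly as they do with a manually declared array today.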

> 
>  
> Of course there might be some cases where a manual implementation is 
> necessary but implementing `Collection` is not desirable for one reason or 
> another.  One way to solve both of these use cases would be to have a 
> protocol hierarchy but that seems like it might be excessively complex for a 
> feature like this.  Another way might be to take advantage of the fact that 
> in the use case mentioned above people are usually working with the concrete 
> type.  We could allow the compiler to synthesize an implementation that 
> *exceeds* the requirement of the protocol such that the synthesized 
> `AllValuesSequence` is actually a `Collection where Index == Int`.  I’m not 
> sure which option is better.
> 
> I would also like to discuss enums with associated values.  It would 
> certainly be reasonable to disallow synthesis for these types in an initial 
> implementation.  I don’t know of any use cases off the top of my head 
> (although I expect some good ones do exist).  That said, I don’t think 
> synthesis would be prohibitive for enums with associated values so long as 
> the type of all associated values conforms to `ValueEnumerable`.  We should 
> probably support synthesis for these types eventually, possibly in the 
> initial implementation if there are no significant implementation barriers.
> 
> I mentioned some of those barriers above. One issue is that synthesizing the 
> code to lazily (i.e., reëntrantly) generate a sequence whose elements are the 
> Cartesian products of other sequences is non-trivial. (Coroutines/yield would 
> make this a piece of cake.)

The good news is that we might be in luck on this front in the Swift 5 
timeframe.  :)
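
In the meantime, a hand-written version of the lazy product is possible for non-recursive enums. A minimal sketch, assuming hypothetical `Suit`, `Color`, and `Token` types, with plain arrays standing in for each type's `allValues`; it says nothing about how the compiler would actually synthesize this:

```swift
enum Suit { case spades, hearts, diamonds, clubs }
enum Color { case red, black }

// Hypothetical enum whose associated values are themselves enumerable.
enum Token: Equatable {
    case plain(Suit)
    case tinted(Suit, Color)
}

// Stand-ins for Suit.allValues and Color.allValues.
let suits: [Suit] = [.spades, .hearts, .diamonds, .clubs]
let colors: [Color] = [.red, .black]

// One lazy sequence per case, in declaration order.
let plain = suits.lazy.map { Token.plain($0) }

// A case with two associated values is the Cartesian product of their values.
let tinted = suits.lazy.flatMap { suit in
    colors.lazy.map { Token.tinted(suit, $0) }
}

// Type-erase the per-case sequences and chain them into one sequence.
let allTokens = [AnySequence(plain), AnySequence(tinted)].joined()
```

Values are only constructed as `allTokens` is iterated. A recursive enum would not work this way, which is why forbidding synthesis for indirect cases seems sensible.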

> 
> The other is the issue with recursive enums, like the BinaryTree example, 
> where the compiler has to know to synthesize them in a particular order or 
> else it will recurse indefinitely before even producing its first value. 
> However, this could be addressed by simply forbidding automatic synthesis of 
> enums that have an indirect case, which is probably a reasonable limitation.

Yeah, that seems like a reasonable limitation.

> 
>  
> 
> That’s my two cents.
> 
> - Matthew
> 
>> 
>> 
>> On Fri, Sep 8, 2017 at 3:40 AM Jonathan Hull via swift-evolution 
>> <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
>> +1000
>> 
>> I once made a country code enum, and creating that array was simple, but 
>> took forever, and was prone to mistakes.
>> 
>> Thanks,
>> Jon
>> 
>> > On Sep 8, 2017, at 2:56 AM, Logan Shire via swift-evolution 
>> > <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:
>> >
>> > Googling ‘swift iterate over enum cases’ yields many results of various 
>> > levels of hackery.
>> > Obviously it’s trivial to write a computed property that returns an enum’s 
>> > cases as an
>> > array, but maintaining that is prone to error. If you add another case, 
>> > you need to make sure
>> > you update the array property. For enums without associated values,
>> > I propose adding a synthesized static var, ‘cases’, to the enum’s type. 
>> > E.g.
>> >
>> > enum Suit: String {
>> >    case spades = "♠"
>> >    case hearts = "♥"
>> >    case diamonds = "♦"
>> >    case clubs = "♣"
>> > }
>> >
>> > let values = (1...13).map { value -> String in
>> >    switch value {
>> >    case 1: return "A"
>> >    case 11: return "J"
>> >    case 12: return "Q"
>> >    case 13: return "K"
>> >    default: return String(value)
>> >    }
>> > }
>> >
>> > let cards = values.flatMap { value in Suit.cases.map { "\($0.rawValue)\(value)" } }
>> >
>> > Yields ["♠A", "♥A", …, "♣K"]
>> > Thoughts?
>> >
>> >
>> > Thanks!
>> > - Logan Shire
>> > _______________________________________________
>> > swift-evolution mailing list
>> > swift-evolution@swift.org <mailto:swift-evolution@swift.org>
>> > https://lists.swift.org/mailman/listinfo/swift-evolution 
>> > <https://lists.swift.org/mailman/listinfo/swift-evolution>
>> 

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution
