Re: [swift-evolution] [RFC] Associated type inference

Douglas Gregor via swift-evolution Sat, 02 Dec 2017 14:40:55 -0800


Sent from my iPhone


> On Dec 2, 2017, at 1:37 PM, Matthew Johnson <matt...@anandabits.com> wrote:
> 
> 
>> On Nov 30, 2017, at 6:28 PM, Douglas Gregor via swift-evolution 
>> <swift-evolution@swift.org> wrote:
>> 
>> Hello Swift community,
>> 
>> Associated type inference, which is the process by which the Swift compiler 
>> attempts to infer typealiases to satisfy associated-type requirements based 
>> on other requirements, has caused both implementation problems and user 
>> confusion for a long time. Some of you might remember a previous (failed) 
>> attempt to remove this feature from the Swift language, in SE-0108 “Remove 
>> associated type inference”. 
>> 
>> I’m not sure we can remove this feature outright (i.e., the concerns that 
>> sank that proposal are still absolutely valid), because it is so very 
>> convenient and a lot of user code likely depends on it in some form or 
>> other. So, here I’d like to describe the uses of the feature, its current 
>> (very problematic) implementation approach, and a half-baked proposal to 
>> narrow the scope of associated type inference to something that I think is 
>> more tractable. I need help with the design of this feature, because I feel 
>> like it’s going to be a delicate balance between implementability and 
>> expressiveness.
>> 
>> A Motivating Example
>> As a motivating example, let’s consider a “minimal” random access collection:
>> 
>> struct MyCollection<T> {
>>     var contents: [T]
>> }
>> 
>> extension MyCollection: RandomAccessCollection {    
>>     var startIndex: Int { return contents.startIndex }
>>     var endIndex: Int { return contents.endIndex }
>>     subscript(index: Int) -> T { return contents[index] }
>> }
>> 
>> This is actually pretty awesome: by providing just two properties and a 
>> subscript, we get the full breadth of the random access collection API! This 
>> is relying heavily on associated type inference (for associated type 
>> requirements) and default implementations specified on protocol extensions. 
>> Without associated type inference, we would have had to write:
>> 
>> 
>> extension MyCollection: RandomAccessCollection {
>>     typealias Element = T
>>     typealias Index = Int
>>     typealias Indices = CountableRange<Int>
>>     typealias Iterator = IndexingIterator<MyCollection<T>>
>>     typealias SubSequence = RandomAccessSlice<MyCollection<T>>
>>     
>>     var startIndex: Int { return contents.startIndex }
>>     var endIndex: Int { return contents.endIndex }
>>     subscript(index: Int) -> T { return contents[index] }
>> }
>> 
>> where the bolded typealiases are currently inferred. It was worse back when 
>> we reviewed SE-0108, because IIRC there were a few underscored associated 
>> types (e.g., _Element) that have since been removed. Still, that’s a bit of 
>> additional work to define a “minimal” collection, and requires quite a bit 
>> more understanding: how do I know to choose IndexingIterator, and 
>> CountableRange, and RandomAccessSlice?
>> 
>> The issue would get worse with, e.g., SE-0174 “Change filter to return an 
>> associated type”, which adds an associated type Filtered that almost nobody 
>> will ever customize, and isn’t really fundamental to the way collections 
>> work. Adding Filtered to the standard library would be a source-breaking 
>> change, because users would have to write a typealias giving it its default.
>> 
>> Associated Type Defaults
>> One of the ways in which we avoid having to specify typealiases is to use 
>> associated type defaults. For example, the standard library contains 
>> associated type defaults for Indices, Iterator, and SubSequence. Let’s focus 
>> on Indices:
>> 
>> protocol Collection : Sequence {
>>   associatedtype Indices = DefaultIndices<Self>
>>   // ...
>> }
>> 
>> protocol BidirectionalCollection : Collection {
>>   associatedtype Indices = DefaultBidirectionalIndices<Self>
>>   // ...
>> }
>> 
>> protocol RandomAccessCollection : BidirectionalCollection {
>>   associatedtype Indices = DefaultRandomAccessIndices<Self>
>>   // ...
>> }
>> 
>> The basic idea here is that different protocols in the hierarchy provide 
>> different defaults, and you presumably want the default from the most 
>> specific protocol. If I define a type and make it conform to 
>> BidirectionalCollection, I’d expect to get DefaultBidirectionalIndices<Self> 
>> for Indices. If a define a type and make it conform to RandomAccessIterator, 
>> I’d expect to get DefaultRandomAccessIndices<Self>.
>> 
>> (Aside: DefaultRandomAccessIndices and DefaultBidirectionalIndices got 
>> collapsed into DefaultIndices now that we have conditional conformances for 
>> the standard library, but the issues I’m describing remain).
>> 
>> Associated type defaults seem like a reasonable feature that fits well 
>> enough into the design. However, it’s not the only thing in place with our 
>> MyCollection example, for which Indices was inferred to CountableRange. 
>> How’s that happen?
>> 
>> Associated Type Inference
>> Associated type inference attempts to look at the requirements of a 
>> protocol, and then looks into the conforming type for declarations that 
>> might satisfy those requirements, and infers associated types from the types 
>> of those declarations. Let’s look at some examples. RandomAccessCollection 
>> has some requirements that mention the Index and Element types:
>> 
>> protocol RandomAccessCollection : BidirectionalCollection {
>>   associatedtype Element
>>   associatedtype Index
>> 
>>   var startIndex: Index { get }
>>   var endIndex: Index { get }
>>   subscript (i: Index) -> Element { get }
>> }
>> 
>> Associated type inference wil try to satisfy those requirements for 
>> MyCollection, and will find these declarations:
>> 
>> extension MyCollection: RandomAccessCollection {    
>>     var startIndex: Int { return contents.startIndex }
>>     var endIndex: Int { return contents.endIndex }
>>     subscript(index: Int) -> T { return contents[index] }
>> }
>> 
>> and match them up. From startIndex, we infer Index := Int. From endIndex, we 
>> infer Index := Int. From subscript, we infer Index := Int and Element := T. 
>> Those results are all consistent, so we’ve properly inferred Index and 
>> Element. Yay.
>> 
>> Associated type inference often has to deal with ambiguities. For example, 
>> there is an extension of Collection that provides a range subscript operator:
>> 
>> extension Collection {
>>   subscript (bounds: Range<Index>) -> Slice<Self> { … }
>> }
>> 
>> When we look at that and match it to the subscript requirement in 
>> RandomAccessCollection (remembering that argument labels aren’t significant 
>> for subscripts by default!), we infer Index := Range<Index> (admittedly 
>> weird) and Element := Slice<Self> (could be legitimate). We have to discard 
>> this candidate because the deduction from startIndex/endIndex (Index := Int) 
>> collides with this deduction (Index := Range<Index>).
>> 
>> In our initial example, we saw that Indices was inferred to 
>> CountableRange<Int>. Let’s look at another slice of the 
>> RandomAccessCollection protocol that’s relevant to this associated type:
>> 
>> protocol RandomAccessCollection : BidirectionalCollection {
>>   associatedtype Index
>>   associatedtype Indices: RandomAccessCollection where Indices.Element == 
>> Index
>> 
>>   var indices: Indices { get }
>> }
>> 
>> We will match that requirement for an “indices” property against a member of 
>> a constrained protocol extension of RandomAccessCollection:
>> 
>> extension RandomAccessCollection where Index : Strideable, Index.Stride == 
>> IndexDistance, Indices == CountableRange<Index> {
>>   public var indices: CountableRange<Index> {
>>     return startIndex..<endIndex
>>   }
>> }
>> 
>> Associated type inference can determine here that Indices := 
>> CountableRange<Index>, but only when Index: Strideable and Index.Stride == 
>> IndexDistance. In other words, there are a whole bunch of other constraints 
>> to verify before we can accept this inference for Indices.
>> 
>> 
>> What’s a Good Solution Look Like?
>> Our current system for associated type inference and associated type 
>> defaults is buggy and complicated. The compiler gets it right often enough 
>> that people depend on it, but I don’t think anyone can reasonably be 
>> expected to puzzle out what’s going to happen, and this area is rife with 
>> bugs. If we were to design a new solution from scratch, what properties 
>> should it have?
>> 
>> It should allow the author of a protocol to provide reasonable defaults, so 
>> the user doesn’t have to write them
>> It shouldn’t require users to write typealiases for “obvious” cases, even 
>> when they aren’t due to defaults
>> It shouldn’t infer an inconsistent set of typealiases
>> It should be something that a competent Swift programmer could reason about 
>> when it will succeed, when and why it will fail, and what the resulting 
>> inferred typealiases would be
>> It should admit a reasonable implementation in the compiler that is 
>> performant and robust
>> 
>> A Rough Proposal
>> I’ve been thinking about this for a bit, and I think there are three ways in 
>> which we should be able to infer an associated type witness:
>> 
>> Associated type defaults, which are specified with the associated type 
>> itself, e.g.,
>> 
>>   associatedtype Indices = DefaultIndices<Self>
>> 
>> These are easy to reason about for both the programmer and the compiler.
>> Typealiases in (possibly constrained) protocol extensions, e.g.,
>> 
>>   extension RandomAccessCollection where Index : Strideable, Index.Stride == 
>> IndexDistance {
>>     typealias RandomAccessCollection.Indices = CountableRange<Index>
>>   }
>> 
>> I’m intentionally using some odd ‘.’ syntax here to indicate that this 
>> typealias is intended only to be found when trying to satisfy an associated 
>> type requirement, and is not a general typealias that could be found by 
>> normal name lookup. Let’s set the syntax bike shed aside for the moment. The 
>> primary advantage of this approach (vs. inferring Indices from “var Indices: 
>> CountableRange<Index>” in a constrained protocol extension) is that there’s 
>> a real typealias declaration that compiler and programmer alike can look at 
>> and reason about based just on the name “Indices”. 
>> 
>> Note that this mechanism technically obviates the need for (1), in the same 
>> sense that default implementations in protocols are merely syntactic sugar.
>> Declarations within the nominal type declaration or extension that declares 
>> conformance to the protocol in question. This is generally the same approach 
>> as described in “associated type inference” above, where we match 
>> requirements of the protocol against declarations that could satisfy those 
>> requirements and infer associated types from there. However, I want to turn 
>> it around: instead of starting with the requirements of the protocol any 
>> looking basically anywhere in the type or any protocol to which it conforms 
>> (for implementations in protocol extensions), start with the declarations 
>> that the user explicitly wrote at the point of the conformance and look for 
>> requirements they might satisfy. For example, consider our initial example:
>> 
>>   extension MyCollection: RandomAccessCollection {    
>>     var startIndex: Int { return contents.startIndex }
>>     var endIndex: Int { return contents.endIndex }
>>     subscript(index: Int) -> T { return contents[index] }
>>   }
>> 
>> Since startIndex, endIndex, and subscript(_:) are declared in the same 
>> extension that declares conformance to RandomAccessIterator, we should look 
>> for requirements with the same name as these properties and subscript within 
>> RandomAccessCollection (or any protocol it inherits) and infer Index := Int 
>> and Element := T by matching the type signatures. This is still the most 
>> magical inference rule, because there is no declaration named “Index” or 
>> “Element” to look at. However, it is much narrower in scope than the current 
>> implementation, because it’s only going to reason from the (probably small) 
>> set of declarations that the user wrote alongside the conformance, so it’s 
>> more likely to be intentional. Note that this is again nudging programmers 
>> toward the style of programming where one puts one protocol conformance per 
>> extension, which is admittedly my personal preference.
>> 
>> Thoughts?
>> I think this approach is more predictable and more implementable than the 
>> current model. I’m curious whether the above makes sense to someone other 
>> than me, and whether it covers existing use cases well enough. Thoughts?
> 
> Hi Doug.  I’ve had a chance to give this some thought and spent a little bit 
> of time looking at some code.  This approach would be sufficient for some use 
> cases I’ve been having trouble with lately.  

Thank you for staying this further!

> 
> One other heuristic that may make sense comes to mind: piggyback on other 
> conformances that are visible where the conformance in question is declared.  
> If an associated type can’t be inferred where T: P is declared but an 
> associated type with the same name is inferred in another conformance of T 
> visible from the T: P declaration use that type.  I’ll leave it up to you to 
> evaluate if and when this might be worth considering, you’re the expert!

Yes, this is a good point. The tricky thing here is that it’s a different kind 
of “global” inference, where at worst we are considering all of the protocols 
to which a given type conforms at once... but I suspect we have to do something 
along these lines. 

  - Doug

> 
>> 
>>      - Doug
>> 
>> _______________________________________________
>> swift-evolution mailing list
>> swift-evolution@swift.org
>> https://lists.swift.org/mailman/listinfo/swift-evolution
>

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Re: [swift-evolution] [RFC] Associated type inference

Reply via email to