Re: [swift-evolution] [RFC] Associated type inference

Dave Abrahams via swift-evolution Sat, 02 Dec 2017 21:23:55 -0800

> On Nov 30, 2017, at 2:28 PM, Douglas Gregor via swift-evolution 
> <swift-evolution@swift.org> wrote:
> 
> Hello Swift community,
> 
> Associated type inference, which is the process by which the Swift compiler 
> attempts to infer typealiases to satisfy associated-type requirements based 
> on other requirements, has caused both implementation problems and user 
> confusion for a long time. Some of you might remember a previous (failed) 
> attempt to remove this feature from the Swift language, in SE-0108 “Remove 
> associated type inference”. 
> 
> I’m not sure we can remove this feature outright (i.e., the concerns that 
> sank that proposal are still absolutely valid), because it is so very 
> convenient and a lot of user code likely depends on it in some form or other. 
> So, here I’d like to describe the uses of the feature, its current (very 
> problematic) implementation approach, and a half-baked proposal to narrow the 
> scope of associated type inference to something that I think is more 
> tractable. I need help with the design of this feature, because I feel like 
> it’s going to be a delicate balance between implementability and 
> expressiveness.


Aloha, Doug!

> 
> A Motivating Example
> As a motivating example, let’s consider a “minimal” random access collection:
> 
> struct MyCollection<T> {
>     var contents: [T]
> }
> 
> extension MyCollection: RandomAccessCollection {    
>     var startIndex: Int { return contents.startIndex }
>     var endIndex: Int { return contents.endIndex }
>     subscript(index: Int) -> T { return contents[index] }
> }
> 
> This is actually pretty awesome: by providing just two properties and a 
> subscript, we get the full breadth of the random access collection API! This 
> is relying heavily on associated type inference (for associated type 
> requirements) and default implementations specified on protocol extensions. 
> Without associated type inference, we would have had to write:
> 
> 
> extension MyCollection: RandomAccessCollection {
>     typealias Element = T
>     typealias Index = Int
>     typealias Indices = CountableRange<Int>
>     typealias Iterator = IndexingIterator<MyCollection<T>>
>     typealias SubSequence = RandomAccessSlice<MyCollection<T>>
>     
>     var startIndex: Int { return contents.startIndex }
>     var endIndex: Int { return contents.endIndex }
>     subscript(index: Int) -> T { return contents[index] }
> }
> 
> where the bolded typealiases are currently inferred. It was worse back when 
> we reviewed SE-0108, because IIRC there were a few underscored associated 
> types (e.g., _Element) that have since been removed. Still, that’s a bit of 
> additional work to define a “minimal” collection, and requires quite a bit 
> more understanding: how do I know to choose IndexingIterator, and 
> CountableRange, and RandomAccessSlice?
> 
> The issue would get worse with, e.g., SE-0174 “Change filter to return an 
> associated type”, which adds an associated type Filtered that almost nobody 
> will ever customize, and isn’t really fundamental to the way collections 
> work. Adding Filtered to the standard library would be a source-breaking 
> change, because users would have to write a typealias giving it its default.
> 
> Associated Type Defaults
> One of the ways in which we avoid having to specify typealiases is to use 
> associated type defaults. For example, the standard library contains 
> associated type defaults for Indices, Iterator, and SubSequence. Let’s focus 
> on Indices:
> 
> protocol Collection : Sequence {
>   associatedtype Indices = DefaultIndices<Self>
>   // ...
> }
> 
> protocol BidirectionalCollection : Collection {
>   associatedtype Indices = DefaultBidirectionalIndices<Self>
>   // ...
> }
> 
> protocol RandomAccessCollection : BidirectionalCollection {
>   associatedtype Indices = DefaultRandomAccessIndices<Self>
>   // ...
> }
> 
> The basic idea here is that different protocols in the hierarchy provide 
> different defaults, and you presumably want the default from the most 
> specific protocol. If I define a type and make it conform to 
> BidirectionalCollection, I’d expect to get DefaultBidirectionalIndices<Self> 
> for Indices. If a define a type and make it conform to RandomAccessIterator, 
> I’d expect to get DefaultRandomAccessIndices<Self>.
> 
> (Aside: DefaultRandomAccessIndices and DefaultBidirectionalIndices got 
> collapsed into DefaultIndices now that we have conditional conformances for 
> the standard library, but the issues I’m describing remain).
> 
> Associated type defaults seem like a reasonable feature that fits well enough 
> into the design. However, it’s not the only thing in place with our 
> MyCollection example, for which Indices was inferred to CountableRange. How’s 
> that happen?
> 
> Associated Type Inference
> Associated type inference attempts to look at the requirements of a protocol, 
> and then looks into the conforming type for declarations that might satisfy 
> those requirements, and infers associated types from the types of those 
> declarations. Let’s look at some examples. RandomAccessCollection has some 
> requirements that mention the Index and Element types:
> 
> protocol RandomAccessCollection : BidirectionalCollection {
>   associatedtype Element
>   associatedtype Index
> 
>   var startIndex: Index { get }
>   var endIndex: Index { get }
>   subscript (i: Index) -> Element { get }
> }
> 
> Associated type inference wil try to satisfy those requirements for 
> MyCollection, and will find these declarations:
> 
> extension MyCollection: RandomAccessCollection {    
>     var startIndex: Int { return contents.startIndex }
>     var endIndex: Int { return contents.endIndex }
>     subscript(index: Int) -> T { return contents[index] }
> }
> 
> and match them up. From startIndex, we infer Index := Int. From endIndex, we 
> infer Index := Int. From subscript, we infer Index := Int and Element := T. 
> Those results are all consistent, so we’ve properly inferred Index and 
> Element. Yay.
> 
> Associated type inference often has to deal with ambiguities. For example, 
> there is an extension of Collection that provides a range subscript operator:
> 
> extension Collection {
>   subscript (bounds: Range<Index>) -> Slice<Self> { … }
> }
> 
> When we look at that and match it to the subscript requirement in 
> RandomAccessCollection (remembering that argument labels aren’t significant 
> for subscripts by default!), we infer Index := Range<Index> (admittedly 
> weird) and Element := Slice<Self> (could be legitimate). We have to discard 
> this candidate because the deduction from startIndex/endIndex (Index := Int) 
> collides with this deduction (Index := Range<Index>).
> 
> In our initial example, we saw that Indices was inferred to 
> CountableRange<Int>. Let’s look at another slice of the 
> RandomAccessCollection protocol that’s relevant to this associated type:
> 
> protocol RandomAccessCollection : BidirectionalCollection {
>   associatedtype Index
>   associatedtype Indices: RandomAccessCollection where Indices.Element == 
> Index
> 
>   var indices: Indices { get }
> }
> 
> We will match that requirement for an “indices” property against a member of 
> a constrained protocol extension of RandomAccessCollection:
> 
> extension RandomAccessCollection where Index : Strideable, Index.Stride == 
> IndexDistance, Indices == CountableRange<Index> {
>   public var indices: CountableRange<Index> {
>     return startIndex..<endIndex
>   }
> }
> 
> Associated type inference can determine here that Indices := 
> CountableRange<Index>, but only when Index: Strideable and Index.Stride == 
> IndexDistance. In other words, there are a whole bunch of other constraints 
> to verify before we can accept this inference for Indices.

So, what’s the problem? ;-)

> What’s a Good Solution Look Like?
> Our current system for associated type inference and associated type defaults 
> is buggy and complicated.

Well, that’s the problem, then.  Don’t worry, I won’t suggest that you simply 
fix the implementation, because even if there weren’t bugs and the system were 
predictable I’d still think we could improve the situation for users by making 
associated type default declarations more explicit.

> The compiler gets it right often enough that people depend on it, but I don’t 
> think anyone can reasonably be expected to puzzle out what’s going to happen, 
> and this area is rife with bugs. If we were to design a new solution from 
> scratch, what properties should it have?
> 
> It should allow the author of a protocol to provide reasonable defaults, so 
> the user doesn’t have to write them
> It shouldn’t require users to write typealiases for “obvious” cases, even 
> when they aren’t due to defaults
> It shouldn’t infer an inconsistent set of typealiases
> It should be something that a competent Swift programmer could reason about 
> when it will succeed, when and why it will fail, and what the resulting 
> inferred typealiases would be
> It should admit a reasonable implementation in the compiler that is 
> performant and robust
• It should cover all of the existing use cases.
• It should not break code at this point.
• We should have a migration strategy for existing code that avoids traps like 
silent semantic changes.

My bullet is important to me; I don’t think existing use cases are (inherently) 
so complex that we can sacrifice almost any of them and still end up with a 
sufficiently useful system.  At the very least, existing use cases provide the 
only guidance we really have as to what the feature should do.

I think we need to acknowledge that my second bullet is unattainable, at least 
if we want to improve type checking performance. Not breaking any code means 
that given any existing code, the compiler would have to explore the same 
solution space it currently does, and come up with the same answers.  Improving 
performance would require new  declarations to use totally optional explicit 
syntax to prevent some explorations, and that’s an untenable user experience.

Which brings me to my third bullet: unless we are willing to break the code of 
protocol users (as opposed to vendors) we need to ensure that vendors can 
confidently convert code to use the new system without changing semantics.
 
> 
> A Rough Proposal
> I’ve been thinking about this for a bit, and I think there are three ways in 
> which we should be able to infer an associated type witness:
> 
> Associated type defaults, which are specified with the associated type 
> itself, e.g.,
> 
>   associatedtype Indices = DefaultIndices<Self>
> 
> These are easy to reason about for both the programmer and the compiler.
> Typealiases in (possibly constrained) protocol extensions, e.g.,
> 
>   extension RandomAccessCollection where Index : Strideable, Index.Stride == 
> IndexDistance {
>     typealias RandomAccessCollection.Indices = CountableRange<Index>
>   }
> 
> I’m intentionally using some odd ‘.’ syntax here to indicate that this 
> typealias is intended only to be found when trying to satisfy an associated 
> type requirement, and is not a general typealias that could be found by 
> normal name lookup. Let’s set the syntax bike shed aside for the moment. The 
> primary advantage of this approach (vs. inferring Indices from “var Indices: 
> CountableRange<Index>” in a constrained protocol extension) is that there’s a 
> real typealias declaration that compiler and programmer alike can look at and 
> reason about based just on the name “Indices”. 
> 
> Note that this mechanism technically obviates the need for (1), in the same 
> sense that default implementations in protocols are merely syntactic sugar.
> Declarations within the nominal type declaration or extension that declares 
> conformance to the protocol in question. This is generally the same approach 
> as described in “associated type inference” above, where we match 
> requirements of the protocol against declarations that could satisfy those 
> requirements and infer associated types from there. However, I want to turn 
> it around: instead of starting with the requirements of the protocol any 
> looking basically anywhere in the type or any protocol to which it conforms 
> (for implementations in protocol extensions), start with the declarations 
> that the user explicitly wrote at the point of the conformance and look for 
> requirements they might satisfy. For example, consider our initial example:
> 
>   extension MyCollection: RandomAccessCollection {    
>     var startIndex: Int { return contents.startIndex }
>     var endIndex: Int { return contents.endIndex }
>     subscript(index: Int) -> T { return contents[index] }
>   }
> 
> Since startIndex, endIndex, and subscript(_:) are declared in the same 
> extension that declares conformance to RandomAccessIterator, we should look 
> for requirements with the same name as these properties and subscript within 
> RandomAccessCollection (or any protocol it inherits) and infer Index := Int 
> and Element := T by matching the type signatures. This is still the most 
> magical inference rule, because there is no declaration named “Index” or 
> “Element” to look at. However, it is much narrower in scope than the current 
> implementation, because it’s only going to reason from the (probably small) 
> set of declarations that the user wrote alongside the conformance, so it’s 
> more likely to be intentional. Note that this is again nudging programmers 
> toward the style of programming where one puts one protocol conformance per 
> extension, which is admittedly my personal preference.
> 
> Thoughts?

The thing that strikes me most about these is that the first two are explicit 
declarations of intent: “In the absences of an explicit declaration, deduce 
this associated type as follows,” while the third is still extremely indirect.  
While it hints to the compiler about which conformances’ associated type 
requirements we are trying to satisfy, it never comes out and says straight out 
what the associated type should be, even though it needs to be mentioned.  As a 
generic programmer, I don’t value the concision gained over the clarity lost, 
and I’d like to see the solutions to these problems follow the 
explicit-declaration-of-intent pattern.  However, the code in #3 is not written 
by the protocol vendor, and for me it is (at least currently) a stretch to 
think of breaking the code of protocol users, so I grudgingly accept it.  

If we were really starting from scratch I might suggest requiring that 
conformances use the associated type name rather than some concrete type, e.g. 

extension MyCollection: RandomAccessCollection {    
    typealias RandomAccessCollection.Index = Int
    var startIndex: Index { return contents.startIndex }
    var endIndex: Index { return contents.endIndex }
    subscript(index: Index) -> Element { return contents[index] }
  }

But I suspect we’re well past the point in the language’s evolution where that 
sort of change is possible.

As for migration of protocol user code, I think we’d need to run both the new 
and the old slow inference in the migrator and flag any differences.  I don’t 
know what to do about protocol vendors’ code though.

> I think this approach is more predictable and more implementable than the 
> current model. I’m curious whether the above makes sense to someone other 
> than me, and whether it covers existing use cases well enough. Thoughts?

Well, covering the use cases is definitely still a concern for me.  I don’t 
think we’ll know for sure until we try it, but have you thought about how to 
migrate each piece of code in the standard library?  Does it cover those cases?

-Dave



Sent from my iPad

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Re: [swift-evolution] [RFC] Associated type inference

Reply via email to