Hello Swift community,

Associated type inference, which is the process by which the Swift compiler 
attempts to infer typealiases to satisfy associated-type requirements based on 
other requirements, has caused both implementation problems and user confusion 
for a long time. Some of you might remember a previous (failed) attempt to 
remove this feature from the Swift language, in SE-0108 “Remove associated type 

I’m not sure we can remove this feature outright (i.e., the concerns that sank 
that proposal are still absolutely valid), because it is so very convenient and 
a lot of user code likely depends on it in some form or other. So, here I’d 
like to describe the uses of the feature, its current (very problematic) 
implementation approach, and a half-baked proposal to narrow the scope of 
associated type inference to something that I think is more tractable. I need 
help with the design of this feature, because I feel like it’s going to be a 
delicate balance between implementability and expressiveness.

A Motivating Example
As a motivating example, let’s consider a “minimal” random access collection:

struct MyCollection<T> {
    var contents: [T]

extension MyCollection: RandomAccessCollection {    
    var startIndex: Int { return contents.startIndex }
    var endIndex: Int { return contents.endIndex }
    subscript(index: Int) -> T { return contents[index] }

This is actually pretty awesome: by providing just two properties and a 
subscript, we get the full breadth of the random access collection API! This is 
relying heavily on associated type inference (for associated type requirements) 
and default implementations specified on protocol extensions. Without 
associated type inference, we would have had to write:

extension MyCollection: RandomAccessCollection {
    typealias Element = T
    typealias Index = Int
    typealias Indices = CountableRange<Int>
    typealias Iterator = IndexingIterator<MyCollection<T>>
    typealias SubSequence = RandomAccessSlice<MyCollection<T>>
    var startIndex: Int { return contents.startIndex }
    var endIndex: Int { return contents.endIndex }
    subscript(index: Int) -> T { return contents[index] }

where the bolded typealiases are currently inferred. It was worse back when we 
reviewed SE-0108, because IIRC there were a few underscored associated types 
(e.g., _Element) that have since been removed. Still, that’s a bit of 
additional work to define a “minimal” collection, and requires quite a bit more 
understanding: how do I know to choose IndexingIterator, and CountableRange, 
and RandomAccessSlice?

The issue would get worse with, e.g., SE-0174 “Change filter to return an 
associated type” 
 which adds an associated type Filtered that almost nobody will ever customize, 
and isn’t really fundamental to the way collections work. Adding Filtered to 
the standard library would be a source-breaking change, because users would 
have to write a typealias giving it its default.

Associated Type Defaults
One of the ways in which we avoid having to specify typealiases is to use 
associated type defaults. For example, the standard library contains associated 
type defaults for Indices, Iterator, and SubSequence. Let’s focus on Indices:

protocol Collection : Sequence {
  associatedtype Indices = DefaultIndices<Self>
  // ...

protocol BidirectionalCollection : Collection {
  associatedtype Indices = DefaultBidirectionalIndices<Self>
  // ...

protocol RandomAccessCollection : BidirectionalCollection {
  associatedtype Indices = DefaultRandomAccessIndices<Self>
  // ...

The basic idea here is that different protocols in the hierarchy provide 
different defaults, and you presumably want the default from the most specific 
protocol. If I define a type and make it conform to BidirectionalCollection, 
I’d expect to get DefaultBidirectionalIndices<Self> for Indices. If a define a 
type and make it conform to RandomAccessIterator, I’d expect to get 

(Aside: DefaultRandomAccessIndices and DefaultBidirectionalIndices got 
collapsed into DefaultIndices now that we have conditional conformances for the 
standard library <https://github.com/apple/swift/pull/12913>, but the issues 
I’m describing remain).

Associated type defaults seem like a reasonable feature that fits well enough 
into the design. However, it’s not the only thing in place with our 
MyCollection example, for which Indices was inferred to CountableRange. How’s 
that happen?

Associated Type Inference
Associated type inference attempts to look at the requirements of a protocol, 
and then looks into the conforming type for declarations that might satisfy 
those requirements, and infers associated types from the types of those 
declarations. Let’s look at some examples. RandomAccessCollection has some 
requirements that mention the Index and Element types:

protocol RandomAccessCollection : BidirectionalCollection {
  associatedtype Element
  associatedtype Index

  var startIndex: Index { get }
  var endIndex: Index { get }
  subscript (i: Index) -> Element { get }

Associated type inference wil try to satisfy those requirements for 
MyCollection, and will find these declarations:

extension MyCollection: RandomAccessCollection {    
    var startIndex: Int { return contents.startIndex }
    var endIndex: Int { return contents.endIndex }
    subscript(index: Int) -> T { return contents[index] }

and match them up. From startIndex, we infer Index := Int. From endIndex, we 
infer Index := Int. From subscript, we infer Index := Int and Element := T. 
Those results are all consistent, so we’ve properly inferred Index and Element. 

Associated type inference often has to deal with ambiguities. For example, 
there is an extension of Collection that provides a range subscript operator:

extension Collection {
  subscript (bounds: Range<Index>) -> Slice<Self> { … }

When we look at that and match it to the subscript requirement in 
RandomAccessCollection (remembering that argument labels aren’t significant for 
subscripts by default!), we infer Index := Range<Index> (admittedly weird) and 
Element := Slice<Self> (could be legitimate). We have to discard this candidate 
because the deduction from startIndex/endIndex (Index := Int) collides with 
this deduction (Index := Range<Index>).

In our initial example, we saw that Indices was inferred to 
CountableRange<Int>. Let’s look at another slice of the RandomAccessCollection 
protocol that’s relevant to this associated type:

protocol RandomAccessCollection : BidirectionalCollection {
  associatedtype Index
  associatedtype Indices: RandomAccessCollection where Indices.Element == Index

  var indices: Indices { get }

We will match that requirement for an “indices” property against a member of a 
constrained protocol extension of RandomAccessCollection:

extension RandomAccessCollection where Index : Strideable, Index.Stride == 
IndexDistance, Indices == CountableRange<Index> {
  public var indices: CountableRange<Index> {
    return startIndex..<endIndex

Associated type inference can determine here that Indices := 
CountableRange<Index>, but only when Index: Strideable and Index.Stride == 
IndexDistance. In other words, there are a whole bunch of other constraints to 
verify before we can accept this inference for Indices.

What’s a Good Solution Look Like?
Our current system for associated type inference and associated type defaults 
is buggy and complicated. The compiler gets it right often enough that people 
depend on it, but I don’t think anyone can reasonably be expected to puzzle out 
what’s going to happen, and this area is rife with bugs. If we were to design a 
new solution from scratch, what properties should it have?

It should allow the author of a protocol to provide reasonable defaults, so the 
user doesn’t have to write them
It shouldn’t require users to write typealiases for “obvious” cases, even when 
they aren’t due to defaults
It shouldn’t infer an inconsistent set of typealiases
It should be something that a competent Swift programmer could reason about 
when it will succeed, when and why it will fail, and what the resulting 
inferred typealiases would be
It should admit a reasonable implementation in the compiler that is performant 
and robust

A Rough Proposal
I’ve been thinking about this for a bit, and I think there are three ways in 
which we should be able to infer an associated type witness:

Associated type defaults, which are specified with the associated type itself, 

  associatedtype Indices = DefaultIndices<Self>

These are easy to reason about for both the programmer and the compiler.
Typealiases in (possibly constrained) protocol extensions, e.g.,

  extension RandomAccessCollection where Index : Strideable, Index.Stride == 
IndexDistance {
    typealias RandomAccessCollection.Indices = CountableRange<Index>

I’m intentionally using some odd ‘.’ syntax here to indicate that this 
typealias is intended only to be found when trying to satisfy an associated 
type requirement, and is not a general typealias that could be found by normal 
name lookup. Let’s set the syntax bike shed aside for the moment. The primary 
advantage of this approach (vs. inferring Indices from “var Indices: 
CountableRange<Index>” in a constrained protocol extension) is that there’s a 
real typealias declaration that compiler and programmer alike can look at and 
reason about based just on the name “Indices”. 

Note that this mechanism technically obviates the need for (1), in the same 
sense that default implementations in protocols 
 are merely syntactic sugar.
Declarations within the nominal type declaration or extension that declares 
conformance to the protocol in question. This is generally the same approach as 
described in “associated type inference” above, where we match requirements of 
the protocol against declarations that could satisfy those requirements and 
infer associated types from there. However, I want to turn it around: instead 
of starting with the requirements of the protocol any looking basically 
anywhere in the type or any protocol to which it conforms (for implementations 
in protocol extensions), start with the declarations that the user explicitly 
wrote at the point of the conformance and look for requirements they might 
satisfy. For example, consider our initial example:

  extension MyCollection: RandomAccessCollection {    
    var startIndex: Int { return contents.startIndex }
    var endIndex: Int { return contents.endIndex }
    subscript(index: Int) -> T { return contents[index] }

Since startIndex, endIndex, and subscript(_:) are declared in the same 
extension that declares conformance to RandomAccessIterator, we should look for 
requirements with the same name as these properties and subscript within 
RandomAccessCollection (or any protocol it inherits) and infer Index := Int and 
Element := T by matching the type signatures. This is still the most magical 
inference rule, because there is no declaration named “Index” or “Element” to 
look at. However, it is much narrower in scope than the current implementation, 
because it’s only going to reason from the (probably small) set of declarations 
that the user wrote alongside the conformance, so it’s more likely to be 
intentional. Note that this is again nudging programmers toward the style of 
programming where one puts one protocol conformance per extension, which is 
admittedly my personal preference.

I think this approach is more predictable and more implementable than the 
current model. I’m curious whether the above makes sense to someone other than 
me, and whether it covers existing use cases well enough. Thoughts?

        - Doug

swift-evolution mailing list

Reply via email to