Trying my hardest to summarize the talking points of the conversation so far.
Please let me know if I’ve missed any particular points:
- The thread was started (by yours truly) as an offshoot of another on-list
conversation involving Dave Abrahams. In it, I voiced a concern that there were
not enough legitimate needs for a single pass sequence as a core type
(supporting concepts such as for..in) to warrant the possibility of single-pass
behavior in Sequence when writing dependent algorithms
- There was a concern about the possible mutability of the methods
causing complexities around standard naming, in particular the recurring effort
to rename map/filter/reduce ‘terms of art’.
- For example in the case of filter vs filtered, the immutably-named
version might still destroy the ability to access the original data
- Dave Abrahams’s primary concern about the existing system is that the
restriction that Sequence is single-pass is not intuitive, and that generalized
functions over sequences would be tested with the commonly-available Sequences,
all of which are multi-pass.
- Considering single vs multi-pass can affect the algorithmic
complexity and memory requirements of their code, this could be a considerably
difficult bug to solve later on.
- There is also a desire to simplify things by reducing the # of types
if at all possible, and likewise a distaste for increasing the number of types
or overall complexity
- One example here would be that if single-pass sequences are rare, it
may be possible to drop Sequence completely and just have Collection
- Single-pass sequences seem to be very rare and centered around I/O. One
interesting example given was a database row cursor.
- A concern was voiced (by yours truly) that Sequences, unlike Collections, can
today be:
- Single pass/destructive
- Infinite
- Non-indexed
and that the elimination of single pass behavior would not eliminate the other
differentiating factors
It was countered that
- Infinite sequences may be representable with Collection
- It is possible to have the state in a programmatic generator be
encoded in or as the Index in a collection, so even programmatic sequences can
support indexes
- A single-pass sequence is likely best represented by not having implement
Sequence, but rather having an Iterator implementation directly.
- Such an Iterator should be a reference type rather than representing
it as having value semantics - if you actually support value semantics, then
your state should be self-contained and thus you can support multi-pass.
- IteratorProtocol might gain several of the sequence methods, such as
dropFirst() and prefix().
- Scala’s name for a single-pass sequence is TraversableOnce, and such
descriptive naming would be useful in developer understanding on how to use
single-pass sequences
- There was concern that the current model might be appropriate, and
that the changes may be underestimating developers’ ability to deal with the
complexity of single and multi pass sequences represented by the same type.
- It is a possibility that for..in would work with two distinct protocols (such
as IteratorProtocol and Sequence) , rather than having single pass iteration be
differentiated by where you are in an inheritance chain
-There was also a concern that if we have a single root type that is
for..in-able, we would [just be shuffling things around], e.g. calling Sequence
multi-pass, and creating a new SinglePassSequence above it.
- While single-vs-multipass is a distinction which needs to be made, there is
some disagreement over whether it makes sense to differentiate infinite vs
finite sequences. A collection with UInt64.max elements is finite but is still
effectively infinite in terms of computation time. E.g., algorithms might fail
due to memory consumption or performance for a 2 billion entry collection the
same as they would fail for an infinite one.
- There is a desire if possible to eliminate Sequence if possible, and just
have Collection. However, Collection has the requirement of returning a finite
count and is Indexable.
- Indexable requires a Index which can be used to look up a collection
value as an O(1) operation
- Indexable is comparable/equatable, although there is a possibility
that Comparable could be dropped to simplify implementation of Collections
- Indexable requires a end index, which for an infinite Collection
would be a synthetic value
- Indexable incudes a distance between two indexes, thus having similar
problems to Collection.count
- Around infinite sequences, there has been some discussion on whether it is
worth differentiating them via different types to provide developer warning
that they may be working with an infinite loop. There was discussion of a
for..in..until syntax for this (I presume because for..in..while would be
harder grammar-wise?). The justification is that requiring until requires the
developer to reason about the infinite sequence.
- For computed sequences, it would be possible to use an iterator directly as
the index, as the index is not required to be an Integer type to meet
Collection requirements.
- There was a discussion of having a Iterator and FiniteIterator. It was also
mentioned that Iterator could be func next() ->T while FiniteIterator is func
next() -> T? . There was also discussion that there might be a
PossiblyInfiniteIterator with Finite and Infinite iterators as subclasses.
- There was discussion about forcing value-semantic iterators to be
Collections. This could be done by allowing for infinite Collections, using
something similar to a value-semantic iterator as the Index, while reducing the
requirements for Index to support such usage (e.g. drop Comparable)
- There was a strawman to move eager operations (map, etc) and count to a
FiniteCollection sub-protocol of Collection. (note: it was not clarified what
other protocols Collection maintains, like Indexable, where Indexable still has
requirements for functions like distance() )
- "AFAIK there is no interesting multipass Sequence that cannot reasonably be
made to support indexing.” -Dave Abrahams
- Discussion steered toward using sequence(first:next:) and
sequence(state:next:) to create collections, and using this as a replacement
for iterator/sequence usage
- If Iterator and Sequence are no longer part of multi-pass sequences, one
could instead support for..in using formIndex and subscripting.
- This would support subscript setters, and thus mutation of values in
a for..in
- Dave Abrahams indicated a desire to support “partially formed” multipass
sequences which cannot meet indexable requirements easy via the existing
iterator system. An example given was collections imported from Foundation,
such as NSSet.
- Thus it would be likely that Collection would retain makeIterator.
Thus iterators themselves would not give a destructive vs nondestructive
guarantee.
- Three possible approaches were given for differentiating finite and infinite
single-pass sequences, to protect against generalized algorithms that would go
into an infinite loop when called:
1. Separate protocols (provide count only on FiniteIterator)
2. Implicit additional requirements from iterator source documentation
(don’t call count on the /dev/urandom-backed iterator)
3. Make all provided iterator methods lazy (instead of count, provide
underestimateCount or possibly have count return an optional)
-DW
_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution