[swift-evolution] [Pitch] Remove destructive consumption from Sequence

David Waite via swift-evolution Wed, 22 Jun 2016 11:38:27 -0700

Today, a Sequence differs from a Collection in that:

- A sequence can be infinitely or indefinitely sized, or could require an O(n) 
operation to count the values in the sequence. A collection has a finite number 
of elements, and the fixed size is exposed as an O(1) or O(n) operation via 
‘count’

- A collection is indexable, with those indices being usable for various
operations including forming subsets, comparisons, and manual iteration

- A sequence may or may not be destructive, where a destructive sequence
consumes elements during traversal, making them unavailable on subsequent
traversals. Collection operations are required to be non-destructive
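
To make the third point concrete, here is a minimal sketch in Swift 3 or later
syntax, where Generator has been renamed IteratorProtocol. Countdown is a
hypothetical type invented for illustration, not anything from the standard
library. It is a single-pass sequence because it hands itself out as its own
iterator, so every traversal drains the same state:

```swift
// Hypothetical single-pass sequence: it acts as its own iterator, so all
// traversals share (and drain) the same mutable state.
final class Countdown: Sequence, IteratorProtocol {
    private var remaining: Int

    init(from start: Int) {
        remaining = start
    }

    // Hands out each value exactly once.
    func next() -> Int? {
        guard remaining > 0 else { return nil }
        defer { remaining -= 1 }
        return remaining
    }
}

// A collection can be traversed any number of times.
let array = [3, 2, 1]
let firstArrayPass = Array(array)
let secondArrayPass = Array(array)

// The destructive sequence yields its elements only on the first traversal.
let countdown = Countdown(from: 3)
let firstPass = Array(countdown)    // [3, 2, 1]
let secondPass = Array(countdown)   // []
```

Nothing in the type of countdown tells the caller that the second traversal
will come back empty, which is the confusion this pitch is concerned with.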

I would like to pitch removing this third differentiation, the option for
destructive sequences.

My main motivation for proposing this is the potential for developer confusion.
As stated during one of the previous threads on the naming of map, flatMap,
filter, etc. methods on Sequence, Sequence has a naming requirement not typical
of the rest of the Swift standard library in that many methods on Sequence may
or may not be destructive. As such, naming methods for any extensions on
Sequence is challenging as the names need to not imply immutability.

It would still be possible to have Generators which operate destructively, but
such Generators would not conform to the needs of Sequence. As such, the most
significant impact would be the inability to use such Generators in a for..in
loop, although one could make the case for a lower-level Iterable-style
interface for requesting destructive-or-nondestructive generators for the
purpose of generic algorithms which do not care whether the data given is
consumed as part of them doing their work. I do not make this case here, as
instead I plan to make the case that destructive generators would be a rare
beast.
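
As a rough sketch of what that would look like in practice, a destructive
generator that no longer qualifies as a Sequence could still be driven by hand
with a while loop. EventQueue and DrainingIterator below are hypothetical names
made up for this example, written against the Swift 3 and later
IteratorProtocol (the renamed Generator):

```swift
// Hypothetical shared queue; reading from it removes elements.
final class EventQueue {
    private var pending: [String]

    init(_ events: [String]) {
        pending = events
    }

    var isEmpty: Bool {
        return pending.isEmpty
    }

    func popFirst() -> String? {
        guard !pending.isEmpty else { return nil }
        return pending.removeFirst()
    }
}

// A destructive generator: it conforms to IteratorProtocol only, not to
// Sequence, so it cannot appear in a for..in loop.
struct DrainingIterator: IteratorProtocol {
    let queue: EventQueue

    mutating func next() -> String? {
        return queue.popFirst()
    }
}

let queue = EventQueue(["tap", "swipe", "pinch"])
var iterator = DrainingIterator(queue: queue)
var handled: [String] = []

// Manual iteration takes the place of for..in.
while let event = iterator.next() {
    handled.append(event)
}
```

The explicit call to next() makes the consumption visible at the call site,
at the cost of losing for..in and the Sequence algorithms.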

From the Swift project documentation at
https://github.com/apple/swift/blob/d95921e5a838d7cc450f6fbc2072bd1a5be95e24/docs/SequencesAndCollections.rst#sequences

“Because this construct is generic, s could be

• an array
• a set
• a linked list
• a series of UI events
• a file on disk
• a stream of incoming network packets
• an infinite series of random numbers
• a user-defined data structure
• etc.”

The disruption of such a change on this list of examples would likely be:

- The series of UI events from a UI framework likely indicates a queue, where
iterating over a generator would consume items from the head of that queue.

However, this is not typically how events are handled in such systems.
Instead, events are often delivered via an event loop dispatch, registered
calls made by a thread pool, or a reactive mechanism. A sequence of incoming
UI events would likely block while waiting for the next event, so a signaling
method for new events would still be needed at the queue level. Given that UI
events are usually already serialized onto a single thread, a queue at the
application level adds complexity on top of the event queue that is already
being used at the runloop level or kernel level.

- A file on disk would likely be iterated as a series of UInt8 values,
Characters, String lines, or other pattern matches. If built at a low enough
level, such as on top of NSInputStream, the same sequence could also represent
reading a file from a URL.

In this case there are three example behaviors I’d like to call attention to:
1. The developer wants to operate on the file as a set of data, in which case
one would expect a Data value, String, or [String] to be created to represent
the scenarios above.
2. The developer wants to parse a structured format, in which case such
iteration behaviors would likely be insufficient.
3. The developer wants to iterate on the input data as a stream, without
waiting for the data to fully arrive or retaining it in memory. I suspect there
is less overlap with this sort of developer and the developer who wants a
framework-provided String-per-line iteration.

- Streams of incoming network packets build on the two previous points:
Like UI events, a stream of incoming network packets may be better suited to an
event dispatch or reactive mechanism. Like file access, a stream of incoming
network packets is likely to require processing beyond what is easily
obtainable with a for..in loop.

Likewise, it is possible for data to be segmented across network packets at the
application level, so a for..in loop may have to carry the contents of previous
packets from one iteration to the next in order to process a single framed
network message. It is also more likely that a network connectivity issue would
disrupt the packets, requiring either additional error recovery processes to be
defined around such a for..in loop, or an interface similar to reactive
observables where stream close and errors can be represented as part of the
iterated state.
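
To illustrate the segmentation problem, here is a small sketch. The wire format
is an assumption made up for the example (one length byte followed by that many
payload bytes), and MessageFramer is a hypothetical helper, not part of any
framework. The point is that the leftover bytes of one packet have to survive
until the next packet arrives:

```swift
// Hypothetical framing: one length byte, then that many payload bytes.
struct MessageFramer {
    // Bytes left over from earlier packets that do not yet form a message.
    private var buffer: [UInt8] = []

    // Feeds one packet in and returns every message it completes.
    mutating func receive(_ packet: [UInt8]) -> [[UInt8]] {
        buffer.append(contentsOf: packet)
        var completed: [[UInt8]] = []
        while let length = buffer.first, buffer.count >= 1 + Int(length) {
            let end = 1 + Int(length)
            completed.append(Array(buffer[1..<end]))
            buffer.removeFirst(end)
        }
        return completed
    }
}

var framer = MessageFramer()

// Two messages, [10, 20] and [30], split awkwardly across three packets.
let packets: [[UInt8]] = [[2, 10], [20, 1], [30]]
var messages: [[UInt8]] = []

for packet in packets {
    // State from previous iterations lives in framer, outside the loop body.
    messages += framer.receive(packet)
}
// messages is now [[10, 20], [30]]
```

The loop body alone cannot process a message; the state that spans iterations
has to live somewhere outside it, and this sketch has no place to report a
closed or failed connection.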

- It is unlikely that a for..in loop would be over a random number sequence.

However, if you did want to use random number sequences in a for..in loop, a
‘random’ number sequence is reproducible if it has the same seed.

Non-repeating behavior does not require consuming bytes except where the
‘random’ sequence needs to be reproducible as a whole, such as gaming and
cryptography applications. New ‘random’ data can be had by simply iterating a
new random number generator with a new random seed.
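
A minimal sketch of such a seeded, non-destructive sequence follows.
SeededRandomSequence is a hypothetical type, and the linear congruential
constants are only illustrative; nothing here is suitable for cryptography.
Each call to makeIterator() restarts from the stored seed, so repeated
traversals produce the same values:

```swift
// Hypothetical seeded pseudo-random sequence. Infinite, but non-destructive.
struct SeededRandomSequence: Sequence {
    let seed: UInt64

    struct Iterator: IteratorProtocol {
        var state: UInt64

        mutating func next() -> UInt64? {
            // Illustrative LCG step; &* and &+ wrap instead of trapping on
            // overflow.
            state = state &* 6364136223846793005 &+ 1442695040888963407
            return state >> 33
        }
    }

    // Every traversal starts over from the seed; nothing is consumed.
    func makeIterator() -> Iterator {
        return Iterator(state: seed)
    }
}

let random = SeededRandomSequence(seed: 42)
let firstRun = Array(random.prefix(5))
let secondRun = Array(random.prefix(5))   // same five values as firstRun

// New 'random' data comes from a new sequence with a different seed.
let otherRun = Array(SeededRandomSequence(seed: 43).prefix(5))
```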

However, iterating over an external random source like /dev/random would no
longer be allowed via a for..in loop, because multiple iterations would yield
different data.

-DW
