Re: [swift-evolution] Strings in Swift 4

Jonathan Hull via swift-evolution Sun, 05 Feb 2017 02:22:55 -0800

>>>> Just out of curiosity, what are the use-cases for an infinite sequence
>>>> (as opposed to a sequence which is bounded to the type’s representable
>>>> values)?
>>> 
>>> 1. The type may not have an inherent expressible bound (see BigInt,
>>>  UnsafePointer, and *many* real-life Index types).
>> 
>> If I understand what you are saying, this is why I was arguing that
>> these partial ranges should not, by themselves, conform to Sequence.
>> In cases where you do have a natural expressible bound, then it should
>> conditionally conform to sequence.  
> 
> I'm sorry, I don't understand that last sentence.  The only way to
> conditionally conform in one particular case is to make the conformance
> condition include detection of that case. I could guess at what you mean
> but I'd rather you explain yourself.  Specifically, which partial ranges
> should conform to Sequence, and which shouldn't, in your view?


Oh, sorry, I was referring to an earlier post I made in this thread, but didn’t 
make that clear.  I’ll give a longer explanation of my reasoning behind the 
statement that partial ranges shouldn’t conform to Sequence (on their own) 
below... but the short version is that the conformance, like cake, is a lie... 
and we are using the trap to lie to ourselves about it.


>> In other cases, you should have to write that conformance yourself if
>> you want it.
> 
> You've asserted that things should be a certain way (which I don't fully
> understand but I hope you'll explain), but you haven't said anything to
> clarify *why* you think they should be that way.

I believe that the ill-defineness/fuzziness of this issue that others have 
eluded to and the desire to trap is telling us that we are fighting the 
underlying structure of the types.  I get that trapping  can save us from an 
entire class of nasty bugs, but I also believe that they should be a last 
resort in design because they have a tendency to let us hide subtle design 
mistakes from ourselves.

If we continue down this path, we will find more and more edge cases that need 
to be papered over, until the design eventually needs to be scrapped or 
buttressed with compiler magic.  For example, let’s say we define a 
PartialRange as a simple struct with optional endpoints, then we conform it to 
Sequence.  Now we can say ’n…’ and get a sequence.  But what happens when we 
say ‘…n’?  Where does that sequence start if it has an implicit start point of 
negative infinity?  Well, we promised a Sequence, so our only real option is to 
trap (for some types anyway).  For UInt, ‘…n’ is a perfectly valid sequence 
(though we could trap on that as well for consistency sake and say that you 
have to say '0…n’).  We are using trapping to let us ignore the fact that we 
are promising something (conformance to Sequence) that isn’t really true...

In this case, I believe we are trying to shoehorn a number of different fuzzily 
defined requirements that almost, but don’t quite fit together:

1) RangeExpressions should work where Ranges work now
2) We want to be able to slice a collection using ’n…’ and ‘…n’, and the 
implicit boundary should be filled in by the collection
3) We want to be able to create a sequence from ’n…’ that can be zipped and 
iterated over

Looking at our requirements above, let’s start with #1.  Ranges (which are 
countable) conform to Collection.  We keep saying we need these to conform to 
Sequence (because of requirement 3), but I think what we really want/mean in a 
lot of cases is conformance to Collection.  The problem is that we don’t quite 
have enough information available to do that, so PartialRanges cannot be used 
in generic functions on Collection.  This is exposing a weakness of our 
Sequence/Collection protocols around infinite Sequences/Collections that we 
talked about a few months back, but ran out of time to solve (and I didn’t have 
bandwidth at the time to work on it).

As others have pointed out, 2 & 3 are similar, but have slightly different 
semantics.  In the case of 2, we have an implicit boundary being filled in: 
’n…’ means a range from ’n’ up to that boundary. We have explicitly stated we 
don’t want it to trap here, because that would make it useless.  In the case of 
3, we are giving it a different meaning: ’n…’ means starting with n, count 
higher until you run out of numbers… then, crash.  Here you are explicitly 
asking for trapping behavior.  These appear very similar at first glance, but 
need to be thought about subtly differently. For example, we have to be 
extremely careful using #3 with functional map/filter/reduce, and things like 
‘count’ would instantly crash (how do you count an infinity?).  We need to be 
really careful with #3 in general, because it breaks so many of the implicit 
guarantees around Collection (thus fulfilling #3, stops us from fulfilling #1). 
As I showed above, it also breaks the guarantees for Sequence when the starting 
bound is missing.


What I am suggesting is a small shift in the design, which lets us meet all 3 
of the above design goals by aligning them to have the same semantics (those of 
requirement 2).

That is, partial ranges are just partially defined ranges, and they don’t give 
us a sequence or collection by themselves.  In fact, they don’t even work as a 
Range by themselves.  To work as a RangeExpression, Sequence, or Collection, we 
need to be able to provide that implicit boundary and make it explicit.  In the 
case of slicing collections, the collection provides that implicit boundary.  
In the case of types, we would have a protocol that defines the 
implicit/natural boundaries for that type. Then we can conditionally conform to 
Collection based on that protocol. Types which are unable to conform to the 
protocol (like BigInt) would not gain the conditional conformance to 
Collection/Sequence, and would have to do it themselves, if desired.

For clarity, in my mind, PartialRange is a struct which has optional bounds and 
conforms to the RangeExpression protocol. RangeExpression is a protocol which 
defines a function that returns an actual Range given a set of concrete bounds 
to replace any missing ones.  In the case of slicing a collection, the 
collection is calling that function with its own bounds to get the resulting 
range. In the case of ‘for i in 0…’, the protocol defining the implicit/natural 
bounds gives us the bounds for that function, and the resulting range is used.  
Ranges themselves conform to RangeExpression by just returning self.

Now, things like ‘…n’ work just fine.  We could even have ‘…’ mean “the whole 
thing” if we decide we want that.  We can conform to collection in most cases.

For things like BigInt, conformance to Sequence would have to be provided 
explicitly (if desired)… but I view that as a good thing.  When you go to add 
that conformance, you again run into issues like: what does ‘…n’ do?  There are 
different answers/hacks (maybe you trap, maybe you just start at zero or 
something), each with tradeoffs. It is good that those tradeoffs are limited to 
the edge cases, as opposed to spread across all RangeExpressions (breaking the 
ability to conform to collection, for example).  My preference would be not to 
provide Sequence conformance for PartialRanges of BigInt at all, forcing the 
user to think about and provide explicit bounds.



>>> 2. I keep repeating variants of this example:
>>> 
>>> func listElements<
>>>   S: Sequence, N: Number
>>>> (of s: S, numberedFrom start: N) {
>>>   for (n, e) in zip(start..., s) {
>>>     print("\(n). \(e)")
>>>   }
>>> }
>>> 
>>> which avoids incorrect behavior when N turns out to be a type that
>>> can't represent values high enough to list everything in s—**if and
>>> only if** `start...` is an unbounded range, rather than one that
>>> implicitly gets its upper bound from its type.
>> 
>> I really worry about the possibility of long-term foot-gunning in this
>> case.  I showed this to a friend who maintains old/abandoned codebases
>> for a living, and he said “Oh god, that isn’t going to actually crash
>> until 5 years in, when it gets passed some rare type of file… and by
>> then the original programmer will have left.”  
> 
> Then either he's using the wrong language, or he needs to come to grips
> with the realities of software that's both reliable and efficient.
> Swift explicitly acknowledges that in efficient code there are inherent
> representational limitations that force a choice between stopping the
> program and continuing with incorrect behavior.  Swift explicitly
> chooses to stop the program in these cases.  Practically speaking, every
> program written in Swift is full of checks for these conditions, and
> this is nothing new.

Haha, he makes a bunch of money because of other people’s poor choice of 
programming languages/frameworks.  But as a result, he has seen a lot of old 
code and he has a good sense of what will break years down the line, and what 
is difficult to refactor...

As I said above, my main worry is that we are fooling ourselves by providing 
the trap, and actually creating more likelihood of (subtle) issues/bugs in the 
long run.

>> He hit on exactly what I had been feeling as well (but hadn’t been
>> able to put into words until I heard his).  The thought is that this
>> will help the programmer find an error by trapping, 
> 
> It will also prevent a program from wandering off into
> unpredictable/broken behavior in the hands of a user.  If your software
> keeps running but writes a corrupted file, that's usually much worse
> than a hard stop.

You forget that my version also has mistake prevention built in… at compile 
time, instead of runtime.    


>> but because it is dependent on the interactions of two subsystems, it
>> will often create one of those crashes where the stars have to align
>> for it to happen (which are my least favorite type of bug/crash to
>> debug).
> 
> You could look at it that way, but even in a programming language such
> as Python, whose integers are effectively all BigInts, a similar
> “stars-have-to-align” scenario occurs when you run out of memory.  On
> modern opeerating systems that effectively means your whole system
> grinds to a halt. It is no better than a crash, and often leads to a
> crash (sorry, you have no stack space left with which to make this
> function call!)  These things are a fact of life.

True, but that doesn’t mean we should add more stars-have-to-align cases.


>> I think this example also shows how my suggestion of requiring an
>> extra protocol for conformance to Sequence would actually be much more
>> effective in preventing the programmer error…
>> 
>> You would not have been able to write the function the way you did,
>> because writing ‘start…’ as a sequence would have required the
>> addition of ‘where N:FiniteComparable’ (or whatever we call the
>> protocol providing natural bounds).  In having to write that N needs
>> bounds, you are much more likely to be thinking of what those bounds
>> might be.  
> 
> I explicitly *don't want* N to need to have expressible bounds.  When we
> get a BigInt type, it will be especially effective in these scenarios.

I still maintain that the issue here is with the design of the function (hiding 
the importance of the type’s bounds) combined with the problems around infinite 
sequences. See my thoughts on BigInt sequences above…

It isn’t as pretty, but this version shouldn’t have the landmine (caveat: 
written in Mail):

        func listElements<S: Sequence, N:Number>(of s:S, numberedFrom start:N) {
                var bigN = BigInt(start)
                for item in s {
                        print(“\(bigN). \(item)”
                        bigN += 1
                }
        }

I suspect that running into the issue where you had to add ’N:FiniteComparable’ 
and then not wanting to specify the bounds would have led you to write a less 
pretty, but more robust version like the one above.

Thanks,
Jon

_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution

Re: [swift-evolution] Strings in Swift 4

Reply via email to