Re: [swift-evolution] Strings in Swift 4

2017-02-27 Thread Ted F.A. van Gaalen via swift-evolution
Hi David W.,
please see my inline responses.
> On 25 Feb 2017, at 07:26, David Waite wrote:
> 
> Ted, 
> 
> It might have helped if instead of being called String and Character, they 
> were named Text and ExtendedGraphemeCluster. 
IMHO, “Text” maybe, but in computer programming “String” is more appropriate, 
I think. See:
https://en.wikipedia.org/wiki/String_(computer_science)

Also IMHO, “Character” is OK (though maybe “Symbol” would be better), because 
when working with text/strings in an application it mostly does not matter how 
the Character is encoded (Unicode, ASCII, whatever). This is OOP: please hide 
the details, thank you.

However, if I need to work with a character’s components directly, e.g. when I 
want to influence the display of the underlying graphical aspects, I always 
have access to the Character’s properties and methods: Unicode code points, 
ASCII bytes, whatever it contains.
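
As a small illustration of that point, here is a sketch in current Swift (not the Swift 3 used in this thread). A Character compares by what the user perceives, while its code points and bytes stay reachable when needed. The variable names are only illustrative.

```swift
// One user-perceived character, built two different ways.
let composed: Character = "\u{E9}"        // é as a single code point
let decomposed: Character = "e\u{301}"    // e followed by a combining acute accent

// Character equality uses canonical equivalence, so the encoding stays hidden.
let sameCharacter = (composed == decomposed)

// The components are still reachable through String's views.
let scalarValues = String(decomposed).unicodeScalars.map { $0.value }
let utf8Bytes = Array(String(decomposed).utf8)

print(sameCharacter)   // true
print(scalarValues)    // [101, 769]
print(utf8Bytes)       // [101, 204, 129]
```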
   
> 
> They don’t really have the same behavior or functionality as 
> string/characters in many other languages, especially older languages. This 
> is because in many languages, strings are not just text but also 
> random-accesss (possibly binary) data.
Could be, but that’s not my perception of a string. 
> 
> Thats not to say that there aren’t a ton of algorithms where you can use Text 
> like a String, treat ExtendedGraphemeCluster like a character, and get 
> unicode behavior without thinking about it.
> 
> But when it comes to random access and/or byte modification, you are better 
> off working with something closer to a traditional (byte) string interface.
> 
> Trying to wedge random access and byte modification into the Swift String 
> will simply complicate everything, slow down the algorithms which don’t need 
> it, eat up more memory, as well as slow down bridging between Swift and 
> Objective C code.
Yes, this has been extensively discussed in this thread...
> 
> Hence me suggesting earlier working with Data, [UInt8], or [Character] within 
> the context of your manipulation code, then converting to a Swift String at 
> the end. Convert to the data format you need, then convert back.
That’s exactly what I did, except that I want to work exclusively with discrete 
elements (in the sense of humanly visible, discrete elements on a graphical 
medium) ... 

For the sake of completeness, here is my complete Swift 3.x playground example, 
which may be useful for others too:
//: Playground - noun: a place with Character!

import UIKit
import Foundation

struct TGString: CustomStringConvertible
{
var ar = [Character]()

var description: String // used by "print" and "\(...)"
{
return String(ar)
}

// Construct from a String
init(_ str : String)
{
ar = Array(str.characters)
}
// Construct from a Character array
init(_ tgs : [Character])
{
ar = tgs
}
// Construct from anything. well sort of..
init(_ whatever : Any)
{
ar = Array("\(whatever)".characters)
}

var $: String
{
get   // return as a normal Swift String
{
return String(ar)
}
set (str) //Mutable: set from a Swift String
{
ar = Array(str.characters)
}
}

var asString: String
{
get   // return as a normal Swift String
{
return String(ar)
}
set (str) //Mutable: set from a Swift String
{
ar = Array(str.characters)
}
}

// Return the count of total number of characters:
var count: Int
{
get
{
return ar.count
}
}

// Return empty status:

var isEmpty: Bool
{
get
{
return ar.isEmpty
}
}

// s[n]
subscript (n: Int) -> TGString
{
get
{
return TGString( [ar[n]] )
}
set(newValue)
{
if newValue.isEmpty
{
ar.remove(at: n) // remove element when empty
}
else
{
ar[n] =  newValue.ar[0]
if newValue.count > 1
{
insert(at: n, string: newValue[1..<newValue.count]) // upper bound assumed; lost in the archive
}
}
}
}

subscript (r: Range<Int>) -> TGString
{
get
{
return TGString( Array(ar[r]) )
}
set(newValue)
{
ar[r] = ArraySlice(newValue.ar)
}
}

subscript (r: ClosedRange<Int>) -> TGString
{
get
{
return TGString( Array(ar[r]) )
}
set(newValue)
{
ar[r] = ArraySlice(newValue.ar)
}
}
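
The left/mid/right helpers used in the TGString examples elsewhere in this thread are not part of the listing above. A minimal sketch of how they could look, written as a hypothetical extension on [Character] in current Swift rather than as TGString's actual methods (the semantics are inferred from the example output quoted in the thread):

```swift
// Hypothetical BASIC-style helpers on a Character array (not the original TGString code).
extension Array where Element == Character {
    // First n characters, like left$.
    func left(_ n: Int) -> String {
        return String(prefix(n))
    }

    // Last n characters, like right$.
    func right(_ n: Int) -> String {
        return String(suffix(n))
    }

    // Up to `length` characters starting at offset `start`, or the rest if length is nil.
    func mid(_ start: Int, _ length: Int? = nil) -> String {
        let tail = dropFirst(start)
        if let length = length {
            return String(tail.prefix(length))
        }
        return String(tail)
    }
}

let letters = Array("ABCDEFGHIJKLMNOPQRSTUVWXYZ")   // [Character]
print(letters.left(5))      // ABCDE
print(letters.mid(5, 10))   // FGHIJKLMNO
print(letters.right(5))     // VWXYZ
```

Because the array is indexed by Int, each helper is a plain slice; the cost of building the array is paid once, when the String is converted.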


Re: [swift-evolution] Strings in Swift 4

2017-02-25 Thread David Waite via swift-evolution

> On Feb 25, 2017, at 2:54 PM, Michael Ilseman wrote:
> 
> 
>> On Feb 25, 2017, at 3:26 PM, David Waite via swift-evolution wrote:
>> 
>> Ted, 
>> 
>> It might have helped if instead of being called String and Character, they 
>> were named Text
> 
> I would oppose taking a good name like “Text” and using it for Strings which 
> are mostly for machine processing purposes, but can be human-presentable with 
> explicit locale. A name like Text would a better fit for Strings bundled with 
> locale etc. for the purpose of presentation to humans, which must always be 
> in the context of some locale (even if a “default” system locale). Refer to 
> the sections in the String manifesto[1][2]. Such a Text type is definitely 
> out-of-scope for current discussion.
> 
Oh, I would never propose such a naming change, because I am comfortable with 
the existing names. I’m just acknowledging that the history of string 
manipulation causes friction in developers coming from other languages, in that 
they may expect certain functionality which doesn’t make sense within String’s 
goals.

I was merely illustrating that there is a big difference between how strings 
work in traditional languages and how truly Unicode-safe strings work. In 
scripting languages like Ruby and Python, string bears the brunt of binary data 
handling. Even in languages like Java and C#, Unicode support makes compromises 
that Swift seems unwilling to make.

IMO, that Swift String doesn’t have random access capabilities is not a 
deficiency in Swift, but can cause misunderstandings of how Swift strings 
differ from other languages.
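
To make the difference concrete, here is a short sketch in current Swift of what the absence of integer random access looks like in practice. The sample string and names are only illustrative.

```swift
let word = "na\u{EF}ve caf\u{E9}"   // "naïve café"

// word[6] does not compile: String is indexed by String.Index, not Int.
// Reaching the n-th Character means walking from a known index, which is O(n).
let sixth = word.index(word.startIndex, offsetBy: 6)
print(word[sixth])                  // c

// Ranges work the same way and yield a Substring.
let end = word.index(sixth, offsetBy: 4)
print(word[sixth..<end])            // café
```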

>> and ExtendedGraphemeCluster. 
>> 
> 
> What is expressed by Swift’s Character type is what the Unicode standard 
> often refers to as a “user-perceived character”. Note that “character” by it 
> self is not meaningful in Unicode (though it is often thrown about casually). 
> In Swift, Character is an appropriate name here for the concept of a 
> user-perceived character. If you want bytes, then you can use UInt8. If you 
> want Unicode scalar values, you can use UnicodeScalar. If you want code 
> units, you can use whatever that ends up looking (probably an associated type 
> named CodeUnit that is bound to UInt8 or UInt16 depending on the encoding).

A character (“char”) in C or C++ is considered nearly universally to be an 
8-bit value. A char in Java or C# is a 16-bit value (a UTF-16 code unit). All 
of these effectively behave as integer values (with char being the only 
unsigned integral primitive in Java).

IMO, that Swift Character doesn’t behave as an integer value but rather closer 
to a string holding one user-perceived character is not a deficiency in Swift, 
but can cause misunderstandings because of how Swift differs from other 
languages.
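
A brief sketch of that difference in current Swift (asciiValue assumes Swift 5 or later; the names are illustrative):

```swift
let letter: Character = "A"

// letter + 1 does not compile: Character is not an integer type.
// The numeric value has to be requested explicitly.
let ascii = letter.asciiValue                            // UInt8?, nil for non-ASCII
let scalar = String(letter).unicodeScalars.first!.value  // UInt32

// Going back from a number to a Character is just as explicit.
let next = Character(UnicodeScalar(ascii! + 1))
print(next)   // B
```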

-DW

___
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


Re: [swift-evolution] Strings in Swift 4

2017-02-25 Thread Michael Ilseman via swift-evolution

> On Feb 25, 2017, at 3:26 PM, David Waite via swift-evolution wrote:
> 
> Ted, 
> 
> It might have helped if instead of being called String and Character, they 
> were named Text

I would oppose taking a good name like “Text” and using it for Strings which 
are mostly for machine processing purposes, but can be human-presentable with 
explicit locale. A name like Text would be a better fit for Strings bundled with 
locale etc. for the purpose of presentation to humans, which must always be in 
the context of some locale (even if a “default” system locale). Refer to the 
sections in the String manifesto[1][2]. Such a Text type is definitely 
out-of-scope for current discussion.

[1] https://github.com/apple/swift/blob/master/docs/StringManifesto.md#the-default-behavior-of-string
[2] https://github.com/apple/swift/blob/master/docs/StringManifesto.md#future-directions

> and ExtendedGraphemeCluster. 
> 

What is expressed by Swift’s Character type is what the Unicode standard often 
refers to as a “user-perceived character”. Note that “character” by itself is 
not meaningful in Unicode (though it is often thrown about casually). In Swift, 
Character is an appropriate name here for the concept of a user-perceived 
character. If you want bytes, then you can use UInt8. If you want Unicode 
scalar values, you can use UnicodeScalar. If you want code units, you can use 
whatever that ends up looking like (probably an associated type named CodeUnit 
that is bound to UInt8 or UInt16 depending on the encoding).
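
The levels named above can be sketched with the views that String exposes in current Swift. This uses the utf8, utf16 and unicodeScalars views as they exist today, not the CodeUnit design that was still open at the time of this message.

```swift
let text = "caf\u{E9}\u{1F389}"    // "café🎉"

print(text.count)                  // 5  Characters (user-perceived)
print(text.unicodeScalars.count)   // 5  Unicode scalar values
print(text.utf16.count)            // 6  UTF-16 code units (🎉 needs a surrogate pair)
print(text.utf8.count)             // 9  UTF-8 code units, i.e. bytes

// Each view can be copied into an Int-indexed array of the matching element type.
let bytes: [UInt8] = Array(text.utf8)
let scalars: [Unicode.Scalar] = Array(text.unicodeScalars)
```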


> They don’t really have the same behavior or functionality as 
> string/characters in many other languages, especially older languages. This 
> is because in many languages, strings are not just text but also 
> random-accesss (possibly binary) data.
> 
> Thats not to say that there aren’t a ton of algorithms where you can use Text 
> like a String, treat ExtendedGraphemeCluster like a character, and get 
> unicode behavior without thinking about it.
> 
> But when it comes to random access and/or byte modification, you are better 
> off working with something closer to a traditional (byte) string interface.
> 
> Trying to wedge random access and byte modification into the Swift String 
> will simply complicate everything, slow down the algorithms which don’t need 
> it, eat up more memory, as well as slow down bridging between Swift and 
> Objective C code.
> 
> Hence me suggesting earlier working with Data, [UInt8], or [Character] within 
> the context of your manipulation code, then converting to a Swift String at 
> the end. Convert to the data format you need, then convert back.
> 
> Thats not to say that there aren’t features which would simplify/clarify 
> algorithms working in this manner.
> 
> -DW
> 
>> On Feb 24, 2017, at 4:27 PM, Ted F.A. van Gaalen via swift-evolution wrote:
>> 
>> ok, I understand, thank you
>> TedvG
>>> On 25 Feb 2017, at 00:25, David Sweeris wrote:
>>> 
>>> 
>>>> On Feb 24, 2017, at 13:41, Ted F.A. van Gaalen wrote:
>>>> 
>>>> Hi David & Dave
>>>> 
>>>> can you explain that in more detail?
>>>>>> Wouldn’t that turn simple character access into a mutating function?
>>>> 
>>>> assigning like   s[11…14] = str  is of course, yes.
>>>> only then - that is if the character array thus has been changed - 
>>>> it has to update the string in storage, yes. 
>>>> 
>>>> but  str = s[n..<…
>>>> so you’d have to maintain keep (private) a isChanged: Bool or bit.
>>>> a checksum over the character array .  
>>>> ?
>>> 
>>> It mutates because the String has to instantiate the Array to 
>>> which you're indexing into, if it doesn't already exist. It may not make 
>>> any externally visible changes, but it's still a change.
>>> 
>>> - Dave Sweeris
>> 


Re: [swift-evolution] Strings in Swift 4

2017-02-24 Thread David Waite via swift-evolution
Ted, 

It might have helped if instead of being called String and Character, they were 
named Text and ExtendedGraphemeCluster. 

They don’t really have the same behavior or functionality as string/characters 
in many other languages, especially older languages. This is because in many 
languages, strings are not just text but also random-access (possibly binary) 
data.

That’s not to say that there aren’t a ton of algorithms where you can use Text 
like a String, treat ExtendedGraphemeCluster like a character, and get unicode 
behavior without thinking about it.

But when it comes to random access and/or byte modification, you are better off 
working with something closer to a traditional (byte) string interface.

Trying to wedge random access and byte modification into the Swift String will 
simply complicate everything, slow down the algorithms which don’t need it, eat 
up more memory, as well as slow down bridging between Swift and Objective-C 
code.

Hence me suggesting earlier working with Data, [UInt8], or [Character] within 
the context of your manipulation code, then converting to a Swift String at the 
end. Convert to the data format you need, then convert back.
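
A minimal sketch of that convert, manipulate, convert-back pattern in current Swift, once with [Character] and once with [UInt8]. The sample text and names are only illustrative.

```swift
let original = "hello, world"

// 1. Convert to an Int-indexed representation.
var chars = Array(original)                  // [Character]

// 2. Manipulate with plain integer indices.
chars[0] = "H"
chars.replaceSubrange(7..<12, with: Array("Swift"))

// 3. Convert back to a String at the end.
let result = String(chars)
print(result)                                // Hello, Swift

// The same pattern at the byte level, for code that really wants bytes.
var bytes = Array(original.utf8)             // [UInt8]
bytes[0] = UInt8(ascii: "J")
let fromBytes = String(decoding: bytes, as: UTF8.self)
print(fromBytes)                             // Jello, world
```

The conversions cost one pass over the text each way, and they are paid only by the code that asks for random access.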

That’s not to say that there aren’t features which would simplify/clarify 
algorithms working in this manner.

-DW

> On Feb 24, 2017, at 4:27 PM, Ted F.A. van Gaalen via swift-evolution wrote:
> 
> ok, I understand, thank you
> TedvG
>> On 25 Feb 2017, at 00:25, David Sweeris wrote:
>> 
>> 
>>> On Feb 24, 2017, at 13:41, Ted F.A. van Gaalen wrote:
>>> 
>>> Hi David & Dave
>>> 
>>> can you explain that in more detail?
>>>>> Wouldn’t that turn simple character access into a mutating function?
>>> 
>>> assigning like   s[11…14] = str  is of course, yes.
>>> only then - that is if the character array thus has been changed - 
>>> it has to update the string in storage, yes. 
>>> 
>>> but  str = s[n..<…
>>> so you’d have to maintain keep (private) a isChanged: Bool or bit.
>>> a checksum over the character array .  
>>> ?
>> 
>> It mutates because the String has to instantiate the Array to 
>> which you're indexing into, if it doesn't already exist. It may not make any 
>> externally visible changes, but it's still a change.
>> 
>> - Dave Sweeris
> 


Re: [swift-evolution] Strings in Swift 4

2017-02-24 Thread Ted F.A. van Gaalen via swift-evolution
ok, I understand, thank you
TedvG
> On 25 Feb 2017, at 00:25, David Sweeris wrote:
> 
> 
>> On Feb 24, 2017, at 13:41, Ted F.A. van Gaalen wrote:
>> 
>> Hi David & Dave
>> 
>> can you explain that in more detail?
>>>> Wouldn’t that turn simple character access into a mutating function?
>> 
>> assigning like   s[11…14] = str  is of course, yes.
>> only then - that is if the character array thus has been changed - 
>> it has to update the string in storage, yes. 
>> 
>> but  str = s[n..<…
>> so you’d have to maintain keep (private) a isChanged: Bool or bit.
>> a checksum over the character array .  
>> ?
> 
> It mutates because the String has to instantiate the Array to 
> which you're indexing into, if it doesn't already exist. It may not make any 
> externally visible changes, but it's still a change.
> 
> - Dave Sweeris



Re: [swift-evolution] Strings in Swift 4

2017-02-24 Thread David Sweeris via swift-evolution

> On Feb 24, 2017, at 13:41, Ted F.A. van Gaalen wrote:
> 
> Hi David & Dave
> 
> can you explain that in more detail?
>>> Wouldn’t that turn simple character access into a mutating function?
> 
> assigning like   s[11…14] = str  is of course, yes.
> only then - that is if the character array thus has been changed - 
> it has to update the string in storage, yes. 
> 
> but  str = s[n..<…
> so you’d have to maintain keep (private) a isChanged: Bool or bit.
> a checksum over the character array .  
> ?

It mutates because the String has to instantiate the Array into which 
you're indexing, if it doesn't already exist. It may not make any 
externally visible changes, but it's still a change.

- Dave Sweeris 
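
The point can be sketched with a small value type that builds its Character array lazily. LazyIndexedString is a hypothetical name for illustration, not a proposed API and not how String is implemented.

```swift
struct LazyIndexedString {
    let storage: String
    private var cache: [Character]? = nil   // built on first indexed access

    init(_ storage: String) {
        self.storage = storage
    }

    // Has to be `mutating`: even a plain read may need to fill the cache.
    mutating func character(at n: Int) -> Character {
        if cache == nil {
            cache = Array(storage)
        }
        return cache![n]
    }
}

var mutableCopy = LazyIndexedString("hello")
let second = mutableCopy.character(at: 1)
print(second)   // e

let constant = LazyIndexedString("hello")
// constant.character(at: 1)   // error: cannot use mutating member on immutable value
_ = constant
```

With this design, reading a character from a `let` constant is rejected by the compiler, which is the usability cost of a cache filled on read.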


Re: [swift-evolution] Strings in Swift 4

2017-02-24 Thread Ted F.A. van Gaalen via swift-evolution
Hi Dave
Thanks for taking the time to go into this and explain.
This optimising goes much further than I thought.

> That's fine. Please don't be offended that I don't wish to argue it further. 
> It's been an interesting exercise while I'm on vacation and I hoped it would 
> lay out some general principles that would be useful to others in future even 
> if you are not convinced, but when I get back to work next week I'll have to 
> focus on other things.

Yes, I understand; it would become too iterative and time-consuming, I guess. 
(How can you become work-detached if you keep doing things like this during 
vacation?)  

Enjoy your vacation!  
TedvG




> On 24 Feb 2017, at 22:40, Dave Abrahams wrote:
> 
> 
> 
> Sent from my moss-covered three-handled family gradunza
> 
> On Feb 23, 2017, at 2:04 PM, Ted F.A. van Gaalen wrote:
> 
>> 
>>> On 23 Feb 2017, at 02:24, Dave Abrahams wrote:
>>> 
>>> Equally a non-starter. All known threadsafe schemes that require caches to 
>>> be updated upon non-mutating operations have horrible performance issues, 
>>> and further this would penalize all string code by reserving space for the 
>>> cache and filling it even for the vast majority of operations that don't 
>>> require random access.
>> Well, maybe “caching” is not the right description for what I've suggested.
>> It is more like:
>>   let all strings be stored as they are now, but as soon as you want to work 
>> with 
>> random accessing parts of a string just “lift the string out of normal 
>> optimised string storage” 
>>  and then add (temporarily)  a Character array so one can work with this 
>> array directly ” 
> 
> That's a cache.
> 
>> which implies that all other strings remain as they are.  ergo: efficiency 
>> is only reduced for the “elevated” strings,
> 
> You have to add that temporary array somewhere.  The performance of every 
> string is penalized for that storage, and also for the cost of throwing it 
> out upon mutation. Every branch counts. 
> 
>> Using e.g. str.freeSpace(), if necessary, would then place the String back 
>> in its normal storage domain, thereby disposing the Character array
>> associated with it. 
> 
> Avoiding hidden dynamic storage overhead that needs to be freed is an 
> explicit goal of the design (see the section on String and Substring).
> 
>>> Trust me, we've gotten lots of such suggestions and thought through the 
>>> implications of each one very carefully.  
>> That’s good, because it means, that a lot of people are interested in this 
>> subject and wish to help.  
>> Of course you’ll get many of suggestions that might not be very useful, 
>> perhaps like this one... but sometimes suddenly someone 
>> comes along with things that might never have occurred to you. 
>> That is the beautiful nature of ideas…
> 
> But at some point, I hope you'll understand, I also have to say that I think 
> all the simple schemes have been adequately explored and the complex ones all 
> seem to have this basic property of relying on caches, which has unacceptable 
> performance, complexity, and, yes, usability costs. Analyzing and refuting 
> each one in detail begins to be a waste of time after that.  I'm not really 
> willing to go further down this road unless someone has an implementation and 
> experimental evidence that demonstrates it as non-problematic. 
> 
> 
>>> I'm afraid you will have to accept being disappointed about this. 
>> Well, like most developers, I am a stubborn kind of guy.. 
>> Luckily Swift is very flexible like Lego, so I rolled my own convenience 
>> struct.
>> If I need direct access on a string I simply copy the string to it.
>> it permits things like this:  (and growing) 
>> 
>> let strabc = "abcdefghjiklmnopqrstuvwxyz"
>> let strABC = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
>> var abc = TGString(strabc)
>> var ABC = TGString(strABC)
>> 
>> func test()
>> {
>> // as in Basic: left$, mid$, right$
>> print(abc.left(5))
>> print(abc.mid(5,10))
>> print(ABC.mid(5))
>> print(ABC.right(5))
>> 
>> // ranges and concatenation:
>> print(abc[12..<23])
>> print(abc.left(5) + ABC.mid(6,6) + abc[10...25])
>> 
>> // eat anything:
>> let d:Double = -3.14159
>> print(TGString(d))
>>  
>> let n:Int = 1234
>> print(TGString(n))
>> 
>> print(TGString(1234.56789))
>> 
>> let str = abc[15..<17].asString  // Copy to to normal Swift String
>> print(str)
>> 
>> let s = "\(abc[12..<20])" // interpolate to normal Swift String.
>> print(s)
>> 
>> abc[3..<5] = TGString("34") // if lengths don't match:
>> abc[8...9] = ABC[24...25]   //  length of dest. string is altered.
>> abc[12] = TGString("")  //  if src l > 1 will insert remainder after 
>> dest.12 here
>> abc[14] = TGString("")  //  empty removes character at pos.
>> print(abc)
>>

Re: [swift-evolution] Strings in Swift 4

2017-02-24 Thread Ted F.A. van Gaalen via swift-evolution
Hi David & Dave

can you explain that in more detail?
>> Wouldn’t that turn simple character access into a mutating function?

assigning like   s[11…14] = str  is of course, yes.
only then - that is if the character array thus has been changed - 
it has to update the string in storage, yes. 

but  str = s[n..<…
so you’d have to maintain keep (private) a isChanged: Bool or bit.
a checksum over the character array .  
?

> On 24 Feb 2017, at 22:40, Dave Abrahams wrote:
> 
> 
> 
> Sent from my moss-covered three-handled family gradunza
> 
> On Feb 23, 2017, at 2:04 PM, Ted F.A. van Gaalen wrote:
> 
>> 
>>> On 23 Feb 2017, at 02:24, Dave Abrahams wrote:
>>> 
>>> Equally a non-starter. All known threadsafe schemes that require caches to 
>>> be updated upon non-mutating operations have horrible performance issues, 
>>> and further this would penalize all string code by reserving space for the 
>>> cache and filling it even for the vast majority of operations that don't 
>>> require random access. 
>> Well, maybe “caching” is not the right description for what I've suggested.
>> It is more like:
>>   let all strings be stored as they are now, but as soon as you want to work 
>> with 
>> random accessing parts of a string just “lift the string out of normal 
>> optimised string storage” 
>>  and then add (temporarily)  a Character array so one can work with this 
>> array directly ” 
> 
> That's a cache.
> 
>> which implies that all other strings remain as they are.  ergo: efficiency 
>> is only reduced for the “elevated” strings,
> 
> You have to add that temporary array somewhere.  The performance of every 
> string is penalized for that storage, and also for the cost of throwing it 
> out upon mutation. Every branch counts. 
> 
>> Using e.g. str.freeSpace(), if necessary, would then place the String back 
>> in its normal storage domain, thereby disposing the Character array
>> associated with it. 
> 
> Avoiding hidden dynamic storage overhead that needs to be freed is an 
> explicit goal of the design (see the section on String and Substring).
> 
>>> Trust me, we've gotten lots of such suggestions and thought through the 
>>> implications of each one very carefully.  
>> That’s good, because it means, that a lot of people are interested in this 
>> subject and wish to help.  
>> Of course you’ll get many of suggestions that might not be very useful, 
>> perhaps like this one... but sometimes suddenly someone 
>> comes along with things that might never have occurred to you. 
>> That is the beautiful nature of ideas…
> 
> But at some point, I hope you'll understand, I also have to say that I think 
> all the simple schemes have been adequately explored and the complex ones all 
> seem to have this basic property of relying on caches, which has unacceptable 
> performance, complexity, and, yes, usability costs. Analyzing and refuting 
> each one in detail begins to be a waste of time after that.  I'm not really 
> willing to go further down this road unless someone has an implementation and 
> experimental evidence that demonstrates it as non-problematic. 
> 
> 
>>> I'm afraid you will have to accept being disappointed about this. 
>> Well, like most developers, I am a stubborn kind of guy.. 
>> Luckily Swift is very flexible like Lego, so I rolled my own convenience 
>> struct.
>> If I need direct access on a string I simply copy the string to it.
>> it permits things like this:  (and growing) 
>> 
>> let strabc = "abcdefghjiklmnopqrstuvwxyz"
>> let strABC = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
>> var abc = TGString(strabc)
>> var ABC = TGString(strABC)
>> 
>> func test()
>> {
>> // as in Basic: left$, mid$, right$
>> print(abc.left(5))
>> print(abc.mid(5,10))
>> print(ABC.mid(5))
>> print(ABC.right(5))
>> 
>> // ranges and concatenation:
>> print(abc[12..<23])
>> print(abc.left(5) + ABC.mid(6,6) + abc[10...25])
>> 
>> // eat anything:
>> let d:Double = -3.14159
>> print(TGString(d))
>>  
>> let n:Int = 1234
>> print(TGString(n))
>> 
>> print(TGString(1234.56789))
>> 
>> let str = abc[15..<17].asString  // Copy to to normal Swift String
>> print(str)
>> 
>> let s = "\(abc[12..<20])" // interpolate to normal Swift String.
>> print(s)
>> 
>> abc[3..<5] = TGString("34") // if lengths don't match:
>> abc[8...9] = ABC[24...25]   //  length of dest. string is altered.
>> abc[12] = TGString("")  //  if src l > 1 will insert remainder after 
>> dest.12 here
>> abc[14] = TGString("")  //  empty removes character at pos.
>> print(abc)
>> abc.insert(at: 3, string: ABC[0..<3])
>> print(abc)
>> }
>>  
>> test()
>> .
>> outputs: 
>> abcde
>> fghjiklmno
>> FGHIJKLMNOPQRSTUVWXYZ
>> VWXYZ
>> mnopqrstuvw
>> 

Re: [swift-evolution] Strings in Swift 4

2017-02-24 Thread Dave Abrahams via swift-evolution


Sent from my moss-covered three-handled family gradunza

> On Feb 23, 2017, at 2:04 PM, Ted F.A. van Gaalen wrote:
> 
> 
>> On 23 Feb 2017, at 02:24, Dave Abrahams wrote:
>> 
>> Equally a non-starter. All known threadsafe schemes that require caches to 
>> be updated upon non-mutating operations have horrible performance issues, 
>> and further this would penalize all string code by reserving space for the 
>> cache and filling it even for the vast majority of operations that don't 
>> require random access.
> Well, maybe “caching” is not the right description for what I've suggested.
> It is more like:
>   let all strings be stored as they are now, but as soon as you want to work 
> with 
> random accessing parts of a string just “lift the string out of normal 
> optimised string storage” 
>  and then add (temporarily)  a Character array so one can work with this 
> array directly ” 

That's a cache.

> which implies that all other strings remain as they are.  ergo: efficiency 
> is only reduced for the “elevated” strings,

You have to add that temporary array somewhere.  The performance of every 
string is penalized for that storage, and also for the cost of throwing it out 
upon mutation. Every branch counts. 
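
The storage half of that argument can be sketched with two hypothetical wrapper types. They only illustrate the per-value cost of an extra stored field and say nothing about String's real layout.

```swift
struct PlainString {
    var storage: String
}

struct StringWithCache {
    var storage: String
    var cache: [Character]?    // hypothetical random-access cache
}

// Every StringWithCache value carries room for the cache reference,
// whether or not random access is ever used on it.
print(MemoryLayout<PlainString>.size)
print(MemoryLayout<StringWithCache>.size)
```

The exact sizes depend on the platform, so none are stated here; the second is larger than the first in every case.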

> Using e.g. str.freeSpace(), if necessary, would then place the String back 
> in its normal storage domain, thereby disposing the Character array
> associated with it. 

Avoiding hidden dynamic storage overhead that needs to be freed is an explicit 
goal of the design (see the section on String and Substring).
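
As an illustration of that goal, here is how the String and Substring split looks in Swift as it later shipped (a sketch; firstIndex(of:) assumes Swift 4.2 or later, and the sample text is made up):

```swift
let line = "name=Swift"
let separator = line.firstIndex(of: "=")!

// A Substring is a view onto `line`'s storage; no characters are copied here.
let key: Substring = line[..<separator]

// Copying into independent storage is an explicit, visible step.
let ownedKey = String(key)
print(ownedKey)   // name
```

Nothing is allocated behind the caller's back: the slice borrows, and the copy happens only where `String(...)` is written.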

>> Trust me, we've gotten lots of such suggestions and thought through the 
>> implications of each one very carefully.  
> That’s good, because it means, that a lot of people are interested in this 
> subject and wish to help.  
> Of course you’ll get many of suggestions that might not be very useful, 
> perhaps like this one... but sometimes suddenly someone 
> comes along with things that might never have occurred to you. 
> That is the beautiful nature of ideas…

But at some point, I hope you'll understand, I also have to say that I think 
all the simple schemes have been adequately explored and the complex ones all 
seem to have this basic property of relying on caches, which has unacceptable 
performance, complexity, and, yes, usability costs. Analyzing and refuting each 
one in detail begins to be a waste of time after that.  I'm not really willing 
to go further down this road unless someone has an implementation and 
experimental evidence that demonstrates it as non-problematic. 


>> I'm afraid you will have to accept being disappointed about this. 
> Well, like most developers, I am a stubborn kind of guy.. 
> Luckily Swift is very flexible like Lego, so I rolled my own convenience 
> struct.
> If I need direct access on a string I simply copy the string to it.
> it permits things like this:  (and growing) 
> 
> let strabc = "abcdefghjiklmnopqrstuvwxyz"
> let strABC = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
> var abc = TGString(strabc)
> var ABC = TGString(strABC)
> 
> func test()
> {
> // as in Basic: left$, mid$, right$
> print(abc.left(5))
> print(abc.mid(5,10))
> print(ABC.mid(5))
> print(ABC.right(5))
> 
> // ranges and concatenation:
> print(abc[12..<23])
> print(abc.left(5) + ABC.mid(6,6) + abc[10...25])
> 
> // eat anything:
> let d:Double = -3.14159
> print(TGString(d))
>  
> let n:Int = 1234
> print(TGString(n))
> 
> print(TGString(1234.56789))
> 
> let str = abc[15..<17].asString  // Copy to to normal Swift String
> print(str)
> 
> let s = "\(abc[12..<20])" // interpolate to normal Swift String.
> print(s)
> 
> abc[3..<5] = TGString("34") // if lengths don't match:
> abc[8...9] = ABC[24...25]   //  length of dest. string is altered.
> abc[12] = TGString("")  //  if src l > 1 will insert remainder after 
> dest.12 here
> abc[14] = TGString("")  //  empty removes character at pos.
> print(abc)
> abc.insert(at: 3, string: ABC[0..<3])
> print(abc)
> }
>  
> test()
> .
> outputs: 
> abcde
> fghjiklmno
> FGHIJKLMNOPQRSTUVWXYZ
> VWXYZ
> mnopqrstuvw
> abcdeGHIJKLklmnopqrstuvwxyz
> -3.14159
> 1234
> 1234.56789
> abcdefghjiklmnopqrstuvwxyz
> mnopqrst
> abc34fghYZklnopqrstuvwxyz
> abcABC34fghYZklnopqrstuvwxyz
> 
> kinda hoped that this could be builtin in Swift strings 
> Anyway, I’ve made myself what I wanted, which happily co-exists
> alongside normal Swift strings.  Performance and storage
> aspects of my struct TGString are not very important, because
> I am not using this on thousands of strings.
> Simply want to use a string as a plain array, that’s all, 
> which is implemented in almost every PL on this planet. 
> 
> 
>> More generally, there's a reason that the collection model has bidirectional 
>> and random access distinctions: important data structures are inherently not 
>> random 

Re: [swift-evolution] Strings in Swift 4

2017-02-24 Thread Dave Abrahams via swift-evolution
Exactly. 

Sent from my moss-covered three-handled family gradunza

> On Feb 24, 2017, at 9:49 AM, David Sweeris wrote:
> 
> 
>>> On Feb 23, 2017, at 4:04 PM, Ted F.A. van Gaalen via swift-evolution wrote:
>>> 
>>> 
>> 
>>> On 23 Feb 2017, at 02:24, Dave Abrahams wrote:
>>> 
>>> Equally a non-starter. All known threadsafe schemes that require caches to 
>>> be updated upon non-mutating operations have horrible performance issues, 
>>> and further this would penalize all string code by reserving space for the 
>>> cache and filling it even for the vast majority of operations that don't 
>>> require random access.
>> Well, maybe “caching” is not the right description for what I've suggested.
>> It is more like:
>>   let all strings be stored as they are now, but as soon as you want to work 
>> with 
>> random accessing parts of a string just “lift the string out of normal 
>> optimised string storage” 
>>  and then add (temporarily)  a Character array so one can work with this 
>> array directly ” 
>> which implies that all other strings remain as they are.  ergo: efficiency 
>> is only reduced for the “elevated” strings,
>> Using e.g. str.freeSpace(), if necessary, would then place the String back 
>> in its normal storage domain, thereby disposing the Character array
>> associated with it. 
> 
> Wouldn’t that turn simple character access into a mutating function?
> 
> - Dave Sweeris


Re: [swift-evolution] Strings in Swift 4

2017-02-24 Thread David Sweeris via swift-evolution

> On Feb 23, 2017, at 4:04 PM, Ted F.A. van Gaalen via swift-evolution wrote:
> 
> 
>> On 23 Feb 2017, at 02:24, Dave Abrahams wrote:
>> 
>> Equally a non-starter. All known threadsafe schemes that require caches to 
>> be updated upon non-mutating operations have horrible performance issues, 
>> and further this would penalize all string code by reserving space for the 
>> cache and filling it even for the vast majority of operations that don't 
>> require random access.
> Well, maybe “caching” is not the right description for what I've suggested.
> It is more like:
>   let all strings be stored as they are now, but as soon as you want to work 
> with 
> random accessing parts of a string just “lift the string out of normal 
> optimised string storage” 
>  and then add (temporarily)  a Character array so one can work with this 
> array directly ” 
> which implies that all other strings remain as they are.  ergo: efficiency 
> is only reduced for the “elevated” strings,
> Using e.g. str.freeSpace(), if necessary, would then place the String back 
> in its normal storage domain, thereby disposing the Character array
> associated with it. 

Wouldn’t that turn simple character access into a mutating function?

- Dave Sweeris
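
To make the question above concrete, here is a minimal sketch of a value type that fills a Character array on first indexed access. It assumes Swift 4 or later, where String is itself a Collection of Character. CachedString and character(at:) are hypothetical names invented for this illustration; they are not part of the standard library or of anything proposed in the thread.

```swift
// Hypothetical type, for illustration only.
struct CachedString {
    let storage: String
    private var cache: [Character]? = nil   // filled on first indexed access

    init(_ s: String) { storage = s }

    // Filling the cache writes to `self`, so the accessor has to be `mutating`.
    mutating func character(at i: Int) -> Character {
        if cache == nil { cache = Array(storage) }
        return cache![i]
    }
}

var v = CachedString("hello")
let second = v.character(at: 1)
print(second)

let c = CachedString("hello")
// c.character(at: 1)   // does not compile: `c` is a `let` constant
```

In this sketch a plain read is only possible on a `var`, which is the cost the question points at. Hiding the cache behind a reference type would avoid `mutating`, but then the update needs synchronisation to stay thread-safe.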


Re: [swift-evolution] Strings in Swift 4

2017-02-24 Thread Ted F.A. van Gaalen via swift-evolution

> On 23 Feb 2017, at 02:24, Dave Abrahams  wrote:
> 
> Equally a non-starter. All known threadsafe schemes that require caches to be 
> updated upon non-mutating operations have horrible performance issues, and 
> further this would penalize all string code by reserving space for the cache 
> and filling it even for the vast majority of operations that don't require 
> random access.
Well, maybe “caching” is not the right description for what I've suggested.
It is more like:
  let all strings be stored as they are now, but as soon as you want to work 
with 
random accessing parts of a string just “lift the string out of normal 
optimised string storage” 
 and then add (temporarily)  a Character array so one can work with this array 
directly ” 
which implies that all other strings remain as they are.  ergo: efficiency 
is only reduced for the “elevated” strings,
Using e.g. str.freeSpace(), if necessary, would then place the String back 
in its normal storage domain, thereby disposing the Character array
associated with it. 
   

> Trust me, we've gotten lots of such suggestions and thought through the 
> implications of each one very carefully.  
That’s good, because it means that a lot of people are interested in this 
subject and wish to help.  
Of course you’ll get many suggestions that might not be very useful, 
perhaps like this one... but sometimes someone suddenly 
comes along with things that might never have occurred to you. 
That is the beautiful nature of ideas…

> I'm afraid you will have to accept being disappointed about this. 
Well, like most developers, I am a stubborn kind of guy.. 
Luckily Swift is very flexible, like Lego, so I rolled my own convenience struct.
If I need direct access to a string, I simply copy the string into it.
It permits things like this (and the list is growing): 

let strabc = "abcdefghjiklmnopqrstuvwxyz"
let strABC = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
var abc = TGString(strabc)
var ABC = TGString(strABC)

func test()
{
    // as in Basic: left$, mid$, right$
    print(abc.left(5))
    print(abc.mid(5,10))
    print(ABC.mid(5))
    print(ABC.right(5))

    // ranges and concatenation:
    print(abc[12..<23])
    print(abc.left(5) + ABC.mid(6,6) + abc[10...25])

    // eat anything:
    let d:Double = -3.14159
    print(TGString(d))

    let n:Int = 1234
    print(TGString(n))

    print(TGString(1234.56789))

    let str = abc[15..<17].asString  // copy to a normal Swift String
    print(str)

    let s = "\(abc[12..<20])" // interpolate into a normal Swift String
    print(s)

    abc[3..<5] = TGString("34") // if lengths don't match,
    abc[8...9] = ABC[24...25]   // the length of the dest. string is altered
    abc[12] = TGString("")  // if source length > 1, the remainder is inserted after dest. 12
    abc[14] = TGString("")  // empty removes the character at that position
    print(abc)
    abc.insert(at: 3, string: ABC[0..<3])
    print(abc)
}
 
test()
.
outputs: 
abcde
fghjiklmno
FGHIJKLMNOPQRSTUVWXYZ
VWXYZ
mnopqrstuvw
abcdeGHIJKLklmnopqrstuvwxyz
-3.14159
1234
1234.56789
abcdefghjiklmnopqrstuvwxyz
mnopqrst
abc34fghYZklnopqrstuvwxyz
abcABC34fghYZklnopqrstuvwxyz

I kind of hoped that this could be built into Swift strings. 
Anyway, I’ve made myself what I wanted, which happily co-exists
alongside normal Swift strings.  Performance and storage
aspects of my struct TGString are not very important, because
I am not using this on thousands of strings.
I simply want to use a string as a plain array, that’s all, 
which is implemented in almost every programming language on this planet. 
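
The source of TGString was not posted in this thread, so the following is only a rough sketch of how such a convenience struct could be built. It assumes Swift 4 or later (in Swift 3, use `s.characters` in place of `s`). The name CharArrayString and every method body are assumptions made for illustration; only the zero-based left/mid/right behaviour is modelled on the output shown above.

```swift
// Hypothetical sketch; not the actual TGString implementation.
struct CharArrayString: CustomStringConvertible {
    var chars: [Character]

    init(_ s: String) { chars = Array(s) }          // one O(n) copy up front
    init(_ chars: [Character]) { self.chars = chars }

    var description: String { return String(chars) }

    // BASIC-style helpers; offsets are zero-based and clamped to the bounds.
    func left(_ n: Int) -> CharArrayString {
        return CharArrayString(Array(chars.prefix(max(0, n))))
    }
    func right(_ n: Int) -> CharArrayString {
        return CharArrayString(Array(chars.suffix(max(0, n))))
    }
    func mid(_ start: Int, _ length: Int) -> CharArrayString {
        let lo = min(max(0, start), chars.count)
        let hi = min(lo + max(0, length), chars.count)
        return CharArrayString(Array(chars[lo..<hi]))
    }

    // Constant-time access to any element of the array.
    subscript(i: Int) -> Character {
        get { return chars[i] }
        set { chars[i] = newValue }
    }
    subscript(r: Range<Int>) -> CharArrayString {
        return CharArrayString(Array(chars[r]))
    }
}

var s = CharArrayString("abcdefghij")
print(s.left(3))      // abc
print(s.mid(5, 3))    // fgh
print(s.right(2))     // ij
print(s[2..<5])       // cde
s[0] = "X"
print(s)              // Xbcdefghij
```

The trade-off is the one discussed in the thread: the copy into `[Character]` costs time and memory once per string, which is acceptable for a handful of strings but not as the default representation.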


> More generally, there's a reason that the collection model has bidirectional 
> and random access distinctions: important data structures are inherently not 
> random access.
I don’t understand the above line: definition of “important data structures” <> 
“inherently” 
> Heroic attempts to present the illusion that they are randomly-accessible are 
> not going to fly.
  ?? Accessing discrete elements directly in an array is not an illusion to me. 
(e.g. I took the 4th and 7th eggs from the container) 
> These abstractions always break down,  leaking the true non-random-access 
> nature in often unpredictable ways, penalizing lots of code for the sake of a 
> very few use-cases, and introducing complexity that is hard for the optimizer 
> to digest and makes it painful (sometimes impossible) to grow and evolve the 
> library.  
> 
Is an Array an abstraction? of what? I don’t get this either. most components 
in the real world can be accessed randomly. 

> This should be seen as a general design philosophy: Swift presents 
> abstractions that harmonize with, rather than hide, the true nature of things.
The true nature of things is a very vague and subjective criterion; how can you 
harmonise with that, let alone with abstractions? 
e.g. for me: “the true nature of things” for an array is that it has direct 
accessible discrete elements…

Sorry, with respect, we have a difference of 

Re: [swift-evolution] Strings in Swift 4

2017-02-22 Thread Dave Abrahams via swift-evolution
Equally a non-starter. All known threadsafe schemes that require caches to be 
updated upon non-mutating operations have horrible performance issues, and 
further this would penalize all string code by reserving space for the cache 
and filling it even for the vast majority of operations that don't require 
random access. Trust me, we've gotten lots of such suggestions and thought 
through the implications of each one very carefully.  I'm afraid you will have 
to accept being disappointed about this. 

More generally, there's a reason that the collection model has bidirectional 
and random access distinctions: important data structures are inherently not 
random access. Heroic attempts to present the illusion that they are 
randomly-accessible are not going to fly. These abstractions always break down, 
 leaking the true non-random-access nature in often unpredictable ways, 
penalizing lots of code for the sake of a very few use-cases, and introducing 
complexity that is hard for the optimizer to digest and makes it painful 
(sometimes impossible) to grow and evolve the library.  

This should be seen as a general design philosophy: Swift presents abstractions 
that harmonize with, rather than hide, the true nature of things.
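
As an illustration of the bidirectional versus random-access distinction described above, here is a small sketch assuming Swift 4.1 or later. The function name `middle` is made up for this example. It accepts only collections that promise constant-time index arithmetic, so an Array qualifies while a String does not.

```swift
// Accepts only collections that promise O(1) index arithmetic.
func middle<C: RandomAccessCollection>(_ c: C) -> C.Element? {
    guard !c.isEmpty else { return nil }
    return c[c.index(c.startIndex, offsetBy: c.count / 2)]
}

let text = "abcdefg"
let chars = Array(text)            // eager copy into [Character]

if let m = middle(chars) {         // fine: Array is a RandomAccessCollection
    print(m)
}
// middle(text)                    // does not compile: String is only bidirectional

// With String the walk is spelled out, and its cost grows with the offset.
let i = text.index(text.startIndex, offsetBy: text.count / 2)
print(text[i])
```

Both calls print the same character. The difference is that the type system makes the linear walk over the String visible at the call site.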

From me, the answer remains "no."

Sent from my moss-covered three-handled family gradunza

> On Feb 22, 2017, at 1:40 PM, Ted F.A. van Gaalen  
> wrote:
> 
> What about having a (lazy)  Array property inside String?


Re: [swift-evolution] Strings in Swift 4

2017-02-22 Thread Ted F.A. van Gaalen via swift-evolution
Thank you Michael,
 I did that already in this extension: (as written before) 
 
extension String
{
    var count: Int
    {
        get
        {
            return self.characters.count
        }
    }

    // stored properties are not possible in extensions:
    // var ar = Array(self.characters)

    subscript (n: Int) -> String
    {
        return String(Array(self.characters)[n])
    }

    subscript (r: Range<Int>) -> String
    {
        return String(Array(self.characters)[r])
    }

    subscript (r: ClosedRange<Int>) -> String
    {
        return String(Array(self.characters)[r])
    }
}

but this is not so efficient, because for each subscript invocation
the Character array must be built again (if it is not cached within String). 
I assume it must be rebuilt each time because one cannot create new stored
properties in extensions (why not?), such as a Character array as in the above 
comment.
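
The cost difference described here can be sketched as follows, assuming Swift 4 or later (in Swift 3, write `s.characters` in place of `s`). The helper name `characterRebuilding` is hypothetical and only mirrors what the subscripts above do on each call.

```swift
let text = String(repeating: "ab", count: 1_000)

// Rebuilding per access: every call pays for a full copy of the characters.
func characterRebuilding(_ s: String, _ n: Int) -> Character {
    return Array(s)[n]            // O(s.count) work to fetch one element
}

// Projecting once: pay for the copy a single time, then index freely.
let chars = Array(text)
var countOfA = 0
for i in 0..<chars.count where chars[i] == "a" {
    countOfA += 1
}
print(countOfA)
```

Calling `characterRebuilding` inside the loop instead would make the loop quadratic in the length of the string, which is why holding on to the projected array is the cheaper pattern.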
  

> On 22 Feb 2017, at 19:43, Michael Ilseman  wrote:
> 
> Given that the behavior you desire is literally a few key strokes away (see 
> below), it would be unfortunate to pessimize the internal representation of 
> Strings for every application. This would destroy the applicability of the 
> Swift standard library to entire areas of computing such as application 
> development for mobile devices (Swift's current largest niche). The idea of 
> abstraction is that you can provide a high-level view of things stored at a 
> lower-level in accordance with sensible higher-level semantics and 
> expectations. If you want random access, then you can eagerly project the 
> characters (see below). This is consistent with the standard library’s 
> preference for lazy sequences when providing a eager one would result in a 
> large up-front cost that might be avoidable otherwise.
> 
mostly true.
> Here’s playground code that gives you what you’re requesting, by doing an 
> eager projection (rather than a lazy one, which is the default):
> 
Your extension is more efficient than my subscript extension above, 
because the Character array is drawn once from the String, instead of 
the str.characters property being scanned again on every access.
@Dave :
 is that the case, or is the character view cached, so that it
doesn’t matter much if the characterView is retrieved frequently?  
> extension String {
> var characterArray: [Character] {
> return characters.map { $0 }
> }
> }
> let str = "abcdefg\(UnicodeScalar(0x302)!)"
> let charArray = str.characterArray
> charArray[4] // results in "e"
> charArray[6] // results in “ĝ”

I would normally subclass String, but in Swift I can’t do this,
because String is a struct and inheritance of structs is not
possible in Swift. 

@Dave:
Thanks for the explanation and the link (it has been a long time
since I read about pointers; normally I try to avoid these things like the 
plague..)  

Factor 8?  that's a big storage difference.. Currently still diving into Swift 
stdlib, 
maybe I’ll get some bright ideas there , but don’t count on it :o)  

However, for the String struct, I have another suggestion/solution/question if 
I may: 

If  String’s CharacterView is not cached (or is it?) to prevent repetitive 
regeneration,
but even then: 

What about having a (lazy) Array<Character> property inside String,
which: 
  is normally nil and only created when parts of a String are
  accessed/changed, e.g. with subscripting;
  becomes nil again when the String has changed;
  can also be disposed of (set to nil or emptied) upon request: 
    str.disposeCharacterArray() 
  or maybe:
    str.compactString()  
    str.freeSpace()  

Although then available as a property like this:
  str.characterArray , 
normally one would not access this character array directly,
but rather implicitly with subscripting on the String itself, like str[n…m]. 
In that case, if it does not already exist, this character array inside String 
will be created and remains alive until aString disappears , changes, or 
the string’s character array is explicitly disposed.
(e.g. useful when many strings are involved, to free storage) 

in that way:
No unnecessary storage is allocated for Character arrays, 
but only when the need arises.  
There are no longer performance based restrictions for the programmer
to subscript strings directly. Hooray! 

Not only to *get* but also to *set*  substrings. 
(The latter would of course require String-inside 
processing of the Character array. updating the
in the String)

Furthermore, one could base nearly all
string handling like substring, replace, search, etc.
directly on this character array without the 
need to walk through the contiguous String storage
itself each time at runtime. 

  
Flexible! So one can do all this and more: 
 str[5] = "x"
 let s = str[5] 
 str[3...5] = "HAL"   
 str[range] = str[range].reversed()  
 var s = str[10..<28]
if str[p1..

Re: [swift-evolution] Strings in Swift 4

2017-02-22 Thread Michael Ilseman via swift-evolution
Given that the behavior you desire is literally a few key strokes away (see 
below), it would be unfortunate to pessimize the internal representation of 
Strings for every application. This would destroy the applicability of the 
Swift standard library to entire areas of computing such as application 
development for mobile devices (Swift's current largest niche). The idea of 
abstraction is that you can provide a high-level view of things stored at a 
lower-level in accordance with sensible higher-level semantics and 
expectations. If you want random access, then you can eagerly project the 
characters (see below). This is consistent with the standard library’s 
preference for lazy sequences when providing a eager one would result in a 
large up-front cost that might be avoidable otherwise.

Here’s playground code that gives you what you’re requesting, by doing an eager 
projection (rather than a lazy one, which is the default):

extension String {
var characterArray: [Character] {
return characters.map { $0 }
}
}
let str = "abcdefg\(UnicodeScalar(0x302)!)"
let charArray = str.characterArray
charArray[4] // results in "e"
charArray[6] // results in "ĝ"

Note that you get random access AND safety by operating at the Character level. 
If you operate at the unicode scalar value level instead, you might be 
splitting canonical combining sequences accidentally.
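
A short sketch of that difference, assuming Swift 4 or later (in Swift 3, write `str.characters` in place of `str`). It uses the same combining circumflex as the playground code above and compares the Character view with the unicode scalar view.

```swift
let str = "abcdefg\u{302}"             // "g" followed by COMBINING CIRCUMFLEX ACCENT

let characters = Array(str)            // extended grapheme clusters
let scalars = Array(str.unicodeScalars)

print(characters.count)                // 7
print(scalars.count)                   // 8
print(characters[6])                   // ĝ  (one Character made of two scalars)
print(scalars[6], scalars[7].value)    // g 770
```

Indexing the scalar array at 6 yields a bare "g" and leaves the accent behind at index 7, which is the accidental split the note warns about.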


> On Feb 22, 2017, at 7:56 AM, Ted F.A. van Gaalen via swift-evolution 
>  wrote:
> 
> Hi Ben,
> thank you, yes, I know all that by now. 
> 
> Have seen that one goes to great lengths to optimise, not only for storage 
> but also for speed. But how far does this need to go?  In any case, 
> optimisation should not be used
> as an argument for restricting a PLs functionality that is to refrain from PL 
> elements which are common and useful.?
> 
> I wouldn’t worry so much over storage (unless one wants to load a complete 
> book into memory… in iOS, the average app is about 15-50 MB, String data is 
> mostly a fraction of that. In macOS or similar I’d think it is even less 
> significant…
> 
> I wonder how much performance and memory consumption would be different from 
> the current contiguous memory implementation?  if a String is just is a plain 
> row of (references to) Character (extended grapheme cluster) objects, 
> Array<Character>, which would simplify the basic logic and (sub)string 
> handling significantly, because then one has direct access to the String’s 
> elements directly, using the reasonably fast access methods of a Swift 
> Collection/Array. 
> 
> I have experimented  with an alternative String struct based upon 
> Array<Character>, seeing how easy it was to implement most popular string 
> handling functions as one can work with the Character array directly. 
> 
> Currently at deep-dive-depth in the standard lib sources, especially String & 
> Co.
> 
> Kind Regards
> TedvG
> 
> 
>> On 21 Feb 2017, at 01:31, Ben Cohen > > wrote:
>> 
>> Hi Ted,
>> 
>> While Character is the Element type for String, it would be unsuitable for a 
>> String’s implementation to actually use Character for storage. Character is 
>> fairly large (currently 9 bytes), very little of which is used for most 
>> values. For unusual graphemes that require more storage, it allocates more 
>> memory on the heap. By contrast, String’s actual storage is a buffer of 1- 
>> or 2-byte elements, and all graphemes (what we expose as Characters) are 
>> held in that contiguous memory no matter how many code points they comprise. 
>> When you iterate over the string, the graphemes are unpacked into a 
>> Character on the fly. This gives you an user interface of a collection that 
>> superficially appears to resemble [Character], but this does not mean that 
>> this would be a workable implementation.
>> 
>>> On Feb 20, 2017, at 12:59 PM, Ted F.A. van Gaalen >> > wrote:
>>> 
>>> Hi Ben, Dave (you should not read this now, you’re on vacation :o)  & Others
>>> 
>>> As described in the Swift Standard Library API Reference:
>>> 
>>> The Character type represents a character made up of one or more Unicode 
>>> scalar values, 
>>> grouped by a Unicode boundary algorithm. Generally, a Character instance 
>>> matches what 
>>> the reader of a string will perceive as a single character. The number of 
>>> visible characters is 
>>> generally the most natural way to count the length of a string.
>>> The smallest discrete unit we (app programmers) are mostly working with is 
>>> this
>>> perceived visible character, what else? 
>>> 
>>> If that is the case, my reasoning is, that Strings (could / should? ) be 
>>> relatively simple, 
>>> because most, if not all, complexity of Unicode is confined within the 
>>> Character object and
>>> completely hidden**  for the average application programmer, who normally 
>>> only needs
>>> to work with 

Re: [swift-evolution] Strings in Swift 4

2017-02-22 Thread Dave Abrahams via swift-evolution
Ted, that sort of implementation grows many common strings by a factor of 8 and 
makes some less common strings require multiple memory allocations. Considering 
that our research has shown it is a big performance and energy-use win to 
heroically compress strings to avoid both kinds of bloat (plenty of actual data 
was gathered before tagged pointer strings were added to Cocoa), a scheme like 
the one you're proposing is pretty much a non-starter as far as I'm concerned. 

Sent from my moss-covered three-handled family gradunza

> On Feb 22, 2017, at 5:56 AM, Ted F.A. van Gaalen  
> wrote:
> 
> Hi Ben,
> thank you, yes, I know all that by now. 
> 
> Have seen that one goes to great lengths to optimise, not only for storage 
> but also for speed. But how far does this need to go?  In any case, 
> optimisation should not be used
> as an argument for restricting a PLs functionality that is to refrain from PL 
> elements which are common and useful.?
> 
> I wouldn’t worry so much over storage (unless one wants to load a complete 
> book into memory… in iOS, the average app is about 15-50 MB, String data is 
> mostly a fraction of that. In macOS or similar I’d think it is even less 
> significant…
> 
> I wonder how much performance and memory consumption would be different from 
> the current contiguous memory implementation?  if a String is just is a plain 
> row of (references to) Character (extended grapheme cluster) objects, 
> Array<Character>, which would simplify the basic logic and (sub)string 
> handling significantly, because then one has direct access to the String’s 
> elements directly, using the reasonably fast access methods of a Swift 
> Collection/Array. 
> 
> I have experimented  with an alternative String struct based upon 
> Array<Character>, seeing how easy it was to implement most popular string 
> handling functions as one can work with the Character array directly. 
> 
> Currently at deep-dive-depth in the standard lib sources, especially String & 
> Co.
> 
> Kind Regards
> TedvG
> 
> 
>> On 21 Feb 2017, at 01:31, Ben Cohen  wrote:
>> 
>> Hi Ted,
>> 
>> While Character is the Element type for String, it would be unsuitable for a 
>> String’s implementation to actually use Character for storage. Character is 
>> fairly large (currently 9 bytes), very little of which is used for most 
>> values. For unusual graphemes that require more storage, it allocates more 
>> memory on the heap. By contrast, String’s actual storage is a buffer of 1- 
>> or 2-byte elements, and all graphemes (what we expose as Characters) are 
>> held in that contiguous memory no matter how many code points they comprise. 
>> When you iterate over the string, the graphemes are unpacked into a 
>> Character on the fly. This gives you an user interface of a collection that 
>> superficially appears to resemble [Character], but this does not mean that 
>> this would be a workable implementation.
>> 
>>> On Feb 20, 2017, at 12:59 PM, Ted F.A. van Gaalen  
>>> wrote:
>>> 
>>> Hi Ben, Dave (you should not read this now, you’re on vacation :o)  & Others
>>> 
>>> As described in the Swift Standard Library API Reference:
>>> 
>>> The Character type represents a character made up of one or more Unicode 
>>> scalar values, 
>>> grouped by a Unicode boundary algorithm. Generally, a Character instance 
>>> matches what 
>>> the reader of a string will perceive as a single character. The number of 
>>> visible characters is 
>>> generally the most natural way to count the length of a string.
>>> The smallest discrete unit we (app programmers) are mostly working with is 
>>> this
>>> perceived visible character, what else? 
>>> 
>>> If that is the case, my reasoning is, that Strings (could / should? ) be 
>>> relatively simple, 
>>> because most, if not all, complexity of Unicode is confined within the 
>>> Character object and
>>> completely hidden**  for the average application programmer, who normally 
>>> only needs
>>> to work with Strings which contains these visible Characters, right? 
>>> It then makes no difference at all “what is in” the Character 
>>> (excellent implementation btw) 
>>> (Unicode, ASCII, EBCDIC, Elvish, KlingonIV, IntergalacticV.2, whatever),
>>> because we rely in sublime oblivion for the visual representation of 
>>> whatever is in
>>> the Character on miraculous font processors hidden in the dark depths of 
>>> the OS. 
>>> 
>>> Then, in this perspective, my question is: why is String not implemented as 
>>> directly based upon an array [Character]  ? In that case one can refer to 
>>> the Characters of the
>>> String directly, not only for direct subscripting and other String 
>>> functionality in an efficient way. 
>>> (I do have a scope of independent Swift here, that is, interaction with 
>>> libraries should be 
>>> solved by the compiler, so 

Re: [swift-evolution] Strings in Swift 4

2017-02-22 Thread Ted F.A. van Gaalen via swift-evolution
Hi Ben,
thank you, yes, I know all that by now. 

I have seen that one goes to great lengths to optimise, not only for storage but 
also for speed. But how far does this need to go?  In any case, optimisation 
should not be used as an argument for restricting a PL's functionality, that 
is, for refraining from PL elements which are common and useful.

I wouldn’t worry so much over storage (unless one wants to load a complete book 
into memory). In iOS, the average app is about 15-50 MB, and String data is 
mostly a fraction of that. In macOS or similar I’d think it is even less 
significant…

I wonder how much performance and memory consumption would differ from 
the current contiguous memory implementation if a String were just a plain 
row of (references to) Character (extended grapheme cluster) objects, an 
Array<Character>. That would simplify the basic logic and (sub)string 
handling significantly, because then one has direct access to the String’s 
elements, using the reasonably fast access methods of a Swift 
Collection/Array. 

I have experimented with an alternative String struct based upon 
Array<Character>, seeing how easy it was to implement most popular string 
handling functions, as one can work with the Character array directly. 

Currently at deep-dive-depth in the standard lib sources, especially String & 
Co.

Kind Regards
TedvG


> On 21 Feb 2017, at 01:31, Ben Cohen  wrote:
> 
> Hi Ted,
> 
> While Character is the Element type for String, it would be unsuitable for a 
> String’s implementation to actually use Character for storage. Character is 
> fairly large (currently 9 bytes), very little of which is used for most 
> values. For unusual graphemes that require more storage, it allocates more 
> memory on the heap. By contrast, String’s actual storage is a buffer of 1- or 
> 2-byte elements, and all graphemes (what we expose as Characters) are held in 
> that contiguous memory no matter how many code points they comprise. When you 
> iterate over the string, the graphemes are unpacked into a Character on the 
> fly. This gives you an user interface of a collection that superficially 
> appears to resemble [Character], but this does not mean that this would be a 
> workable implementation.
> 
>> On Feb 20, 2017, at 12:59 PM, Ted F.A. van Gaalen > > wrote:
>> 
>> Hi Ben, Dave (you should not read this now, you’re on vacation :o)  & Others
>> 
>> As described in the Swift Standard Library API Reference:
>> 
>> The Character type represents a character made up of one or more Unicode 
>> scalar values, 
>> grouped by a Unicode boundary algorithm. Generally, a Character instance 
>> matches what 
>> the reader of a string will perceive as a single character. The number of 
>> visible characters is 
>> generally the most natural way to count the length of a string.
>> The smallest discrete unit we (app programmers) are mostly working with is 
>> this
>> perceived visible character, what else? 
>> 
>> If that is the case, my reasoning is, that Strings (could / should? ) be 
>> relatively simple, 
>> because most, if not all, complexity of Unicode is confined within the 
>> Character object and
>> completely hidden**  for the average application programmer, who normally 
>> only needs
>> to work with Strings which contains these visible Characters, right? 
>> It then makes no difference at all “what is in” the Character 
>> (excellent implementation btw) 
>> (Unicode, ASCII, EBCDIC, Elvish, KlingonIV, IntergalacticV.2, whatever),
>> because we rely in sublime oblivion for the visual representation of 
>> whatever is in
>> the Character on miraculous font processors hidden in the dark depths of the 
>> OS. 
>> 
>> Then, in this perspective, my question is: why is String not implemented as 
>> directly based upon an array [Character]  ? In that case one can refer to 
>> the Characters of the
>> String directly, not only for direct subscripting and other String 
>> functionality in an efficient way. 
>> (I do have a scope of independent Swift here, that is, interaction with 
>> libraries should be 
>> solved by the compiler, so as not to be restricted by legacy ObjC etc.) 
>> 
>> **   (except if one needs to do e.g. access individual elements and/or 
>> compose graphics directly?
>>   but for this purpose the Character’s properties are accessible) 
>> 
>> For the sake of convenience, based upon the above reasoning,  I now 
>> “emulate" this in 
>> a string extension, thereby ignoring the rare cases that a visible character 
>> could be based 
>> upon more than a single Character (extended grapheme cluster). If that would 
>> occur, 
>> they should be merged into one extended grapheme cluster, a single Character 
>> that is. 
>> 
>> //: Playground - implement direct subscripting using a Character array
>> // of course, when the String is defined as an array of Characters, directly
>> // accessible it would be more 

Re: [swift-evolution] Strings in Swift 4

2017-02-20 Thread Ben Cohen via swift-evolution
Hi Ted,

While Character is the Element type for String, it would be unsuitable for a 
String’s implementation to actually use Character for storage. Character is 
fairly large (currently 9 bytes), very little of which is used for most values. 
For unusual graphemes that require more storage, it allocates more memory on 
the heap. By contrast, String’s actual storage is a buffer of 1- or 2-byte 
elements, and all graphemes (what we expose as Characters) are held in that 
contiguous memory no matter how many code points they comprise. When you 
iterate over the string, the graphemes are unpacked into a Character on the 
fly. This gives you a user interface of a collection that superficially 
appears to resemble [Character], but this does not mean that this would be a 
workable implementation.
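
To get a feel for the storage difference described above, one can compare the number of UTF-8 code units in a string with the space an array of Character would need for its elements. This is only a rough sketch, assuming Swift 4 or later. The size of Character is an implementation detail, so the second figure depends on the Swift version in use, and the sketch ignores any extra heap allocations for larger graphemes.

```swift
let s = "hello, world"

// Code units needed for the text itself in UTF-8.
let textBytes = s.utf8.count

// Bytes an array of Character would need for its elements alone.
let arrayBytes = s.count * MemoryLayout<Character>.stride

print("UTF-8 code units:", textBytes)
print("[Character] element storage:", arrayBytes)
```

For ASCII text like this the first figure is one byte per character, so the ratio between the two numbers shows directly how much a `[Character]`-backed string would grow.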

> On Feb 20, 2017, at 12:59 PM, Ted F.A. van Gaalen  
> wrote:
> 
> Hi Ben, Dave (you should not read this now, you’re on vacation :o)  & Others
> 
> As described in the Swift Standard Library API Reference:
> 
> The Character type represents a character made up of one or more Unicode 
> scalar values, 
> grouped by a Unicode boundary algorithm. Generally, a Character instance 
> matches what 
> the reader of a string will perceive as a single character. The number of 
> visible characters is 
> generally the most natural way to count the length of a string.
> The smallest discrete unit we (app programmers) are mostly working with is 
> this
> perceived visible character, what else? 
> 
> If that is the case, my reasoning is, that Strings (could / should? ) be 
> relatively simple, 
> because most, if not all, complexity of Unicode is confined within the 
> Character object and
> completely hidden**  for the average application programmer, who normally 
> only needs
> to work with Strings which contains these visible Characters, right? 
> It then makes no difference at all “what is in” the Character 
> (excellent implementation btw) 
> (Unicode, ASCII, EBCDIC, Elvish, KlingonIV, IntergalacticV.2, whatever),
> because we rely in sublime oblivion for the visual representation of 
> whatever is in
> the Character on miraculous font processors hidden in the dark depths of the 
> OS. 
> 
> Then, in this perspective, my question is: why is String not implemented as 
> directly based upon an array [Character]  ? In that case one can refer to the 
> Characters of the
> String directly, not only for direct subscripting and other String 
> functionality in an efficient way. 
> (I do have a scope of independent Swift here, that is, interaction with 
> libraries should be 
> solved by the compiler, so as not to be restricted by legacy ObjC etc.) 
> 
> **   (except if one needs to do e.g. access individual elements and/or 
> compose graphics directly?
>   but for this purpose the Character’s properties are accessible) 
> 
> For the sake of convenience, based upon the above reasoning,  I now “emulate" 
> this in 
> a string extension, thereby ignoring the rare cases that a visible character 
> could be based 
> upon more than a single Character (extended grapheme cluster). If that would 
> occur, 
> they should be merged into one extended grapheme cluster, a single Character 
> that is. 
> 
> //: Playground - implement direct subscripting using a Character array
> // of course, when the String is defined as an array of Characters, directly
> // accessible it would be more efficient as in these extension functions. 
> extension String
> {
>     var count: Int
>     {
>         get
>         {
>             return self.characters.count
>         }
>     }
> 
>     subscript (n: Int) -> String
>     {
>         return String(Array(self.characters)[n])
>     }
> 
>     subscript (r: Range<Int>) -> String
>     {
>         return String(Array(self.characters)[r])
>     }
> 
>     subscript (r: ClosedRange<Int>) -> String
>     {
>         return String(Array(self.characters)[r])
>     }
> }
> 
> func test()
> {
>     let zoo = "Koala , Snail , Penguin , Dromedary "
>     print("zoo has \(zoo.count) characters (discrete extended graphemes):")
>     for i in 0..<zoo.count {
>         print(i,zoo[i],separator: "=", terminator:" ")
>     }
>     print("\n")
>     print(zoo[0..<7])
>     print(zoo[9..<16])
>     print(zoo[18...26])
>     print(zoo[29...39])
>     print("images:" + zoo[6] + zoo[15] + zoo[26] + zoo[39])
> }
> 
> test()
> 
> this works as intended  and generates the following output:  
> 
> zoo has 40 characters (discrete extended graphemes):
> 0=K 1=o 2=a 3=l 4=a 5=  6= 7=, 8=  9=S 10=n 11=a 12=i 13=l 14=  15= 16=, 
> 17=  
> 18=P 19=e 20=n 21=g 22=u 23=i 24=n 25=  26= 27=, 28=  29=D 30=r 31=o 32=m 
> 33=e 34=d 35=a 36=r 37=y 38=  39= 
> 
> Koala 
> Snail 
> Penguin 
> Dromedary 
> images:
> 
> I don’t know how (in)efficient this method is, 
> but in many cases this is not as important as it is with, e.g., numerical computation.
> 
> I still fail to understand why direct subscripting 

Re: [swift-evolution] Strings in Swift 4

2017-02-20 Thread Ted F.A. van Gaalen via swift-evolution
Hi Ben, Dave (you should not read this now, you’re on vacation :o)  & Others

As described in the Swift Standard Library API Reference:

The Character type represents a character made up of one or more Unicode scalar 
values, 
grouped by a Unicode boundary algorithm. Generally, a Character instance 
matches what 
the reader of a string will perceive as a single character. The number of 
visible characters is 
generally the most natural way to count the length of a string.
The smallest discrete unit we (app programmers) are mostly working with is this
perceived visible character, what else? 

If that is the case, my reasoning is that Strings could (or should?) be 
relatively simple, because most, if not all, of the complexity of Unicode is 
confined within the Character object and completely hidden** from the average 
application programmer, who normally only needs to work with Strings that 
contain these visible Characters, right? 
It then makes no difference at all what is in the Character 
(excellent implementation btw) 
(Unicode, ASCII, EBCDIC, Elvish, KlingonIV, IntergalacticV.2, whatever), 
because for the visual representation of whatever is in the Character we rely, 
in sublime oblivion, on miraculous font processors hidden in the dark depths 
of the OS. 

Then, in this perspective, my question is: why is String not implemented 
directly upon an array [Character]? In that case one could refer to the 
Characters of the String directly and efficiently, for direct subscripting 
as well as for other String functionality. 
(I do have a scope of independent Swift here; that is, interaction with 
libraries should be solved by the compiler, so as not to be restricted by 
legacy ObjC etc.) 

**   (except if one needs to, e.g., access individual elements and/or compose 
graphics directly;
  for this purpose the Character’s properties are accessible) 
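
What indexing through an array of Characters looks like can be sketched as follows. This is only an illustration, assuming Swift 4 or later (where a String is itself a Collection of Character); the variable names are made up for the example. Building the array costs one O(n) pass and a copy of the contents, after which integer subscripts are O(1):

```swift
// Assumes Swift 4 or later; all names here are illustrative.
let zoo = "Koala, Snail, Penguin"
let chars = Array(zoo)                 // [Character], built in one O(n) pass
let snail = String(chars[7..<12])      // integer range subscript on the array
print(chars.count, snail)              // 21 Snail

// A Character is an extended grapheme cluster, so "e" followed by a
// combining acute accent (U+0301) still counts as a single element.
let cafe = Array("cafe\u{301}")
print(cafe.count)                      // 4
```

The trade-off is that the array is a separate copy: it has to be rebuilt whenever the String changes.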

For the sake of convenience, based upon the above reasoning, I now “emulate” 
this in a String extension, thereby ignoring the rare cases in which a 
visible character could be based upon more than a single Character 
(extended grapheme cluster). If that were to occur, they should be merged 
into one extended grapheme cluster, that is, a single Character. 

//: Playground - implement direct subscripting using a Character array
// Of course, if String were defined as a directly accessible array of
// Characters, this would be more efficient than these extension functions.
extension String
{
    var count: Int
    {
        get
        {
            return self.characters.count
        }
    }

    subscript (n: Int) -> String
    {
        return String(Array(self.characters)[n])
    }

    subscript (r: Range<Int>) -> String
    {
        return String(Array(self.characters)[r])
    }

    subscript (r: ClosedRange<Int>) -> String
    {
        return String(Array(self.characters)[r])
    }
}

func test()
{
let zoo = "Koala , Snail , Penguin , Dromedary "
print("zoo has \(zoo.count) characters (discrete extended graphemes):")
for i in 0..<zoo.count {
___
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


Re: [swift-evolution] Strings in Swift 4

2017-02-14 Thread Olivier Tardieu via swift-evolution
As suggested, I created a pull request for the String manifesto adding an 
unsafe String API discussion.
https://github.com/apple/swift/pull/7479

I included in the comments a tentative implementation in Swift 3.
https://gist.github.com/tardieu/7ca43d19b6197033dc39b138ba0e500e

I focused for now on the most essential capabilities that, hopefully, are 
not too controversial.

Regards,

Olivier


dabrah...@apple.com wrote on 01/31/2017 02:23:49 PM:

> From: Dave Abrahams <dabrah...@apple.com>
> To: Olivier Tardieu/Watson/IBM@IBMUS
> Cc: Ben Cohen <ben_co...@apple.com>, swift-evolution  evolut...@swift.org>
> Date: 01/31/2017 02:24 PM
> Subject: Re: [swift-evolution] Strings in Swift 4
> Sent by: dabrah...@apple.com
> 
> 
> on Mon Jan 30 2017, Olivier Tardieu  wrote:
> 
> > Thanks for the clarifications.
> > More comments below.
> >
> > dabrah...@apple.com wrote on 01/24/2017 05:50:59 PM:
> >
> >> Maybe it wasn't clear from the document, but the intention is that
> >> String would be able to use any model of Unicode as a backing store, and
> >> that you could easily build unsafe models of Unicode... but also that
> >> you could use your unsafe model of Unicode directly, in string-ish ways.
> >
> > I see. If I understand correctly, it will be possible for instance to 
> > implement an unsafe model of Unicode with a UInt8 code unit and a 
> > maxLengthOfEncodedScalar equal to 1 by only keeping the 8 lowest bits of 
> > Unicode scalars.
> 
> Eh... I think you'd just use an unsafe Latin-1 for that; why waste a
> bit?
> 
> Here's an example (work very much in-progress):
> https://github.com/apple/swift/blob/9defe9ded43c6f480f82a28d866ec73d803688db/test/Prototypes/Unicode.swift#L877
> 
> 
> >> > A lot of machine processing of strings continues to deal with 8-bit
> >> > quantities (even 7-bit quantities, not UTF-8).  Swift strings are
> >> > not very good at that. I see progress in the manifesto but nothing
> >> > to really close the performance gap with C.  That's where "unsafe"
> >> > mechanisms could come into play.
> >> 
> >> extendedASCII is supposed to address that.  Given a smart enough
> >> optimizer, it should be possible to become competitive with C even
> >> without using unsafe constructs.  However, we recognize the importance
> >> of being able to squeeze out that last bit of performance by dropping
> >> down to unsafe storage.
> >
> > I doubt a 32-bit encoding can bridge the performance gap with C in
> > particular because wire protocols will continue to favor compact
> > encodings.  Incoming strings will have to be expanded to the
> > extendedASCII representation before processing and probably compacted
> > afterwards. So while this may address the needs of computationally
> > intensive string processing tasks, this does not help simple parsing
> > tasks on simple strings.
> 
> I'm pretty sure it does; we're not going to change representations
> 
> extendedASCII doesn't require anything to actually be expanded to
> 32-bits per code unit, except *maybe* in a register, and then only if
> the optimizer isn't smart enough to eliminate zero-extension followed by
> comparison with a known narrow value.  You can always
> 
>   latin1.lazy.map { UInt32($0) }
> 
> to produce 32-bit code units.  All the common encodings are ASCII
> supersets, so this will “just work” for those.  The only places where it
> becomes more complicated is in encodings like Shift-JIS (which might not
> even be important enough to support as a String backing-storage format).
> 
> >
> >> > To guarantee Unicode correctness, a C string must be validated or 
> >> > transformed to be considered a Swift string.
> >> 
> >> Not really.  You can do error-correction on the fly.  However, I think
> >> pre-validation is often worthwhile because once you know something is
> >> valid it's much cheaper to decode correctly (especially for UTF-8).
> >
> > Sure. Eager vs. lazy validation is a valuable distinction, but what I am 
> > after here is side-stepping validation altogether. I understand now that 
> > user-defined encodings will make side-stepping validation possible.
> 
> Right.
> 
> >
> >> > If I understand the C String interop section correctly, in Swift 4,
> >> > this should not force a copy, but traversing the string is still
> >> > required. 
> >> 
> >> *What* should not force a copy?
> >
> > I would like to have a constructor that takes a pointer to a 
> > null-terminated se

Re: [swift-evolution] Strings in Swift 4

2017-02-13 Thread Ronald Bell via swift-evolution

> On Feb 10, 2017, at 12:38 PM, Hooman Mehr via swift-evolution wrote:
> 
>> On Feb 9, 2017, at 6:50 PM, Shawn Erickson wrote:
>> 
>> On Thu, Feb 9, 2017 at 3:45 PM Hooman Mehr wrote:
>>> On Feb 9, 2017, at 3:11 PM, Dave Abrahams wrote:
>>> 
>>> on Thu Feb 09 2017, "Ted F.A. van Gaalen" wrote:
>>> 
>>>> Hello Shawn
>>>> Just google with any programming language name and “string manipulation”
>>>> and you have enough reading for a week or so :o)
>>>> TedvG
>>> 
>>> That truly doesn't answer the question.  It's not, “why do people index
>>> strings with integers when that's the only tool they are given for
>>> decomposing strings?”  It's, “what do you have to do with strings that's
>>> hard in Swift *because* you can't index them with integers?”
>> 
>> I have done some string processing. I have not encountered any algorithm 
>> where an integer index is absolutely needed, but sometimes it might be the 
>> most convenient. 
>> 
>> For example, there are valid reasons to keep side tables that hold indexes 
>> into a string. (such as maintaining attributes that apply to a substring or 
>> things like pre-computed positions of soft line breaks). It does not require 
>> the index to be integer, but maintaining validity of those indexes after the 
>> string is mutated requires being able to offset them back or forth from some 
>> position on. These operations could be less verbose and easier if the index 
>> happens to be integer or (efficiently) supports + - operators. Also, I know 
>> there are other methods to deal with such things and mutating a large string 
>> generally is a bad idea, but sometimes it is the easiest and most convenient 
>> solution to the problem at hand.
>> 
>> The end goal of this string is human consumption, right? So such 
>> manipulation would need to be Unicode-aware in the modern world? ...or is 
>> it for some other reason?
>> 
>> -Shawn
> 
> For an example of what I mean, see the source code of 
> NS(Mutable)AttributedString 
> 
>  and note how most of the mutating methods of Mutable variant are not 
> implemented yet...
> 
> So, a good example of where such indexing would be convenient, could be 
> writing a swift-native AttributedString backed by Swift native String.

On that last topic, NSAttributedString has always seemed like a strange design 
— a class with a bunch of attribute methods with a property that lets you 
interrogate the String behind it all. 

It always seemed to me that there was a false distinction. Strings should have 
optional properties, the way GameplayKit does GKEntities and GKComponents.

Simple Strings would return nil if you asked them for their attributes.

Attributed Strings would return attributes.

I think it would be a lot more intuitive how to parse an attributed string in 
blocks and then refer back to the attributes of each chunk, for one thing.

Is there a reason why composition was chosen to be the way it is in 
NSAttributedString, instead?

- Ron


Re: [swift-evolution] Strings in Swift 4

2017-02-13 Thread Ted F.A. van Gaalen via swift-evolution

> On 11 Feb 2017, at 18:33, Dave Abrahams  wrote:
> 
> All of these examples should be efficiently and expressively handled by the 
> pattern matching API mentioned in the proposal. They definitely do not 
> require random access or integer indexing. 
> 
Hi Dave, 
then I am very interested to know how to unpack a String (e.g. read from a file 
record such as in the previous example:
123534-09EVXD4568,991234,89ABCYELLOW12AGRAINESYTEMZ3453 ) 
without using direct subscripting like str[n1…n2]? 
(which btw is for me the most straightforward and ideal method) 
Conditions:
   - The source string contains fields of known position (offset) and length, 
     concatenated together without any separators (such as those in a CSV). 
   - The contents of each field are unpredictable, 
     which excludes the use of pattern matching. 
   - The source string needs to be unpacked into independent strings. 

I made this example: (the comments also stress my point) 
//: Playground - noun: a place outside the mean and harsh production environment
//   No presidents were harmed during the production of this example. 
import UIKit
import Foundation

// The following String extension with subscript "direct access"
// functionality, included in almost each and every app I create,
// wouldn't be necessary if str[a..<b] were supported directly.
extension String
{
    subscript(i: Int) -> String
    {
        guard i >= 0 && i < characters.count else { return "" }
        return String(self[index(startIndex, offsetBy: i)])
    }

    subscript(range: Range<Int>) -> String
    {
        let lowerIndex = index(startIndex, offsetBy: max(0, range.lowerBound),
                               limitedBy: endIndex) ?? endIndex
        return substring(with: lowerIndex..<(index(lowerIndex,
            offsetBy: range.upperBound - range.lowerBound,
            limitedBy: endIndex) ?? endIndex))
    }

    subscript(range: ClosedRange<Int>) -> String
    {
        let lowerIndex = index(startIndex, offsetBy: max(0, range.lowerBound),
                               limitedBy: endIndex) ?? endIndex
        return substring(with: lowerIndex..<(index(lowerIndex,
            offsetBy: range.upperBound - range.lowerBound + 1,
            limitedBy: endIndex) ?? endIndex))
    }
}
// In the following example, the record's field positions and lengths are
// in a fixed format and will never change.
// Also, the record's contents have been validated completely by the sending
// application.

// Normally it is an input record, read from a storage medium,
// however for the purpose of this example it is defined here:

let record = "123A.534.CMCU3Arduino Due Arm 32-bit Micro controller.  000341000568002250$"

// Define a product data structure:
struct Product
{
var id :String // is key
var group: String
var name: String
var description : String
var inStock: Int
var ordered : Int
var price: Int// in cents: no Money data type in Swift available.
var currency: String

// of course one could use "set/get" properties here
// which could validate the input to this structure.

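var priceFormatted: String  // computed property.
{
get
{
let whole = price / 100
let cents = price % 100
// pad the cents to two digits, so that e.g. 2205 gives "22.05", not "22.5"
let paddedCents = cents < 10 ? "0\(cents)" : "\(cents)"
return currency + " \(whole).\(paddedCents)"
}
}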

// TODO: disable other default initiators.
init(inputrecord: String)
{
   id          = inputrecord[ 0..<10]
   group       = inputrecord[10..<14]
   name        = inputrecord[14..<30]
   description = inputrecord[30..<60]
   inStock     = Int(inputrecord[60..<70])!
   ordered     = Int(inputrecord[70..<80])!
   price       = Int(inputrecord[80..<90])!
   currency    = inputrecord[90]
}

// Add necessary business and DB logic for products here.
}


func test()
{
let product = Product(inputrecord: record)

print("=== Product data for the item with ID: \(product.id) ")
print("ID : \(product.id)")
print("group  : \(product.group)")

Re: [swift-evolution] Strings in Swift 4

2017-02-12 Thread Ben Cohen via swift-evolution
Hi Ted,

Dave is on vacation for the next two weeks, so this is a reply on behalf of 
both him and me:

> On Feb 12, 2017, at 10:17, "Ted F.A. van Gaalen" wrote:

>> On 11 Feb 2017, at 18:33, Dave Abrahams wrote:
>> 
>> All of these examples should be efficiently and expressively handled by the 
>> pattern matching API mentioned in the proposal. They definitely do not 
>> require random access or integer indexing. 
>> 
> Hi Dave, 
> then I am very interested to know how to unpack a String (e.g. read from a 
> file record such as in the previous example:
> 123534-09EVXD4568,991234,89ABCYELLOW12AGRAINESYTEMZ3453 ) 
> without using direct subscripting like str[n1…n2]? 

If you look again at the code I sent previously, it demonstrates how you can 
use lengths to move forward through a string without needing random access for 
your particular use case.

> (which btw is for me the most straightforward and ideal method) 
> Conditions:
>    - The source string contains fields of known position (offset) and length, 
>      concatenated together without any separators (such as those in a CSV). 
>    - The contents of each field are unpredictable, 
>      which excludes the use of pattern matching. 

Pattern matching isn’t just about matching known contents. Think of the regex 
“...”. This is a pattern that matches any 3 characters. While full regex 
support is out of scope for the current discussions, the intention is for the 
pattern matching part of the proposal to handle this kind of use case.

>    - The source string needs to be unpacked into independent strings. 
> 
> I made this example: (the comments also stress my point) 
> 

Here is another way of implementing your example in a form that doesn’t require 
random access.

Putting aside pattern matching for now, assume that there is an API on String 
that lets you drop a specific-length prefix from a Substring (for now in Swift 
3, that's a String). An API like this (probably taking any pattern as its 
argument, not just a length) is likely to be proposed to evolution soon once we 
move into that phase of the 4.0 String project.

// this particular API/implementation for demonstration only,
// not necessarily quite what will be proposed
extension Collection where SubSequence == Self {
/// Drop n elements from the front of `self` in-place,
/// returning the dropped prefix.
mutating func dropPrefix(_ n: IndexDistance) -> SubSequence {
// nature of error handling/swallowing/trapping/optional
// returning here TBD...
let newStart = index(startIndex, offsetBy: n)
defer { self = self[newStart..<endIndex] }
return self[startIndex..<newStart]
}
}

> Isn’t that an elegant solution or what? 

Unfortunately not. Adding integer subscripting to String via an extension that 
uses index(_:offsetBy:) is a commonly proposed idea that we strongly caution 
against. Strings use an opaque index rather than integers for a reason; 
it’s not an oversight.

The reason being: if ever your string contains more than just ASCII characters, 
then advancing a String's startIndex to the nth element becomes a 

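The prefix-dropping approach described above can be sketched in later Swift as follows. This is only an illustration under stated assumptions: it needs Swift 4 or later (for Substring), and `takePrefix` is a hypothetical helper name, not the API referred to in this message nor anything in the standard library. Each field is consumed from the front, so the record is traversed once and no integer subscript is needed:

```swift
// Assumes Swift 4 or later. `takePrefix` is a hypothetical helper,
// not a standard-library API.
extension Substring {
    /// Removes up to `n` characters from the front of `self`
    /// and returns them.
    mutating func takePrefix(_ n: Int) -> Substring {
        let head = prefix(n)
        self = dropFirst(n)
        return head
    }
}

// Unpack three fixed-width fields (10, 4 and 15 characters wide).
var rest = Substring("123A.534.CMCU3Arduino Due Arm")
let id    = String(rest.takePrefix(10))
let group = String(rest.takePrefix(4))
let name  = String(rest.takePrefix(15))
print(id, group, name, separator: " | ")
// prints "123A.534.C | MCU3 | Arduino Due Arm"
```

Because `prefix` and `dropFirst` clamp to the available length, a record that is too short yields shorter or empty fields rather than a trap; whether that is the desired error handling is a separate design decision.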
Re: [swift-evolution] Strings in Swift 4

2017-02-11 Thread Dave Abrahams via swift-evolution


Sent from my moss-covered three-handled family gradunza

> On Feb 11, 2017, at 5:16 AM, Karl Wagner  wrote:
> 
> 
>>> On 11 Feb 2017, at 04:23, Brent Royal-Gordon  wrote:
>>> 
>>> On Feb 10, 2017, at 5:49 PM, Jonathan Hull via swift-evolution 
>>>  wrote:
>>> 
>>> An easier to implement, but slightly less useful approach, would be to have 
>>> methods which take an array of indexes along with the proposed change, and 
>>> then it adjusts the indexes (or replaces them with nil if they are invalid) 
>>> as it makes the update.  For example:
>>> 
>>>func append(_ element:Element, adjusting: [Index]) -> [Index?]
>>>func appending(_ element:Element, adjusting: [Index]) -> (Self, [Index?])
>> 
>> This is a very interesting idea. A couple observations:
>> 
>> 1. The problem of adjusting indices is not just a String one. It also 
>> applies to Array, among other things.

You can think of this as a generalization of AttributedString.

>> 2. This logic could be encapsulated and reused in a separate type. For 
>> instance, imagine:
>> 
>>    var myStringProxy = IndexTracking(collection: myString, trackedIndices: 
>>        [someIndex, otherIndex])
>>    myStringProxy.insert("foo", at: otherIndex)
>>    (someIndex, otherIndex) = (myStringProxy.trackedIndices[0], 
>>        myStringProxy.trackedIndices[1])
>> 
>> Or, with a helper method:
>> 
>>myString.withTracked() { myStringProxy in
>>myStringProxy.insert("foo", at: otherIndex)
>>}

You can't adjust indices in arbitrary RangeReplaceableCollections without 
penalizing the performance of all RangeReplaceableCollections. Also, to do it 
without introducing reference semantics you need to bundle the index storage 
with the collection or explicitly make the collection of indices to be updated 
available as input to the range replacement methods. Given the latter, you could 
build something like this to implement the former:

struct IndexTracked 

Also, you probably want this thing to adjust ranges rather than indices, because 
otherwise you need to decide whether to adjust an index when there is an 
insertion at that position.   Does it stick to the left or right element?
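
The left-or-right question can be made concrete with plain integer offsets. The sketch below is purely illustrative: `Affinity` and `adjusted` are hypothetical names invented for this example, and it works on Int offsets rather than on real collection indices.

```swift
// Hypothetical names, for illustration only.
enum Affinity { case left, right }

/// Returns `offsets` as they should read after `count` elements have been
/// inserted at `position`. An offset equal to `position` moves only when
/// it sticks to the element on its right.
func adjusted(_ offsets: [Int], afterInserting count: Int,
              at position: Int, affinity: Affinity) -> [Int] {
    return offsets.map { offset in
        if offset > position { return offset + count }
        if offset == position && affinity == .right { return offset + count }
        return offset
    }
}

let tracked = [2, 5, 9]
let stickLeft  = adjusted(tracked, afterInserting: 3, at: 5, affinity: .left)
let stickRight = adjusted(tracked, afterInserting: 3, at: 5, affinity: .right)
print(stickLeft)   // [2, 5, 12]
print(stickRight)  // [2, 8, 12]
```

With ranges instead of single positions the ambiguity mostly disappears, since a range's lower and upper bounds can each be given the natural affinity.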

>> 3. An obstacle to doing this correctly is that a collection's index 
>> invalidation behavior is not expressed in the type system.

I don't see why that's an issue. 

>> If there were a protocol like:
>> 
>>protocol RangeReplaceableWithEarlierIndexesStableCollection: 
>> RangeReplaceableCollection {}

There's one interesting wrinkle on invalidation I discovered recently: there is 
an important class of indices that are not invalidated as positions when they 
precede the change, but may be invalidated for movement: those that store some 
cached information about following elements, such as transcoded Unicode code 
units. 

>> 
>> That would help us here.
>> 
>> -- 
>> Brent Royal-Gordon
>> Architechies
>> 
> 
> 
> I mentioned this much earlier in the thread. My preferred solution would be 
> some kind of RRC-like protocol where mutating methods returned an associated 
> “IndexDisplacement” type. That IndexDisplacement would store, for each 
> operation, the offset and number of index-positions which have been 
> inserted/removed, and know how to translate an index in the previous state in 
> to one in the new state.
> 
> You would still need to manually adjust your stored indexes using that 
> IndexDisplacement, but it’d be less error-prone as the logic is written for 
> you.
> 
> The standard (non-IndexDisplacement-returning) RRC methods could then be 
> implemented as wrappers which discard the displacement.
> 
> - Karl


Re: [swift-evolution] Strings in Swift 4

2017-02-11 Thread Dave Abrahams via swift-evolution


Sent from my moss-covered three-handled family gradunza

> On Feb 9, 2017, at 3:45 PM, Hooman Mehr  wrote:
> 
> 
>> On Feb 9, 2017, at 3:11 PM, Dave Abrahams  wrote:
>> 
>> 
>> on Thu Feb 09 2017, "Ted F.A. van Gaalen"  wrote:
>> 
>>> Hello Shawn
>>> Just google with any programming language name and “string manipulation”
>>> and you have enough reading for a week or so :o)
>>> TedvG
>> 
>> That truly doesn't answer the question.  It's not, “why do people index
>> strings with integers when that's the only tool they are given for
>> decomposing strings?”  It's, “what do you have to do with strings that's
>> hard in Swift *because* you can't index them with integers?”
> 
> I have done some string processing. I have not encountered any algorithm 
> where an integer index is absolutely needed, but sometimes it might be the 
> most convenient. 
> 
> For example, there are valid reasons to keep side tables that hold indexes 
> into a string. (such as maintaining attributes that apply to a substring or 
> things like pre-computed positions of soft line breaks). It does not require 
> the index to be integer, but maintaining validity of those indexes after the 
> string is mutated requires being able to offset them back or forth from some 
> position on. These operations could be less verbose and easier if the index 
> happens to be integer or (efficiently) supports + - operators. Also, I know 
> there are other methods to deal with such things and mutating a large string 
> generally is a bad idea, but sometimes it is the easiest and most convenient 
> solution to the problem at hand.

As noted in the manifesto, it will be trivial to translate string indices 
to/from integer code unit offsets, most likely via a property and an init.

String indices as proposed will not, however, support +/-.
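
For reference, later Swift releases (Swift 5 and onward) do offer such a translation for UTF-16 code unit offsets, through `String.Index.utf16Offset(in:)` and `String.Index(utf16Offset:in:)`. Whether that matches what the manifesto had in mind at the time is an assumption; the sketch below only shows the round trip:

```swift
// Assumes Swift 5 or later.
let s = "\u{1F428} Koala"                  // koala emoji, space, "Koala"
let index = s.firstIndex(of: "K")!
let offset = index.utf16Offset(in: s)      // the emoji takes 2 UTF-16 units
let roundTripped = String.Index(utf16Offset: offset, in: s)
print(offset, s[roundTripped])             // 3 K
```

The offset counts code units, not Characters, which is why the "K" sits at offset 3 even though it is the third Character of the string.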

> 
>> 
 On 9 Feb 2017, at 16:48, Shawn Erickson  wrote:
 
 I also wonder what folks are actually doing that require indexing
 into strings. I would love to see some real world examples of what
 and why indexing into a string is needed. Who is the end consumer of
 that string, etc.
 
 Do folks have so examples?
 
 -Shawn
 
 On Thu, Feb 9, 2017 at 6:56 AM Ted F.A. van Gaalen via swift-evolution 
 > wrote:
 Hello Hooman
 That invalidates my assumptions, thanks for evaluating
 it's more complex than I thought.
 Kind Regards
 Ted
 
>> On 8 Feb 2017, at 00:07, Hooman Mehr > > wrote:
>> 
>> 
>> On Feb 7, 2017, at 12:19 PM, Ted F.A. van Gaalen via swift-evolution 
>> > wrote:
>> 
>> I now assume that:
>>  1. -= a “plain” Unicode character (codepoint?)  can result in one 
>> glyph.=-
> 
> What do you mean by “plain”? Characters in some Unicode scripts are
> by no means “plain”. They can affect (and be affected by) the
> characters around them, they can cause glyphs around them to
> rearrange or combine (like ligatures) or their visual
> representation (glyph) may float in the same space as an adjacent
> glyph (and seem to be part of the “host” glyph), etc. So, the
> general relationship of a character and its corresponding glyph (if
> there is one) is complex and depends on context and surroundings
> characters.
> 
>>  2. -= a  grapheme cluster always results in just a single glyph, 
>> true? =- 
> 
> False
> 
>>  3. The only thing that I can see on screen or print are glyphs 
>> (“carvings”,visual elements that stand on their own )
> 
> The visible effect might not be a visual shape. It may be for example, 
> the way the surrounding shapes change or re-arrange.
> 
>> 4.  In this context, a glyph is a humanly recognisable visual form 
>> of a character,
> 
> Not in a straightforward one to one fashion, not even in Latin / Roman 
> script.
> 
>> 5. On this level (the glyph, what I can see as a user) it is not 
>> relevant and also not detectable
>> with how many Unicode scalars (codepoints ?), grapheme, or even 
>> on what kind
>> of encoding the glyph was based upon.
> 
> False
> 
 
>>> 
>>> 
>> 
>> -- 
>> -Dave
> 


Re: [swift-evolution] Strings in Swift 4

2017-02-11 Thread Dave Abrahams via swift-evolution
One of the major (so far unstated) goals of the String rethink is to eliminate 
reasons for people to process textual data outside of String, though. You 
shouldn't have to use an array of bytes to get performant processing of ASCII, 
for example.  

Sent from my moss-covered three-handled family gradunza

> On Feb 9, 2017, at 6:56 PM, Shawn Erickson  wrote:
> 
> 
> On Thu, Feb 9, 2017 at 5:09 PM Ted F.A. van Gaalen  
> wrote:
>>> On 10 Feb 2017, at 00:11, Dave Abrahams  wrote:
>>> 
>>> 
>>> on Thu Feb 09 2017, "Ted F.A. van Gaalen"  wrote:
>>> 
>>>> Hello Shawn
>>>> Just google with any programming language name and “string manipulation”
>>>> and you have enough reading for a week or so :o)
>>>> TedvG
>>> 
>>> That truly doesn't answer the question.  It's not, “why do people index
>>> strings with integers when that's the only tool they are given for
>>> decomposing strings?”  It's, “what do you have to do with strings that's
>>> hard in Swift *because* you can't index them with integers?”
>> 
>> Hi Dave,
>> Ok. here are just a few examples: 
>> Parsing and validating an ISBN code? Or a (freight) container ID? Or EAN13 
>> perhaps? 
>> Or many of the typical combined article codes and product IDs that many 
>> factories and shops use? 
>> 
>> or: 
>> 
>> E.g. processing legacy files from IBM mainframes:
>> extract fields from ancient data records read from very old sequential files,
>> say, a product data record like this from a file from 1978 you’d have to 
>> unpack and process:   
>> 123534-09EVXD4568,991234,89ABCYELLOW12AGRAINESYTEMZ3453
>> into:
>> 123, 534, -09, EVXD45, 68,99, 1234,89, ABC, YELLOW, 12A, GRAIN, ESYTEM, 
>> Z3453.
>> product category, pcs, discount code, product code, price Yen, price $, 
>> class code, etc… 
>> In Cobol and PL/1, records are nearly always defined with a fixed field 
>> layout like this
>> (storage was limited and very, very expensive; e.g. XML would have been 
>> regarded as a "scandalous waste", and even the commas in CSV files!):
>> 
>> 01  MAILING-RECORD.
>>     05  COMPANY-NAME            PIC X(30).
>>     05  CONTACTS.
>>         10  PRESIDENT.
>>             15  LAST-NAME       PIC X(15).
>>             15  FIRST-NAME      PIC X(8).
>>         10  VP-MARKETING.
>>             15  LAST-NAME       PIC X(15).
>>             15  FIRST-NAME      PIC X(8).
>>         10  ALTERNATE-CONTACT.
>>             15  TITLE           PIC X(10).
>>             15  LAST-NAME       PIC X(15).
>>             15  FIRST-NAME      PIC X(8).
>>     05  ADDRESS                 PIC X(15).
>>     05  CITY                    PIC X(15).
>>     05  STATE                   PIC XX.
>>     05  ZIP                     PIC 9(5).
>> 
>> These are all character data fields here, except for the numeric ZIP field; 
>> however, in Cobol it can be treated like character data. 
>> So here I am, having to get the data of these old Cobol production files
>> into a brand new Swift based accounting system of 2017, what can I do?   
>> 
>> How do I unpack these records and bring the data into a Swift structure or 
>> class? 
>> (In Cobol I don’t have to because of the predefined fixed format record 
>> layout).
>> 
>> AFAIK there are no similar record structures with fixed fields like this 
>> available in Swift?
>> 
>> So, the only way I can think of right now is to do it like this:
>> 
>> // mailingRecord is an instance of this Swift structure
>> struct MailingRecord
>> {
>>     var companyName: String = "no Name"
>>     var contacts: CompanyContacts
>>     // ... etc.
>> }
>> 
>> // recordStr was read here with ASCII encoding
>> 
>> // unpack data in to structure’s properties, in this case all are Strings
>> mailingRecord.companyName   = recordStr[ 0..<30]
>> mailingRecord.contacts.president.lastName  = recordStr[30..<45]
>> mailingRecord.contacts.president.firstName = recordStr[45..<53]
>> 
>> 
>> // and so on..
>> 
>> Ever worked for, e.g., a bank with thousands of these files, their formats 
>> unchanged for years?
>> 
>> Are there any alternative, convenient and simpler methods in Swift at present? 
> These look like examples of fixed data formats that could be parsed from a byte 
> buffer into strings, etc. There is likely little need to force them through a 
> higher-order string concept, at least not until they are unpacked from their 
> compact byte form.
> 
> -Shawn 


Re: [swift-evolution] Strings in Swift 4

2017-02-11 Thread Dave Abrahams via swift-evolution
All of these examples should be efficiently and expressively handled by the 
pattern matching API mentioned in the proposal. They definitely do not require 
random access or integer indexing. 
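
As an illustration of the claim for one of the examples in question (validating an EAN-13 code), a single sequential pass over the string's UTF-8 view is enough. This is a sketch in later Swift, not the pattern matching API mentioned in the proposal; `isValidEAN13` is a hypothetical function name:

```swift
// Hypothetical helper; validates an EAN-13 code in one forward pass
// over the UTF-8 code units, without any integer subscripting.
func isValidEAN13(_ code: String) -> Bool {
    let zero = UInt8(ascii: "0"), nine = UInt8(ascii: "9")
    var sum = 0
    var length = 0
    for (position, byte) in code.utf8.enumerated() {
        // Anything that is not an ASCII digit makes the code invalid.
        guard byte >= zero && byte <= nine else { return false }
        let digit = Int(byte - zero)
        // Weights alternate 1, 3, 1, 3, ... starting from the first digit.
        sum += position % 2 == 0 ? digit : digit * 3
        length += 1
    }
    return length == 13 && sum % 10 == 0
}

print(isValidEAN13("4006381333931"))  // true
print(isValidEAN13("4006381333932"))  // false (wrong check digit)
```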

Sent from my moss-covered three-handled family gradunza

> On Feb 9, 2017, at 5:09 PM, Ted F.A. van Gaalen  wrote:
> 
> 
>> On 10 Feb 2017, at 00:11, Dave Abrahams  wrote:
>> 
>> 
>> on Thu Feb 09 2017, "Ted F.A. van Gaalen"  wrote:
>> 
>>> Hello Shawn
>>> Just google with any programming language name and “string manipulation”
>>> and you have enough reading for a week or so :o)
>>> TedvG
>> 
>> That truly doesn't answer the question.  It's not, “why do people index
>> strings with integers when that's the only tool they are given for
>> decomposing strings?”  It's, “what do you have to do with strings that's
>> hard in Swift *because* you can't index them with integers?”
> 
> Hi Dave,
> Ok. here are just a few examples: 
> Parsing and validating an ISBN code? Or a (freight) container ID? Or EAN13 
> perhaps? 
> Or many of the typical combined article codes and product IDs that many 
> factories and shops use? 
> 
> or: 
> 
> E.g. processing legacy files from IBM mainframes:
> extract fields from ancient data records read from very old sequential files,
> say, a product data record like this from a file from 1978 you’d have to 
> unpack and process:   
> 123534-09EVXD4568,991234,89ABCYELLOW12AGRAINESYTEMZ3453
> into:
> 123, 534, -09, EVXD45, 68,99, 1234,89, ABC, YELLOW, 12A, GRAIN, ESYTEM, 
> Z3453.
> product category, pcs, discount code, product code, price Yen, price $, class 
> code, etc… 
> In Cobol and PL/1, records are nearly always defined with a fixed field layout 
> like this
> (storage was limited and very, very expensive; e.g. XML would have been 
> regarded as a "scandalous waste", and even the commas in CSV files!):
> 
> 01  MAILING-RECORD.
>     05  COMPANY-NAME            PIC X(30).
>     05  CONTACTS.
>         10  PRESIDENT.
>             15  LAST-NAME       PIC X(15).
>             15  FIRST-NAME      PIC X(8).
>         10  VP-MARKETING.
>             15  LAST-NAME       PIC X(15).
>             15  FIRST-NAME      PIC X(8).
>         10  ALTERNATE-CONTACT.
>             15  TITLE           PIC X(10).
>             15  LAST-NAME       PIC X(15).
>             15  FIRST-NAME      PIC X(8).
>     05  ADDRESS                 PIC X(15).
>     05  CITY                    PIC X(15).
>     05  STATE                   PIC XX.
>     05  ZIP                     PIC 9(5).
> 
> These are all character data fields, except for the numeric ZIP field;
> in Cobol, however, it can be treated like character data.
> So here I am, having to get the data from these old Cobol production files
> into a brand-new Swift-based accounting system of 2017. What can I do?
> 
> How do I unpack these records and bring the data into a Swift structure or
> class?
> (In Cobol I don’t have to, because of the predefined fixed-format record
> layout.)
> 
> AFAIK there are no similar record structures with fixed fields like this
> available in Swift?
> 
> So, the only way I can think of right now is to do it like this:
> 
> // mailingRecord is a Swift structure
> struct MailingRecord
> {
>     var companyName: String = "no Name"
>     var contacts: CompanyContacts
>     // etc.
> }
> 
> // recordStr was read here with ASCII encoding
> 
> // unpack data into the structure’s properties, in this case all are Strings
> mailingRecord.companyName                  = recordStr[ 0..<30]
> mailingRecord.contacts.president.lastName  = recordStr[30..<45]
> mailingRecord.contacts.president.firstName = recordStr[45..<53]
> 
> 
> // and so on..
> 
> Ever worked for e.g. a bank with thousands of these files, their formats
> unchanged for years?
> 
> Are there any alternative, more convenient and simpler methods in Swift?
> 
> Kind Regards
> TedvG
> ( example of the above Cobol record borrowed from here: 
>  
> http://www.3480-3590-data-conversion.com/article-reading-cobol-layouts-1.html 
>  ) 
> 
> 
> 
> 
>> 
 On 9 Feb 2017, at 16:48, Shawn Erickson  wrote:
 
 I also wonder what folks are actually doing that require indexing
 into strings. I would love to see some real world examples of what
 and why indexing into a string is needed. Who is the end consumer of
 that string, etc.
 
 Do folks have some examples?
 
 -Shawn
 
 On Thu, Feb 9, 2017 at 6:56 AM Ted F.A. van Gaalen via swift-evolution wrote:
 Hello Hooman
 That invalidates my assumptions, thanks for evaluating
 it's more complex than I thought.
 Kind Regards
 Ted
 
>> On 8 Feb 2017, at 00:07, Hooman Mehr wrote:
>> 
>> 
>> On Feb 7, 2017, at 12:19 PM, Ted F.A. van Gaalen via 

Re: [swift-evolution] Strings in Swift 4

2017-02-11 Thread Karl Wagner via swift-evolution

> On 11 Feb 2017, at 04:23, Brent Royal-Gordon  wrote:
> 
>> On Feb 10, 2017, at 5:49 PM, Jonathan Hull via swift-evolution 
>>  wrote:
>> 
>> An easier to implement, but slightly less useful approach, would be to have 
>> methods which take an array of indexes along with the proposed change, and 
>> then it adjusts the indexes (or replaces them with nil if they are invalid) 
>> as it makes the update.  For example:
>> 
>>  func append(_ element:Element, adjusting: [Index]) -> [Index?]
>>  func appending(_ element:Element, adjusting: [Index]) -> (Self, 
>> [Index?])
> 
> This is a very interesting idea. A couple observations:
> 
> 1. The problem of adjusting indices is not just a String one. It also applies 
> to Array, among other things.
> 
> 2. This logic could be encapsulated and reused in a separate type. For 
> instance, imagine:
> 
>   let myStringProxy = IndexTracking(collection: myString, trackedIndices: 
> [someIndex, otherIndex])
>   myStringProxy.insert("foo", at: otherIndex)
>   (someIndex, otherIndex) = (myStringProxy.trackedIndices[0], 
> myStringProxy.trackedIndices[1])
> 
> Or, with a helper method:
> 
>   myString.withTracked() { myStringProxy in
>   myStringProxy.insert("foo", at: otherIndex)
>   }
> 
> 3. An obstacle to doing this correctly is that a collection's index 
> invalidation behavior is not expressed in the type system. If there were a 
> protocol like:
> 
>   protocol RangeReplaceableWithEarlierIndexesStableCollection: 
> RangeReplaceableCollection {}
> 
> That would help us here.
> 
> -- 
> Brent Royal-Gordon
> Architechies
> 


I mentioned this much earlier in the thread. My preferred solution would be 
some kind of RRC-like (RangeReplaceableCollection-like) protocol whose mutating 
methods return an associated “IndexDisplacement” type. That IndexDisplacement 
would store, for each operation, the offset and the number of index positions 
that were inserted/removed, and would know how to translate an index in the 
previous state into one in the new state.

You would still need to manually adjust your stored indexes using that 
IndexDisplacement, but it’d be less error-prone as the logic is written for you.

The standard (non-IndexDisplacement-returning) RRC methods could then be 
implemented as wrappers which discard the displacement.
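
A minimal sketch of the IndexDisplacement idea described above, restricted to Array so that positions are plain Int offsets. The struct layout and the names translate(_:), replaceSubrangeTracked(_:with:) and plainInsert(_:at:) are hypothetical and are not part of the standard library; a real design would use an associated type on a protocol rather than a concrete struct.

```swift
/// Hypothetical record of how one mutation shifted positions.
struct IndexDisplacement {
    let offset: Int     // position where the change started
    let removed: Int    // number of positions removed at `offset`
    let inserted: Int   // number of positions inserted at `offset`

    /// Translates an offset that was valid before the mutation into one that
    /// is valid after it, or returns `nil` if that position was removed.
    func translate(_ old: Int) -> Int? {
        if old < offset { return old }
        if old < offset + removed { return nil }
        return old - removed + inserted
    }
}

extension Array {
    /// Like `replaceSubrange(_:with:)`, but reports the displacement.
    mutating func replaceSubrangeTracked<C: Collection>(
        _ range: Range<Int>, with newElements: C
    ) -> IndexDisplacement where C.Element == Element {
        let displacement = IndexDisplacement(offset: range.lowerBound,
                                             removed: range.count,
                                             inserted: newElements.count)
        replaceSubrange(range, with: newElements)
        return displacement
    }

    /// A plain insert written as a wrapper that discards the displacement.
    mutating func plainInsert(_ element: Element, at i: Int) {
        _ = replaceSubrangeTracked(i..<i, with: CollectionOfOne(element))
    }
}

var letters = ["a", "b", "c", "d"]
let storedIndexOfD = 3
// Replace "b" with three new elements and keep the stored index in sync.
let shift = letters.replaceSubrangeTracked(1..<2, with: ["x", "y", "z"])
let newIndexOfD = shift.translate(storedIndexOfD)
```

The caller still has to run each stored index through translate(_:), but the off-by-one-prone arithmetic lives in one place.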

- Karl


Re: [swift-evolution] Strings in Swift 4

2017-02-10 Thread Brent Royal-Gordon via swift-evolution
> On Feb 10, 2017, at 5:49 PM, Jonathan Hull via swift-evolution 
>  wrote:
> 
> An easier to implement, but slightly less useful approach, would be to have 
> methods which take an array of indexes along with the proposed change, and 
> then it adjusts the indexes (or replaces them with nil if they are invalid) 
> as it makes the update.  For example:
> 
>   func append(_ element: Element, adjusting: [Index]) -> [Index?]
>   func appending(_ element: Element, adjusting: [Index]) -> (Self, [Index?])

This is a very interesting idea. A couple observations:

1. The problem of adjusting indices is not just a String one. It also applies 
to Array, among other things.

2. This logic could be encapsulated and reused in a separate type. For 
instance, imagine:

  let myStringProxy = IndexTracking(collection: myString,
                                    trackedIndices: [someIndex, otherIndex])
  myStringProxy.insert("foo", at: otherIndex)
  (someIndex, otherIndex) = (myStringProxy.trackedIndices[0],
                             myStringProxy.trackedIndices[1])

Or, with a helper method:

  myString.withTracked() { myStringProxy in
    myStringProxy.insert("foo", at: otherIndex)
  }

3. An obstacle to doing this correctly is that a collection's index 
invalidation behavior is not expressed in the type system. If there were a 
protocol like:

  protocol RangeReplaceableWithEarlierIndexesStableCollection: RangeReplaceableCollection {}

That would help us here.
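
As a rough illustration of point 2, here is one way such a wrapper could be sketched in current Swift. IndexTracking is a hypothetical type, not a standard library one, and this version is a struct (so it needs `var`) that supports insertion only. It assumes that positions can be stored as offsets from startIndex and that inserted elements do not merge with their neighbours; for String, that assumption can fail with combining characters.

```swift
/// Hypothetical wrapper that keeps a set of indices valid across insertions.
struct IndexTracking<Base: RangeReplaceableCollection> {
    private(set) var collection: Base
    private var offsets: [Int]   // tracked positions, stored as offsets

    init(collection: Base, trackedIndices: [Base.Index]) {
        self.collection = collection
        self.offsets = trackedIndices.map {
            collection.distance(from: collection.startIndex, to: $0)
        }
    }

    /// The tracked positions as indices into the current collection.
    var trackedIndices: [Base.Index] {
        return offsets.map {
            collection.index(collection.startIndex, offsetBy: $0)
        }
    }

    mutating func insert<C: Collection>(contentsOf newElements: C,
                                        at i: Base.Index)
        where C.Element == Base.Element {
        let position = collection.distance(from: collection.startIndex, to: i)
        let countBefore = collection.count
        collection.insert(contentsOf: newElements, at: i)
        // Measure the growth instead of trusting newElements.count.
        let added = collection.count - countBefore
        offsets = offsets.map { $0 >= position ? $0 + added : $0 }
    }
}

let text = "Hello world"
let w = text.firstIndex(of: "w")!
var proxy = IndexTracking(collection: text, trackedIndices: [w])
proxy.insert(contentsOf: "big ", at: w)
// After the mutation `w` is stale; use proxy.trackedIndices[0] instead.
let trackedCharacter = proxy.collection[proxy.trackedIndices[0]]
```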

-- 
Brent Royal-Gordon
Architechies



Re: [swift-evolution] Strings in Swift 4

2017-02-10 Thread Jonathan Hull via swift-evolution
Ok, appending was a dumb example (It has been a long week).  Imagine the same 
idea with insert/remove…

Thanks,
Jon

> On Feb 10, 2017, at 5:49 PM, Jonathan Hull via swift-evolution 
>  wrote:
> 
> This is the biggest need I have from strings (and collections) that is not 
> being met, and is I think why people reach for integers.  I have a stored 
> index which points to something important, and if the string/collection is 
> edited, I now have to update the index to be correct.  Lots of chances to 
> screw up (e.g. off by 1 errors) if I am not super careful.
> 
> I would much rather have that dealt with by the string/collection itself, so 
> that I can think about my larger project instead of keeping everything in 
> sync.
> 
> My preferred design for this would be to have two types of index. An internal 
> index (what we have now) which is fast, efficient and transient, and a stable 
> index which will always point to the same item despite having added or 
> removed other items (or be testably invalid if the item pointed to has been 
> removed).  For strings, this means the stable index would point to the same 
> characters even if the string has been edited (as long as those characters 
> are still there).
> 
> I know the second isn’t useful for algorithms in the standard library, but it 
> is so useful for things like storing user selections… and it is very easy 
> to foot-gun when trying to do it yourself.  Keeping stored indexes in sync is 
> among my top annoyances while programming.
> 
> An easier to implement, but slightly less useful approach, would be to have 
> methods which take an array of indexes along with the proposed change, and 
> then it adjusts the indexes (or replaces them with nil if they are invalid) 
> as it makes the update.  For example:
> 
>   func append(_ element:Element, adjusting: [Index]) -> [Index?]
>   func appending(_ element:Element, adjusting: [Index]) -> (Self, 
> [Index?])
> 
> Thanks,
> Jon
> 
> 
>> On Feb 9, 2017, at 3:45 PM, Hooman Mehr via swift-evolution wrote:
>> 
>> 
>>> On Feb 9, 2017, at 3:11 PM, Dave Abrahams wrote:
>>> 
>>> 
>>> on Thu Feb 09 2017, "Ted F.A. van Gaalen" wrote:
>>> 
 Hello Shawn
 Just google with any programming language name and “string manipulation”
 and you have enough reading for a week or so :o)
 TedvG
>>> 
>>> That truly doesn't answer the question.  It's not, “why do people index
>>> strings with integers when that's the only tool they are given for
>>> decomposing strings?”  It's, “what do you have to do with strings that's
>>> hard in Swift *because* you can't index them with integers?”
>> 
>> I have done some string processing. I have not encountered any algorithm 
>> where an integer index is absolutely needed, but sometimes it might be the 
>> most convenient. 
>> 
>> For example, there are valid reasons to keep side tables that hold indexes 
>> into a string. (such as maintaining attributes that apply to a substring or 
>> things like pre-computed positions of soft line breaks). It does not require 
>> the index to be integer, but maintaining validity of those indexes after the 
>> string is mutated requires being able to offset them back or forth from some 
>> position on. These operations could be less verbose and easier if the index 
>> happens to be integer or (efficiently) supports + - operators. Also, I know 
>> there are other methods to deal with such things and mutating a large string 
>> generally is a bad idea, but sometimes it is the easiest and most convenient 
>> solution to the problem at hand.
>> 
>> 
>>> 
> On 9 Feb 2017, at 16:48, Shawn Erickson wrote:
> 
> I also wonder what folks are actually doing that requires indexing
> into strings. I would love to see some real world examples of what
> and why indexing into a string is needed. Who is the end consumer of
> that string, etc.
> 
> Do folks have some examples?
> 
> -Shawn
> 
> On Thu, Feb 9, 2017 at 6:56 AM Ted F.A. van Gaalen via swift-evolution wrote:
> Hello Hooman
> That invalidates my assumptions, thanks for evaluating
> it's more complex than I thought.
> Kind Regards
> Ted
> 
>> On 8 Feb 2017, at 00:07, Hooman Mehr wrote:
>> 
>> 
>>> On Feb 7, 2017, at 12:19 PM, Ted F.A. van Gaalen via swift-evolution 
>>>  
>>> 

Re: [swift-evolution] Strings in Swift 4

2017-02-10 Thread Jonathan Hull via swift-evolution
This is the biggest need I have from strings (and collections) that is not 
being met, and is I think why people reach for integers.  I have a stored index 
which points to something important, and if the string/collection is edited, I 
now have to update the index to be correct.  Lots of chances to screw up (e.g. 
off by 1 errors) if I am not super careful.

I would much rather have that dealt with by the string/collection itself, so 
that I can think about my larger project instead of keeping everything in sync.

My preferred design for this would be to have two types of index. An internal 
index (what we have now) which is fast, efficient and transient, and a stable 
index which will always point to the same item despite having added or removed 
other items (or be testably invalid if the item pointed to has been removed).  
For strings, this means the stable index would point to the same characters 
even if the string has been edited (as long as those characters are still 
there).

I know the second isn’t useful for algorithms in the standard library, but it 
is so useful for things like storing user selections… and it is very easy 
to foot-gun when trying to do it yourself.  Keeping stored indexes in sync is 
among my top annoyances while programming.

An easier to implement, but slightly less useful approach, would be to have 
methods which take an array of indexes along with the proposed change, and then 
it adjusts the indexes (or replaces them with nil if they are invalid) as it 
makes the update.  For example:

  func append(_ element: Element, adjusting: [Index]) -> [Index?]
  func appending(_ element: Element, adjusting: [Index]) -> (Self, [Index?])
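
To make the "adjusting" idea concrete, here is a small sketch for insert and remove in current Swift. These methods are hypothetical and do not exist in the standard library. The sketch assumes positions can be remembered as offsets from startIndex, which holds for Array and for most String content, but not when an inserted Character combines with its neighbour.

```swift
extension RangeReplaceableCollection {
    /// Inserts `element` at `position` and returns the `tracked` indices
    /// re-expressed as indices into the mutated collection.
    mutating func insert(_ element: Element, at position: Index,
                         adjusting tracked: [Index]) -> [Index?] {
        // Remember positions as offsets, since the mutation may invalidate
        // the old indices.
        let insertionOffset = distance(from: startIndex, to: position)
        let offsets = tracked.map { distance(from: startIndex, to: $0) }
        insert(element, at: position)
        return offsets.map { offset -> Index? in
            // Items at or after the insertion point move one step right.
            let shifted = offset >= insertionOffset ? offset + 1 : offset
            return index(startIndex, offsetBy: shifted)
        }
    }

    /// Removes the element at `position`. A tracked index that pointed at
    /// the removed element comes back as `nil`.
    mutating func remove(at position: Index,
                         adjusting tracked: [Index]) -> [Index?] {
        let removalOffset = distance(from: startIndex, to: position)
        let offsets = tracked.map { distance(from: startIndex, to: $0) }
        remove(at: position)
        return offsets.map { offset -> Index? in
            if offset == removalOffset { return nil }
            let shifted = offset > removalOffset ? offset - 1 : offset
            return index(startIndex, offsetBy: shifted)
        }
    }
}

var s = "hello"
let e = s.index(s.startIndex, offsetBy: 1)   // points at "e"
let o = s.index(s.startIndex, offsetBy: 4)   // points at "o"
let adjusted = s.insert("X", at: s.index(s.startIndex, offsetBy: 2),
                        adjusting: [e, o])
```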

Thanks,
Jon


> On Feb 9, 2017, at 3:45 PM, Hooman Mehr via swift-evolution wrote:
> 
> 
>> On Feb 9, 2017, at 3:11 PM, Dave Abrahams wrote:
>> 
>> 
>> on Thu Feb 09 2017, "Ted F.A. van Gaalen" wrote:
>> 
>>> Hello Shawn
>>> Just google with any programming language name and “string manipulation”
>>> and you have enough reading for a week or so :o)
>>> TedvG
>> 
>> That truly doesn't answer the question.  It's not, “why do people index
>> strings with integers when that's the only tool they are given for
>> decomposing strings?”  It's, “what do you have to do with strings that's
>> hard in Swift *because* you can't index them with integers?”
> 
> I have done some string processing. I have not encountered any algorithm 
> where an integer index is absolutely needed, but sometimes it might be the 
> most convenient. 
> 
> For example, there are valid reasons to keep side tables that hold indexes 
> into a string. (such as maintaining attributes that apply to a substring or 
> things like pre-computed positions of soft line breaks). It does not require 
> the index to be integer, but maintaining validity of those indexes after the 
> string is mutated requires being able to offset them back or forth from some 
> position on. These operations could be less verbose and easier if the index 
> happens to be integer or (efficiently) supports + - operators. Also, I know 
> there are other methods to deal with such things and mutating a large string 
> generally is a bad idea, but sometimes it is the easiest and most convenient 
> solution to the problem at hand.
> 
> 
>> 
 On 9 Feb 2017, at 16:48, Shawn Erickson wrote:
 
 I also wonder what folks are actually doing that requires indexing
 into strings. I would love to see some real world examples of what
 and why indexing into a string is needed. Who is the end consumer of
 that string, etc.
 
 Do folks have some examples?
 
 -Shawn
 
 On Thu, Feb 9, 2017 at 6:56 AM Ted F.A. van Gaalen via swift-evolution wrote:
 Hello Hooman
 That invalidates my assumptions, thanks for evaluating
 it's more complex than I thought.
 Kind Regards
 Ted
 
> On 8 Feb 2017, at 00:07, Hooman Mehr wrote:
> 
> 
>> On Feb 7, 2017, at 12:19 PM, Ted F.A. van Gaalen via swift-evolution wrote:
>> 
>> I now assume that:
>>  1. -= a “plain” Unicode character (codepoint?)  can result in one 
>> glyph.=-
> 
> What do you mean by “plain”? Characters in some Unicode scripts are
> by no means “plain”. They can affect (and be affected by) the
> characters around them, they can cause glyphs around them to
> rearrange or combine 

Re: [swift-evolution] Strings in Swift 4

2017-02-10 Thread Ben Cohen via swift-evolution
Hi Ted,

Here’s a sketch of one way to handle this kind of processing without requiring 
integer indexing. Hopefully not too buggy though I haven’t tested it 
extensively :). 

Here I’m stashing the parsed values in a dictionary, but you could also write 
code to insert them into a proper data structure where the dictionary set is 
happening (or maybe stick with the dictionary build, and then use that 
dictionary to populate your data structure, along with some more data 
validation and error handling).

import Foundation
extension String: Collection { }

// Field names and widths, in record order
let fieldLengths: DictionaryLiteral<String, Int> = [
    "CompanyName": 30,
    "PresidentLastName": 15,
    "PresidentFirstName": 8,
    "VPMarketingLastName": 15,
    "VPMarketingFirstName": 8,
    "AlternateContactTitle": 10,
    "AlternateContactLastName": 15,
    "AlternateContactFirstName": 8,
    "Address": 15,
    "City": 15,
    "State": 2,
    "Zip": 5,
]

// One 146-character record; every field is space-padded to its full width
var data = "Premier Properties            Murray         Mitch   Ricky          Roma    Office MgrWilliamson     John    350 Fifth Av   New York       NY10118"
var keyedRecord: [String: String] = [:]

for (key, length) in fieldLengths {
    let field = data.prefix(length)

    guard field.count == length
    else { fatalError("Input too short while reading \(key)") }
    // or however you want to handle it

    keyedRecord[key] = field.trimmingCharacters(in: CharacterSet.whitespaces)

    data = data.dropFirst(length)
}
guard data.isEmpty
else { fatalError("Input too long") }

print(keyedRecord)

I think it’s worth noting how seductive it is, with the integer indexing, to 
perform unchecked indexing into the data: recordStr[ 0..<30] is great until you 
have to process a corrupt record. Working in terms of higher-level APIs 
encourages handling of the failure cases. As an added bonus, when you upgrade 
your system and now the incoming data turns out to be utf8, your system doesn’t 
crash when a bored intern inserts some emoji into the president’s name.

There is still definitely room to make this easier/more discoverable for users:

- The “patterns” concept that is briefly touched on in the string manifesto 
would hopefully provide another way of expressing this, with patterns 
matching fixed numbers of characters.
- The need to walk over the field multiple times (first prefix, then count, 
then dropFirst) should be better handled by some other scanning APIs mentioned 
in the manifesto, e.g. if let field = data.dropPrefix(lengthPattern). Note that 
if the underlying String held only ASCII/Latin1, these should still be 
constant-time operations under the hood. 
- Another approach is to provide generic operations on Collection that chunk 
collections into subsequences of given lengths and serve them up, possibly via 
a lazy view. This would have the advantage of not requiring mutable state in 
the loop.
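
The last bullet could look roughly like the following in current Swift (where String is already a Collection). The method chunks(ofLengths:) is a hypothetical helper, not a standard library API. This version is eager rather than lazy, and it returns nil instead of trapping when the input is too short or too long. The three-field layout is a shortened stand-in for the mailing record.

```swift
import Foundation

extension Collection {
    /// Splits the collection into consecutive slices of the given lengths.
    /// Returns `nil` if the collection is shorter or longer than the
    /// lengths add up to.
    func chunks(ofLengths lengths: [Int]) -> [SubSequence]? {
        var result: [SubSequence] = []
        var start = startIndex
        for length in lengths {
            guard let end = index(start, offsetBy: length, limitedBy: endIndex)
            else { return nil }                  // input too short
            result.append(self[start..<end])
            start = end
        }
        return start == endIndex ? result : nil  // otherwise input too long
    }
}

let layout: [(name: String, length: Int)] = [
    ("LastName", 5), ("FirstName", 3), ("State", 2),
]
let record = "SmithAl NY"

var keyedRecord: [String: String] = [:]
if let fields = record.chunks(ofLengths: layout.map { $0.length }) {
    // The input string is never mutated while the fields are read.
    for (entry, field) in zip(layout, fields) {
        keyedRecord[entry.name] = field.trimmingCharacters(in: .whitespaces)
    }
}
```

Because the lengths count Characters rather than bytes, a record containing emoji or accented letters is still split on character boundaries.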

But the above is what we can achieve with the tools we have today.

p.s. as someone who has worked in a bank with thousands of ancient file 
formats, no argument from me that COBOL rules :)

> On Feb 10, 2017, at 9:20 AM, Ted F.A. van Gaalen via swift-evolution wrote:
> 
> Please see in-line response below
>> On 10 Feb 2017, at 03:56, Shawn Erickson wrote:
>> 
>> 
>> On Thu, Feb 9, 2017 at 5:09 PM Ted F.A. van Gaalen wrote:
>>> On 10 Feb 2017, at 00:11, Dave Abrahams wrote:
>>> 
>>> 
>>> on Thu Feb 09 2017, "Ted F.A. van Gaalen" wrote:
>>> 
 Hello Shawn
 Just google with any programming language name and “string manipulation”
 and you have enough reading for a week or so :o)
 TedvG
>>> 
>>> That truly doesn't answer the question.  It's not, “why do people index
>>> strings with integers when that's the only tool they are given for
>>> decomposing strings?”  It's, “what do you have to do with strings that's
>>> hard in Swift *because* you can't index them with integers?”
>> 
>> Hi Dave,
>> OK, here are just a few examples:
>> Parsing and validating an ISBN code? Or a (freight) container ID? Or EAN-13,
>> perhaps?
>> Or any of the many typical combined article codes and product IDs that
>> factories and shops use?
>> 
>> Or:
>> 
>> E.g. processing legacy files from IBM mainframes:
>> extracting fields from ancient data records read from very old sequential
>> files, say, a product data record like this one from a 1978 file, which you’d
>> have to unpack and process:
>> 123534-09EVXD4568,991234,89ABCYELLOW12AGRAINESYSTEMZ3453
>> into:
>> 123, 534, -09, EVXD45, 68,99, 1234,99, ABC, YELLOW, 12A, GRAIN, ESYSTEM,
>> Z3453.
>> (product category, pcs, discount code, product code, price Yen, price $,
>> class code, etc.)
>> In Cobol and PL/1, records are nearly always defined with a fixed field
>> layout like this:
>> (storage was limited and very, very expensive, e.g. 

Re: [swift-evolution] Strings in Swift 4

2017-02-10 Thread Hooman Mehr via swift-evolution

> On Feb 9, 2017, at 6:50 PM, Shawn Erickson  wrote:
> 
> 
> 
> On Thu, Feb 9, 2017 at 3:45 PM Hooman Mehr wrote:
>> On Feb 9, 2017, at 3:11 PM, Dave Abrahams wrote:
>> 
>> 
>> on Thu Feb 09 2017, "Ted F.A. van Gaalen" wrote:
>> 
>>> Hello Shawn
>>> Just google with any programming language name and “string manipulation”
>>> and you have enough reading for a week or so :o)
>>> TedvG
>> 
>> That truly doesn't answer the question.  It's not, “why do people index
>> strings with integers when that's the only tool they are given for
>> decomposing strings?”  It's, “what do you have to do with strings that's
>> hard in Swift *because* you can't index them with integers?”
> 
> I have done some string processing. I have not encountered any algorithm 
> where an integer index is absolutely needed, but sometimes it might be the 
> most convenient. 
> 
> For example, there are valid reasons to keep side tables that hold indexes 
> into a string. (such as maintaining attributes that apply to a substring or 
> things like pre-computed positions of soft line breaks). It does not require 
> the index to be integer, but maintaining validity of those indexes after the 
> string is mutated requires being able to offset them back or forth from some 
> position on. These operations could be less verbose and easier if the index 
> happens to be integer or (efficiently) supports + - operators. Also, I know 
> there are other methods to deal with such things and mutating a large string 
> generally is a bad idea, but sometimes it is the easiest and most convenient 
> solution to the problem at hand.
> 
> The end goal of this string is for human consumption, right? So such 
> manipulation would need to be Unicode-aware in the modern world? Or is it 
> for some other reason?
> 
> -Shawn

For an example of what I mean, see the source code of 
NS(Mutable)AttributedString 

 and note how most of the mutating methods of Mutable variant are not 
implemented yet...

So, a good example of where such indexing would be convenient, could be writing 
a swift-native AttributedString backed by Swift native String.




Re: [swift-evolution] Strings in Swift 4

2017-02-10 Thread Ted F.A. van Gaalen via swift-evolution
Please see in-line response below
> On 10 Feb 2017, at 03:56, Shawn Erickson  wrote:
> 
> 
> On Thu, Feb 9, 2017 at 5:09 PM Ted F.A. van Gaalen wrote:
>> On 10 Feb 2017, at 00:11, Dave Abrahams wrote:
>> 
>> 
>> on Thu Feb 09 2017, "Ted F.A. van Gaalen" wrote:
>> 
>>> Hello Shawn
>>> Just google with any programming language name and “string manipulation”
>>> and you have enough reading for a week or so :o)
>>> TedvG
>> 
>> That truly doesn't answer the question.  It's not, “why do people index
>> strings with integers when that's the only tool they are given for
>> decomposing strings?”  It's, “what do you have to do with strings that's
>> hard in Swift *because* you can't index them with integers?”
> 
> Hi Dave,
> OK, here are just a few examples:
> Parsing and validating an ISBN code? Or a (freight) container ID? Or EAN-13,
> perhaps?
> Or any of the many typical combined article codes and product IDs that
> factories and shops use?
> 
> Or:
> 
> E.g. processing legacy files from IBM mainframes:
> extracting fields from ancient data records read from very old sequential
> files, say, a product data record like this one from a 1978 file, which you’d
> have to unpack and process:
> 123534-09EVXD4568,991234,89ABCYELLOW12AGRAINESYSTEMZ3453
> into:
> 123, 534, -09, EVXD45, 68,99, 1234,99, ABC, YELLOW, 12A, GRAIN, ESYSTEM,
> Z3453.
> (product category, pcs, discount code, product code, price Yen, price $, class
> code, etc.)
> In Cobol and PL/1, records are nearly always defined with a fixed field layout
> like the one below.
> (Storage was limited and very, very expensive; XML, for example, would have
> been regarded as a "scandalous waste", and so would even the commas in CSV
> files!)
> 
> 01  MAILING-RECORD.
>     05  COMPANY-NAME            PIC X(30).
>     05  CONTACTS.
>         10  PRESIDENT.
>             15  LAST-NAME       PIC X(15).
>             15  FIRST-NAME      PIC X(8).
>         10  VP-MARKETING.
>             15  LAST-NAME       PIC X(15).
>             15  FIRST-NAME      PIC X(8).
>         10  ALTERNATE-CONTACT.
>             15  TITLE           PIC X(10).
>             15  LAST-NAME       PIC X(15).
>             15  FIRST-NAME      PIC X(8).
>     05  ADDRESS                 PIC X(15).
>     05  CITY                    PIC X(15).
>     05  STATE                   PIC XX.
>     05  ZIP                     PIC 9(5).
> 
> These are all character data fields, except for the numeric ZIP field;
> in Cobol, however, it can be treated like character data.
> So here I am, having to get the data from these old Cobol production files
> into a brand-new Swift-based accounting system of 2017. What can I do?
> 
> How do I unpack these records and bring the data into a Swift structure or
> class?
> (In Cobol I don’t have to, because of the predefined fixed-format record
> layout.)
> 
> AFAIK there are no similar record structures with fixed fields like this
> available in Swift?
> 
> So, the only way I can think of right now is to do it like this:
> 
> // mailingRecord is a Swift structure
> struct MailingRecord
> {
>     var companyName: String = "no Name"
>     var contacts: CompanyContacts
>     // etc.
> }
> 
> // recordStr was read here with ASCII encoding
> 
> // unpack data into the structure’s properties, in this case all are Strings
> mailingRecord.companyName                  = recordStr[ 0..<30]
> mailingRecord.contacts.president.lastName  = recordStr[30..<45]
> mailingRecord.contacts.president.firstName = recordStr[45..<53]
> 
> 
> // and so on..
> 
> Ever worked for e.g. a bank with thousands of these files, their formats
> unchanged for years?
> 
> Are there any alternative, more convenient and simpler methods in Swift?
> These look like examples of a fixed data format
Hi Shawn,
No, it could also be an UTF-8 String.
  
> that could be parsed from a byte buffer into strings, etc.
How would you do that? Could you please provide an example of how to do this 
with a byte buffer? 
E.g. read from a flat ASCII file —> unpack fields —> store in structure props? 


> Likely little need to force them via a higher order string concept,
What do you mean here with “high order string concept” ??
Swift is a high level language, I expect to do this with Strings directly,
instead of being forced to use low-level coding with byte arrays etc.
(I have/want no time for that)
Surely, one doesn’t have to resort to that in a high level language like Swift? 
If I am certain that all characters in a file etc. are of fixed width, even in 
UTF-32
(in the above example I am 100% sure of that) then 
using  str[n1..

Re: [swift-evolution] Strings in Swift 4

2017-02-10 Thread Ted F.A. van Gaalen via swift-evolution
Hello Shawn
Just google with any programming language name and “string manipulation”
and you have enough reading for a week or so :o)
TedvG


> On 9 Feb 2017, at 16:48, Shawn Erickson  wrote:
> 
> I also wonder what folks are actually doing that require indexing into 
> strings. I would love to see some real world examples of what and why 
> indexing into a string is needed. Who is the end consumer of that string, etc.
> 
> Do folks have some examples?
> 
> -Shawn
> 
> On Thu, Feb 9, 2017 at 6:56 AM Ted F.A. van Gaalen via swift-evolution wrote:
> Hello Hooman
> That invalidates my assumptions, thanks for evaluating
> it's more complex than I thought.
> Kind Regards
> Ted
> 
>> On 8 Feb 2017, at 00:07, Hooman Mehr wrote:
>> 
>> 
>>> On Feb 7, 2017, at 12:19 PM, Ted F.A. van Gaalen via swift-evolution wrote:
>>> 
>>> I now assume that:
>>>   1. -= a “plain” Unicode character (codepoint?)  can result in one 
>>> glyph.=-
>> 
>> What do you mean by “plain”? Characters in some Unicode scripts are by no 
>> means “plain”. They can affect (and be affected by) the characters around 
>> them, they can cause glyphs around them to rearrange or combine (like 
>> ligatures) or their visual representation (glyph) may float in the same 
>> space as an adjacent glyph (and seem to be part of the “host” glyph), etc. 
>> So, the general relationship of a character and its corresponding glyph (if 
>> there is one) is complex and depends on context and surroundings characters.
>> 
>>>   2. -= a  grapheme cluster always results in just a single glyph, 
>>> true? =- 
>> 
>> False
>> 
>>>   3. The only thing that I can see on screen or print are glyphs 
>>> (“carvings”,visual elements that stand on their own )
>> 
>> The visible effect might not be a visual shape. It may be for example, the 
>> way the surrounding shapes change or re-arrange.
>> 
>>>  4.  In this context, a glyph is a humanly recognisable visual form of 
>>> a character,
>> 
>> Not in a straightforward one to one fashion, not even in Latin / Roman 
>> script.
>> 
>>>  5. On this level (the glyph, what I can see as a user) it is not 
>>> relevant and also not detectable
>>>  with how many Unicode scalars (codepoints ?), grapheme, or even on 
>>> what kind
>>>  of encoding the glyph was based upon.
>> 
>> False
>> 
> 
> 



Re: [swift-evolution] Strings in Swift 4

2017-02-10 Thread James Froggatt via swift-evolution
Regarding the string-collection question, I have some corner-cases which may be 
worth consideration:

Is it safe to pass a view (EG: String.UTF8View) to a generic 
function/initialiser over Collections, which either has an inout parameter or 
stores the result?

I could imagine these situations presenting problems. Though obviously we can't 
eliminate views entirely, it could at least save an Array conversion for one 
kind of indexing method.


Re: [swift-evolution] Strings in Swift 4

2017-02-09 Thread Shawn Erickson via swift-evolution
On Thu, Feb 9, 2017 at 5:09 PM Ted F.A. van Gaalen 
wrote:

> On 10 Feb 2017, at 00:11, Dave Abrahams  wrote:
>
>
> on Thu Feb 09 2017, "Ted F.A. van Gaalen" 
> wrote:
>
> Hello Shawn
> Just google with any programming language name and “string manipulation”
> and you have enough reading for a week or so :o)
> TedvG
>
>
> That truly doesn't answer the question.  It's not, “why do people index
> strings with integers when that's the only tool they are given for
> decomposing strings?”  It's, “what do you have to do with strings that's
> hard in Swift *because* you can't index them with integers?”
>
>
> Hi Dave,
> OK, here are just a few examples:
> Parsing and validating an ISBN code? Or a (freight) container ID? Or EAN-13,
> perhaps?
> Or any of the many typical combined article codes and product IDs that
> factories and shops use?
>
> Or:
>
> E.g. processing legacy files from IBM mainframes:
> extracting fields from ancient data records read from very old sequential
> files, say, a product data record like this one from a 1978 file, which you’d
> have to unpack and process:
> 123534-09EVXD4568,991234,89ABCYELLOW12AGRAINESYSTEMZ3453
> into:
> 123, 534, -09, EVXD45, 68,99, 1234,99, ABC, YELLOW, 12A, GRAIN, ESYSTEM,
> Z3453.
> (product category, pcs, discount code, product code, price Yen, price $,
> class code, etc.)
> In Cobol and PL/1, records are nearly always defined with a fixed field
> layout like the one below.
> (Storage was limited and very, very expensive; XML, for example, would have
> been regarded as a "scandalous waste", and so would even the commas in CSV
> files!)
>
> 01  MAILING-RECORD.
>     05  COMPANY-NAME            PIC X(30).
>     05  CONTACTS.
>         10  PRESIDENT.
>             15  LAST-NAME       PIC X(15).
>             15  FIRST-NAME      PIC X(8).
>         10  VP-MARKETING.
>             15  LAST-NAME       PIC X(15).
>             15  FIRST-NAME      PIC X(8).
>         10  ALTERNATE-CONTACT.
>             15  TITLE           PIC X(10).
>             15  LAST-NAME       PIC X(15).
>             15  FIRST-NAME      PIC X(8).
>     05  ADDRESS                 PIC X(15).
>     05  CITY                    PIC X(15).
>     05  STATE                   PIC XX.
>     05  ZIP                     PIC 9(5).
>
> These are all character data fields here, except for the numeric ZIP field , 
> however in Cobol it can be treated like character data.
> So here I am, having to get the data of these old Cobol production files
> into a brand new Swift based accounting system of 2017, what can I do?
>
> How do I unpack these records and bring the data into a Swift structure or
> class?
> (In Cobol I don’t have to because of the predefined fixed format record 
> layout).
>
> AFAIK there are no similar record structures with fixed fields like this
> available in Swift?
>
> So, the only way I can think of right now is to do it like this:
>
> // mailingRecord is a Swift structure
> struct MailingRecord
> {
>     var companyName: String = "no Name"
>     var contacts: CompanyContacts
>     .
>     etc..
> }
>
> // recordStr was read here with ASCII encoding
>
> // unpack data in to structure’s properties, in this case all are Strings
> mailingRecord.companyName   = recordStr[ 0..<30]
> mailingRecord.contacts.president.lastName  = recordStr[30..<45]
> mailingRecord.contacts.president.firstName = recordStr[45..<53]
>
>
> // and so on..
>
> Ever worked for e.g. a bank with thousands of these files, their formats
> unchanged for years?
>
> Are any alternative, more convenient and simpler methods present in Swift?
>
These look like examples of fixed data formats that could be parsed from a
byte buffer into strings, etc. There is likely little need to force them
through a higher-order string concept, at least not until they are unpacked
from their compact byte form.
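
A minimal sketch of that approach, assuming the record arrives as ASCII
bytes. The Field type, the layout table, the pad helper and the sample record
are all hypothetical; only the three field widths come from the COBOL layout
quoted above.

```swift
// Hypothetical description of one fixed-width field: a name and a byte width.
struct Field {
    let name: String
    let width: Int
}

// Widths taken from the quoted MAILING-RECORD layout (first three fields only).
let layout = [
    Field(name: "companyName", width: 30),
    Field(name: "presidentLastName", width: 15),
    Field(name: "presidentFirstName", width: 8),
]

/// Slices a fixed-width record by byte offset and decodes each field.
/// Returns nil when the record is shorter than the layout requires.
func unpack(_ record: [UInt8], using layout: [Field]) -> [String: String]? {
    var result: [String: String] = [:]
    var offset = 0
    for field in layout {
        let end = offset + field.width
        guard end <= record.count else { return nil }
        // Integer offsets are safe here because they index bytes, not Characters.
        var text = String(decoding: record[offset..<end], as: UTF8.self)
        while text.hasSuffix(" ") { text.removeLast() }   // drop space padding
        result[field.name] = text
        offset = end
    }
    return result
}

// Made-up sample record, space-padded to the field widths.
func pad(_ s: String, to width: Int) -> String {
    return s + String(repeating: " ", count: max(0, width - s.utf8.count))
}
let record = Array((pad("ACME FREIGHT", to: 30) + pad("JANSEN", to: 15)
                    + pad("TED", to: 8)).utf8)
let fields = unpack(record, using: layout)
```

The strings only come into existence once each field has been cut out of the
byte buffer, so no integer subscript on String is involved.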

-Shawn


Re: [swift-evolution] Strings in Swift 4

2017-02-09 Thread Shawn Erickson via swift-evolution
On Thu, Feb 9, 2017 at 3:45 PM Hooman Mehr  wrote:

> On Feb 9, 2017, at 3:11 PM, Dave Abrahams  wrote:
>
>
> on Thu Feb 09 2017, "Ted F.A. van Gaalen" 
> wrote:
>
> Hello Shawn
> Just google with any programming language name and “string manipulation”
> and you have enough reading for a week or so :o)
> TedvG
>
>
> That truly doesn't answer the question.  It's not, “why do people index
> strings with integers when that's the only tool they are given for
> decomposing strings?”  It's, “what do you have to do with strings that's
> hard in Swift *because* you can't index them with integers?”
>
>
> I have done some string processing. I have not encountered any algorithm
> where an integer index is absolutely needed, but sometimes it might be the
> most convenient.
>
> For example, there are valid reasons to keep side tables that hold indexes
> into a string. (such as maintaining attributes that apply to a substring or
> things like pre-computed positions of soft line breaks). It does not
> require the index to be *integer*, but maintaining validity of those
> indexes after the string is mutated requires being able to offset them back
> or forth from some position on. These operations could be less verbose and
> easier if the index happens to be integer or (efficiently) supports + -
> operators. Also, I know there are other methods to deal with such things
> and mutating a large string generally is a bad idea, but sometimes it is
> the easiest and most convenient solution to the problem at hand.
>

The end goal of this string is human consumption, right? So such
manipulation would need to be Unicode-aware in the modern world? ...or is
it for some other reason?

-Shawn


Re: [swift-evolution] Strings in Swift 4

2017-02-09 Thread Hooman Mehr via swift-evolution

> On Feb 9, 2017, at 3:11 PM, Dave Abrahams  wrote:
> 
> 
> on Thu Feb 09 2017, "Ted F.A. van Gaalen"  wrote:
> 
>> Hello Shawn
>> Just google with any programming language name and “string manipulation”
>> and you have enough reading for a week or so :o)
>> TedvG
> 
> That truly doesn't answer the question.  It's not, “why do people index
> strings with integers when that's the only tool they are given for
> decomposing strings?”  It's, “what do you have to do with strings that's
> hard in Swift *because* you can't index them with integers?”

I have done some string processing. I have not encountered any algorithm where 
an integer index is absolutely needed, but sometimes it might be the most 
convenient. 

For example, there are valid reasons to keep side tables that hold indexes into 
a string. (such as maintaining attributes that apply to a substring or things 
like pre-computed positions of soft line breaks). It does not require the index 
to be integer, but maintaining validity of those indexes after the string is 
mutated requires being able to offset them back or forth from some position on. 
These operations could be less verbose and easier if the index happens to be 
integer or (efficiently) supports + - operators. Also, I know there are other 
methods to deal with such things and mutating a large string generally is a bad 
idea, but sometimes it is the easiest and most convenient solution to the 
problem at hand.
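
A small sketch of such a side table, assuming the positions are kept as
UTF-16 offsets. The function name, the marks array and the sample text are
made up for illustration.

```swift
var text = "Hello world"
// Side table: UTF-16 offsets where an attribute run starts (hypothetical data).
var marks = [0, 6]            // "Hello" starts at 0, "world" at 6

/// Inserts `new` at UTF-16 offset `at` and shifts every mark at or after it.
func insert(_ new: String, at: Int, into text: inout String, marks: inout [Int]) {
    let position = String.Index(utf16Offset: at, in: text)
    text.insert(contentsOf: new, at: position)
    let delta = new.utf16.count
    marks = marks.map { $0 >= at ? $0 + delta : $0 }
}

insert("big ", at: 6, into: &text, marks: &marks)

// The second mark still points at the start of "world" after the mutation.
let start = String.Index(utf16Offset: marks[1], in: text)
let word = String(text[start...])
```

The offsets here are plain integers only because they count code units of one
fixed view; the caller has to make sure every offset lands on a character
boundary.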


> 
>>> On 9 Feb 2017, at 16:48, Shawn Erickson  wrote:
>>> 
>>> I also wonder what folks are actually doing that require indexing
>>> into strings. I would love to see some real world examples of what
>>> and why indexing into a string is needed. Who is the end consumer of
>>> that string, etc.
>>> 
>>> Do folks have so examples?
>>> 
>>> -Shawn
>>> 
>>> On Thu, Feb 9, 2017 at 6:56 AM Ted F.A. van Gaalen via swift-evolution 
>>> > wrote:
>>> Hello Hooman
>>> That invalidates my assumptions, thanks for evaluating
>>> it's more complex than I thought.
>>> Kind Regards
>>> Ted
>>> 
>>>> On 8 Feb 2017, at 00:07, Hooman Mehr wrote:
>>>>
>>>>
>>>>> On Feb 7, 2017, at 12:19 PM, Ted F.A. van Gaalen via swift-evolution
>>>>> wrote:
>>>>>
>>>>> I now assume that:
>>>>>  1. -= a “plain” Unicode character (codepoint?)  can result in one
>>>>> glyph.=-
>>>>
>>>> What do you mean by “plain”? Characters in some Unicode scripts are
>>>> by no means “plain”. They can affect (and be affected by) the
>>>> characters around them, they can cause glyphs around them to
>>>> rearrange or combine (like ligatures) or their visual
>>>> representation (glyph) may float in the same space as an adjacent
>>>> glyph (and seem to be part of the “host” glyph), etc. So, the
>>>> general relationship of a character and its corresponding glyph (if
>>>> there is one) is complex and depends on context and surroundings
>>>> characters.
>>>>
>>>>>  2. -= a  grapheme cluster always results in just a single glyph,
>>>>> true? =-
>>>>
>>>> False
>>>>
>>>>>  3. The only thing that I can see on screen or print are glyphs
>>>>> (“carvings”,visual elements that stand on their own )
>>>>
>>>> The visible effect might not be a visual shape. It may be for example, the
>>>> way the surrounding shapes change or re-arrange.
>>>>
>>>>> 4.  In this context, a glyph is a humanly recognisable visual form of
>>>>> a character,
>>>>
>>>> Not in a straightforward one to one fashion, not even in Latin / Roman
>>>> script.
>>>>
>>>>> 5. On this level (the glyph, what I can see as a user) it is not
>>>>> relevant and also not detectable
>>>>> with how many Unicode scalars (codepoints ?), grapheme, or even
>>>>> on what kind
>>>>> of encoding the glyph was based upon.
>>>>
>>>> False
>>>>
>>> 
>> 
>> 
> 
> -- 
> -Dave



Re: [swift-evolution] Strings in Swift 4

2017-02-09 Thread Dave Abrahams via swift-evolution

on Thu Feb 09 2017, "Ted F.A. van Gaalen"  wrote:

> Hello Shawn
> Just google with any programming language name and “string manipulation”
> and you have enough reading for a week or so :o)
> TedvG

That truly doesn't answer the question.  It's not, “why do people index
strings with integers when that's the only tool they are given for
decomposing strings?”  It's, “what do you have to do with strings that's
hard in Swift *because* you can't index them with integers?”
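
To illustrate why a bare integer is ambiguous as a string position, here is a
small sketch (the sample strings are made up): the same text has a different
length in each view, so an offset only means something relative to one
particular view.

```swift
let cafe = "cafe\u{301}"            // "café" written with a combining acute accent
let flag = "\u{1F1F3}\u{1F1F1}"     // two regional-indicator scalars, one flag

// Lengths as (Characters, Unicode scalars, UTF-16 units, UTF-8 bytes).
let cafeLengths = [cafe.count, cafe.unicodeScalars.count,
                   cafe.utf16.count, cafe.utf8.count]
let flagLengths = [flag.count, flag.unicodeScalars.count,
                   flag.utf16.count, flag.utf8.count]
print(cafeLengths, flagLengths)

// Offsetting by Characters works, but it is a linear walk, not O(1).
let last = cafe[cafe.index(cafe.startIndex, offsetBy: 3)]
print(last.unicodeScalars.count)    // the final Character holds two scalars
```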

>> On 9 Feb 2017, at 16:48, Shawn Erickson  wrote:
>> 
>> I also wonder what folks are actually doing that require indexing
>> into strings. I would love to see some real world examples of what
>> and why indexing into a string is needed. Who is the end consumer of
>> that string, etc.
>> 
>> Do folks have so examples?
>> 
>> -Shawn
>> 
>> On Thu, Feb 9, 2017 at 6:56 AM Ted F.A. van Gaalen via swift-evolution 
>> > wrote:
>> Hello Hooman
>> That invalidates my assumptions, thanks for evaluating
>> it's more complex than I thought.
>> Kind Regards
>> Ted
>> 
>>> On 8 Feb 2017, at 00:07, Hooman Mehr wrote:
>>> 
>>> 
>>>> On Feb 7, 2017, at 12:19 PM, Ted F.A. van Gaalen via swift-evolution
>>>> wrote:
>>>>
>>>> I now assume that:
>>>>   1. -= a “plain” Unicode character (codepoint?)  can result in one
>>>> glyph.=-
>>> 
>>> What do you mean by “plain”? Characters in some Unicode scripts are
>>> by no means “plain”. They can affect (and be affected by) the
>>> characters around them, they can cause glyphs around them to
>>> rearrange or combine (like ligatures) or their visual
>>> representation (glyph) may float in the same space as an adjacent
>>> glyph (and seem to be part of the “host” glyph), etc. So, the
>>> general relationship of a character and its corresponding glyph (if
>>> there is one) is complex and depends on context and surroundings
>>> characters.
>>> 
>>>>   2. -= a  grapheme cluster always results in just a single glyph,
>>>> true? =-
>>> 
>>> False
>>> 
>>>>   3. The only thing that I can see on screen or print are glyphs
>>>> (“carvings”,visual elements that stand on their own )
>>> 
>>> The visible effect might not be a visual shape. It may be for example, the 
>>> way the surrounding shapes change or re-arrange.
>>> 
>>>>  4.  In this context, a glyph is a humanly recognisable visual form of
>>>> a character,
>>> 
>>> Not in a straightforward one to one fashion, not even in Latin / Roman 
>>> script.
>>> 
>>>>  5. On this level (the glyph, what I can see as a user) it is not
>>>> relevant and also not detectable
>>>>  with how many Unicode scalars (codepoints ?), grapheme, or even
>>>> on what kind
>>>>  of encoding the glyph was based upon.
>>> 
>>> False
>>> 
>> 
> 
>

-- 
-Dave


Re: [swift-evolution] Strings in Swift 4

2017-02-09 Thread Ben Cohen via swift-evolution

> On Feb 9, 2017, at 7:48 AM, Shawn Erickson via swift-evolution 
>  wrote:
> 
> I also wonder what folks are actually doing that require indexing into 
> strings. I would love to see some real world examples of what and why 
> indexing into a string is needed. Who is the end consumer of that string, etc.
> 
> Do folks have so examples?

Big +1 for this. Real world use cases that are hard today would be great to see 
so we can make sure they are accounted for in the new API design.

The ideal situation is that there is a common pattern with many of them that 
can be accommodated through useful higher-level methods (maybe even on 
Collection) that avoid the need to mess around with indices.
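
As one illustration, taking the EAN-13 case raised elsewhere in this thread:
the check digit can be validated without any index arithmetic. This is only a
sketch, and isValidEAN13 is a hypothetical helper, not an existing or proposed
API.

```swift
/// Returns true when `code` is exactly 13 ASCII digits with a valid EAN-13 check digit.
func isValidEAN13(_ code: String) -> Bool {
    // Map each character to its digit value; non-digits are dropped, so the
    // two counts below only agree when every character was an ASCII digit.
    let digits = code.compactMap { $0.isASCII ? $0.wholeNumberValue : nil }
    guard digits.count == 13, code.count == 13 else { return false }
    // Weights alternate 1, 3, 1, 3, ... from the left, check digit included;
    // the weighted total must be a multiple of 10.
    let total = digits.enumerated().reduce(0) { sum, pair in
        sum + pair.element * (pair.offset % 2 == 0 ? 1 : 3)
    }
    return total % 10 == 0
}

let ok = isValidEAN13("4006381333931")
let wrongCheckDigit = isValidEAN13("4006381333932")
```

The positions are still counted, but by enumerated() during a single pass
rather than by subscripting the string.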




Re: [swift-evolution] Strings in Swift 4

2017-02-09 Thread Shawn Erickson via swift-evolution
I also wonder what folks are actually doing that requires indexing into
strings. I would love to see some real-world examples of what indexing into
a string is needed for, and why. Who is the end consumer of that string,
etc.?

Do folks have some examples?

-Shawn

On Thu, Feb 9, 2017 at 6:56 AM Ted F.A. van Gaalen via swift-evolution <
swift-evolution@swift.org> wrote:

> Hello Hooman
> That invalidates my assumptions, thanks for evaluating
> it's more complex than I thought.
> Kind Regards
> Ted
>
> On 8 Feb 2017, at 00:07, Hooman Mehr  wrote:
>
>
> On Feb 7, 2017, at 12:19 PM, Ted F.A. van Gaalen via swift-evolution <
> swift-evolution@swift.org> wrote:
>
> I now assume that:
>   1. -= a “plain” Unicode character (codepoint?)  can result in one
> glyph.=-
>
>
> What do you mean by “plain”? Characters in some Unicode scripts are by no
> means “plain”. They can affect (and be affected by) the characters around
> them, they can cause glyphs around them to rearrange or combine (like
> ligatures) or their visual representation (glyph) may float in the same
> space as an adjacent glyph (and seem to be part of the “host” glyph), etc.
> So, the general relationship of a character and its corresponding glyph (if
> there is one) is complex and depends on context and surroundings characters.
>
>   2. -= a  grapheme cluster always results in just a single glyph,
> true? =-
>
>
> False
>
>   3. The only thing that I can see on screen or print are glyphs
> (“carvings”,visual elements that stand on their own )
>
>
> The visible effect might not be a visual shape. It may be for example, the
> way the surrounding shapes change or re-arrange.
>
>  4.  In this context, a glyph is a humanly recognisable visual form of
> a character,
>
>
> Not in a straightforward one to one fashion, not even in Latin / Roman
> script.
>
>  5. On this level (the glyph, what I can see as a user) it is not
> relevant and also not detectable
>  with how many Unicode scalars (codepoints ?), grapheme, or even
> on what kind
>  of encoding the glyph was based upon.
>
>
> False
>
>
>


Re: [swift-evolution] Strings in Swift 4

2017-02-09 Thread Ted F.A. van Gaalen via swift-evolution
Hello Hooman,
That invalidates my assumptions. Thanks for evaluating them;
it's more complex than I thought.
Kind Regards
Ted

> On 8 Feb 2017, at 00:07, Hooman Mehr  wrote:
> 
> 
>> On Feb 7, 2017, at 12:19 PM, Ted F.A. van Gaalen via swift-evolution 
>> > wrote:
>> 
>> I now assume that:
>>   1. -= a “plain” Unicode character (codepoint?)  can result in one 
>> glyph.=-
> 
> What do you mean by “plain”? Characters in some Unicode scripts are by no 
> means “plain”. They can affect (and be affected by) the characters around 
> them, they can cause glyphs around them to rearrange or combine (like 
> ligatures) or their visual representation (glyph) may float in the same space 
> as an adjacent glyph (and seem to be part of the “host” glyph), etc. So, the 
> general relationship of a character and its corresponding glyph (if there is 
> one) is complex and depends on context and surroundings characters.
> 
>>   2. -= a  grapheme cluster always results in just a single glyph, true? 
>> =- 
> 
> False
> 
>>   3. The only thing that I can see on screen or print are glyphs 
>> (“carvings”,visual elements that stand on their own )
> 
> The visible effect might not be a visual shape. It may be for example, the 
> way the surrounding shapes change or re-arrange.
> 
>>  4.  In this context, a glyph is a humanly recognisable visual form of a 
>> character,
> 
> Not in a straightforward one to one fashion, not even in Latin / Roman script.
> 
>>  5. On this level (the glyph, what I can see as a user) it is not 
>> relevant and also not detectable
>>  with how many Unicode scalars (codepoints ?), grapheme, or even on 
>> what kind
>>  of encoding the glyph was based upon.
> 
> False
> 



Re: [swift-evolution] Strings in Swift 4

2017-02-07 Thread Ted F.A. van Gaalen via swift-evolution

> On 7 Feb 2017, at 19:44, Dave Abrahams  wrote:
> 
> 
> on Tue Feb 07 2017, "Ted F.A. van Gaalen"  wrote:
> 
>>> On 7 Feb 2017, at 05:42, Karl Wagner  wrote:
>>> 
 
>>>> On 6 Feb 2017, at 19:29, Ted F.A. van Gaalen via swift-evolution
>>>> wrote:
 
>>> When it comes to fast access what’s most important is cache
>>> locality. DRAM is like 200x slower than L2 cache. Looping through
>>> some contiguous 16-bit integers is always going to beat the pants
>>> out of derefencing pointers.
>> 
>>> 
>> Hi Karl
>> That is of course hardware/processor dependent…and Swift runs on different 
>> target systems… isn’t? 
> 
> Actually the basic calculus holds for any modern processor.
> 
>>> It’s quite rare that you need to grab arbitrary parts of a String
>>> without knowing what is inside it. If you’re saying str[12..<34] -
>>> why 12, and why 34? Is 12 the length of some substring you know from
>>> earlier? In that case, you could find out how many CodeUnits it had,
>>> and use that information instead.
>>> For this example, I have used constants here, but normally these would be 
>>> variables..
>>> 
>> 
>> I’d say it is not so rare, these things are often used for all kinds of 
>> string parsing, there are
>> many
>> examples to be found on the Internet.
>> TedvG
> 
> That proves nothing, though.  The fact that people are using integers to
> do this doesn't mean you need to use them, nor does it mean that you'll
> get the right results from doing so.  Typically examples that use
> integer constants with strings are wrong for some large proportion of
> unicode text.
> 
This is all a bit confusing.
In https://en.wiktionary.org/wiki/glyph
   Definition of a glyph in our context:
(typography, computing) A visual representation of a letter, character, or
symbol, in a specific font and style.

I now assume that:
  1. -= a “plain” Unicode character (codepoint?) can result in one glyph. =-
  2. -= a grapheme cluster always results in just a single glyph, true? =-
  3. The only things that I can see on screen or in print are glyphs
     (“carvings”, visual elements that stand on their own).
  4. In this context, a glyph is a humanly recognisable visual form of a
     character.
  5. On this level (the glyph, what I can see as a user) it is neither
     relevant nor detectable how many Unicode scalars (codepoints?) or
     graphemes, or even what kind of encoding, the glyph was based upon.

Is this correct? (especially 1 and 2)

Based on these assumptions, to me then, the definition of a character == glyph.
Therefore, my working model: I see a row of characters as a row of glyphs,
which are discrete autonomous visual elements, ergo: 
Each element is individually addressable with integers (ordinal)

?

TedvG



   

> -- 
> -Dave



Re: [swift-evolution] Strings in Swift 4

2017-02-07 Thread Ted F.A. van Gaalen via swift-evolution
Thanks. Seeing now that all this is heavily intertwined with external libs
(ICU etc.), then yes, it is too much effort for too little gain (making
fixed-width Unicode strings).
Why am I doing this? Unicode is a wasps' nest; how do you survive, Dave? :o)

But I do use “random string access" e.g. extracting substrings
with e.g.  let part = str[3..<6]  
with the help of the aforementioned String extension..

arrgh, great, make me a tea...

TedvG

> On 6 Feb 2017, at 23:25, Dave Abrahams  wrote:
> 
> 
> on Mon Feb 06 2017, David Waite  wrote:
> 
>>> On Feb 6, 2017, at 10:26 AM, Ted F.A. van Gaalen via swift-evolution 
>>> 
>> wrote:
>>> 
>>> Hi Dave,
>>> Oops! yes, you’re right!
>>> I did read again more thoroughly about Unicode 
>> 
>>> and how Unicode is handled within Swift...
>>> -should have done that before I write something- sorry.  
>>> 
>>> Nevertheless: 
>>> 
>>> How about this solution:  (if I am not making other omissions in my 
>>> thinking again) 
>>> -Store the string as a collection of fixed-width 32 bit UTF-32 characters 
>>> anyway.
>>> -however, if the Unicode character is a grapheme cluster (2..n Unicode 
>>> characters),then 
>>> store a pointer to a hidden child string containing the actual grapheme 
>>> cluster, like so:
>>> 
>>> 1: [UTF32, UTF32, UTF32, 1pointer,  UTF32, UTF32, 1pointer, UTF32, UTF32]
>>>                             |                        |
>>> 2:                   [UTF32, UTF32]     [UTF32, UTF32, UTF32, ...]
>>> 
>>> whereby (1) is aString as seen by the programmer.
>>> and (2)  are hidden child strings, each containing a grapheme cluster. 
>> 
>> The random access would require a uniform layout, so a pointer and
>> scalar would need to be the same size. The above would work with a 32
>> bit platform with a tagged pointer, but would require a 64-bit slot
>> for pointers on 64-bit systems like macOS and iOS.
> 
> It would also make String not efficiently interoperable with almost any
> other system that processes strings including Foundation and ICU.
> 
>> Today when I need to do random access into a string, I convert it to
>> an Array. Hardly efficient memory-wise, but efficient
>> enough for random access.
> 
> I'd be willing to bet almost anything that you  never actually need to
> do random access into a String ;-)
> 
> -- 
> -Dave



Re: [swift-evolution] Strings in Swift 4

2017-02-07 Thread Ted F.A. van Gaalen via swift-evolution

> On 7 Feb 2017, at 05:42, Karl Wagner  wrote:
> 
>> 
>> On 6 Feb 2017, at 19:29, Ted F.A. van Gaalen via swift-evolution 
>> > wrote:
>> 
>>> 
>>> On 6 Feb 2017, at 19:10, David Waite wrote:
>>>
>>>>
>>>> On Feb 6, 2017, at 10:26 AM, Ted F.A. van Gaalen via swift-evolution
>>>> wrote:
>>>>
>>>> Hi Dave,
>>>> Oops! yes, you’re right!
>>>> I did read again more thoroughly about Unicode
>>>> and how Unicode is handled within Swift...
>>>> -should have done that before I write something- sorry.
>>>>
>>>> Nevertheless:
>>>>
>>>> How about this solution:  (if I am not making other omissions in my
>>>> thinking again)
>>>> -Store the string as a collection of fixed-width 32 bit UTF-32 characters
>>>> anyway.
>>>> -however, if the Unicode character is a grapheme cluster (2..n Unicode
>>>> characters),then
>>>> store a pointer to a hidden child string containing the actual grapheme
>>>> cluster, like so:
>>>>
>>>> 1: [UTF32, UTF32, UTF32, 1pointer,  UTF32, UTF32, 1pointer, UTF32, UTF32]
>>>>                             |                        |
>>>> 2:                   [UTF32, UTF32]     [UTF32, UTF32, UTF32, ...]
>>>>
>>>> whereby (1) is aString as seen by the programmer.
>>>> and (2)  are hidden child strings, each containing a grapheme cluster.
>>> 
>>> The random access would require a uniform layout, so a pointer and scalar 
>>> would need to be the same size. The above would work with a 32 bit platform 
>>> with a tagged pointer, but would require a 64-bit slot for pointers on 
>>> 64-bit systems like macOS and iOS.
>>> 
>> Yeah, I know that,  but the “grapheme cluster pool” I am imagining 
>> could be allocated at a certain predefined base address, 
>> whereby the pointer I am referring to is just an offset from this base 
>> address. 
>> If so, an address space of  2^30  (1,073,741,824) 1 GB, will be available,
>> which is more than sufficient for just storing unique grapheme clusters..
>> (of course, not taking in account other allocations and app limitations) 
> 
> When it comes to fast access what’s most important is cache locality. DRAM is 
> like 200x slower than L2 cache. Looping through some contiguous 16-bit 
> integers is always going to beat the pants out of derefencing pointers.
Hi Karl,
That is of course hardware/processor dependent, and Swift runs on different
target systems, doesn't it?
> 
>>   
>>> Today when I need to do random access into a string, I convert it to an 
>>> Array. Hardly efficient memory-wise, but efficient enough for 
>>> random access.
>>> 
>> As a programmer. I just want to use String as-is but with  direct 
>> subscripting like str[12..<34]
>> and, if possible also with open range like so: str[12…]   
>> implemented natively in Swift. 
>> 
>> Kind Regards
>> TedvG
>> www.tedvg.com 
>> www.ravelnotes.com 
>>  
>>> -DW
>> 
>> 
> 
> 
> It’s quite rare that you need to grab arbitrary parts of a String without 
> knowing what is inside it. If you’re saying str[12..<34] - why 12, and why 
> 34? Is 12 the length of some substring you know from earlier? In that case, 
> you could find out how many CodeUnits it had, and use that information 
> instead.
For this example, I have used constants here, but normally these would be 
variables..
> 

I’d say it is not so rare, these things are often used for all kinds of string 
parsing, there are many
examples to be found on the Internet.
TedvG
> The new model will give you some form of efficient “random” access; the catch 
> is that it’s not totally random. Looking for the next character boundary is 
> necessarily linear, so the trick for large strings (>16K) is to make sure you 
> remember the CodeUnit offsets of important character boundaries.
> 
> - Karl



Re: [swift-evolution] Strings in Swift 4

2017-02-06 Thread Dave Abrahams via swift-evolution

on Mon Feb 06 2017, David Waite  wrote:

>> On Feb 6, 2017, at 10:26 AM, Ted F.A. van Gaalen via swift-evolution 
>> 
> wrote:
>> 
>> Hi Dave,
>> Oops! yes, you’re right!
>> I did read again more thoroughly about Unicode 
>
>> and how Unicode is handled within Swift...
>> -should have done that before I write something- sorry.  
>> 
>> Nevertheless: 
>> 
>> How about this solution:  (if I am not making other omissions in my thinking 
>> again) 
>> -Store the string as a collection of fixed-width 32 bit UTF-32 characters 
>> anyway.
>> -however, if the Unicode character is a grapheme cluster (2..n Unicode 
>> characters),then 
>> store a pointer to a hidden child string containing the actual grapheme 
>> cluster, like so:
>> 
>> 1: [UTF32, UTF32, UTF32, 1pointer,  UTF32, UTF32, 1pointer, UTF32, UTF32]
>>                             |                        |
>> 2:                   [UTF32, UTF32]     [UTF32, UTF32, UTF32, ...]
>> 
>> whereby (1) is aString as seen by the programmer.
>> and (2)  are hidden child strings, each containing a grapheme cluster. 
>
> The random access would require a uniform layout, so a pointer and
> scalar would need to be the same size. The above would work with a 32
> bit platform with a tagged pointer, but would require a 64-bit slot
> for pointers on 64-bit systems like macOS and iOS.

It would also make String not efficiently interoperable with almost any
other system that processes strings including Foundation and ICU.

> Today when I need to do random access into a string, I convert it to
> an Array. Hardly efficient memory-wise, but efficient
> enough for random access.

I'd be willing to bet almost anything that you  never actually need to
do random access into a String ;-)

-- 
-Dave


Re: [swift-evolution] Strings in Swift 4

2017-02-06 Thread David Waite via swift-evolution

> On Feb 6, 2017, at 10:26 AM, Ted F.A. van Gaalen via swift-evolution 
>  wrote:
> 
> Hi Dave,
> Oops! yes, you’re right!
> I did read again more thoroughly about Unicode 
> and how Unicode is handled within Swift...
> -should have done that before I write something- sorry.  
> 
> Nevertheless: 
> 
> How about this solution:  (if I am not making other omissions in my thinking 
> again) 
> -Store the string as a collection of fixed-width 32 bit UTF-32 characters 
> anyway.
> -however, if the Unicode character is a grapheme cluster (2..n Unicode 
> characters),then 
> store a pointer to a hidden child string containing the actual grapheme 
> cluster, like so:
> 
> 1: [UTF32, UTF32, UTF32, 1pointer,  UTF32, UTF32, 1pointer, UTF32, UTF32]
>                             |                        |
> 2:                   [UTF32, UTF32]     [UTF32, UTF32, UTF32, ...]
> 
> whereby (1) is aString as seen by the programmer.
> and (2)  are hidden child strings, each containing a grapheme cluster. 

The random access would require a uniform layout, so a pointer and scalar would 
need to be the same size. The above would work with a 32 bit platform with a 
tagged pointer, but would require a 64-bit slot for pointers on 64-bit systems 
like macOS and iOS.

Today when I need to do random access into a string, I convert it to an 
Array. Hardly efficient memory-wise, but efficient enough for random 
access.
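
For reference, a sketch of that conversion, assuming the array holds
Character values (the sample string is made up):

```swift
let str = "nai\u{308}ve cafe\u{301}"    // sample text with two combining marks
let chars = Array(str)                  // [Character], built in one linear pass

// Integer subscripts are now constant-time, at the cost of the extra copy.
let seventh = chars[6]
let part = String(chars[6..<10])
```

Each element is a whole grapheme cluster, so chars[9] is the accented "e"
together with its combining mark.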

-DW


Re: [swift-evolution] Strings in Swift 4

2017-02-06 Thread Ted F.A. van Gaalen via swift-evolution

> On 6 Feb 2017, at 19:10, David Waite  wrote:
> 
>> 
>> On Feb 6, 2017, at 10:26 AM, Ted F.A. van Gaalen via swift-evolution 
>>  wrote:
>> 
>> Hi Dave,
>> Oops! yes, you’re right!
>> I did read again more thoroughly about Unicode 
>> and how Unicode is handled within Swift...
>> -should have done that before I write something- sorry.  
>> 
>> Nevertheless: 
>> 
>> How about this solution:  (if I am not making other omissions in my thinking 
>> again) 
>> -Store the string as a collection of fixed-width 32 bit UTF-32 characters 
>> anyway.
>> -however, if the Unicode character is a grapheme cluster (2..n Unicode 
>> characters),then 
>> store a pointer to a hidden child string containing the actual grapheme 
>> cluster, like so:
>> 
>> 1: [UTF32, UTF32, UTF32, 1pointer,  UTF32, UTF32, 1pointer, UTF32, UTF32]
>>                             |                        |
>> 2:                   [UTF32, UTF32]     [UTF32, UTF32, UTF32, ...]
>> 
>> whereby (1) is aString as seen by the programmer.
>> and (2)  are hidden child strings, each containing a grapheme cluster. 
> 
> The random access would require a uniform layout, so a pointer and scalar 
> would need to be the same size. The above would work with a 32 bit platform 
> with a tagged pointer, but would require a 64-bit slot for pointers on 64-bit 
> systems like macOS and iOS.
> 
Yeah, I know that, but the “grapheme cluster pool” I am imagining
could be allocated at a certain predefined base address,
whereby the pointer I am referring to is just an offset from this base address.
If so, an address space of 2^30 bytes (1,073,741,824, i.e. 1 GB) would be
available, which is more than sufficient for just storing unique grapheme
clusters (of course, not taking into account other allocations and app
limitations).
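
A rough sketch of such a slot, assuming a 32-bit word whose top bit says
whether the remaining bits are a scalar or a pool offset. Slot and pool are
made-up names; this only shows the layout idea and says nothing about speed
or interoperability.

```swift
/// One 32-bit slot: either an inline Unicode scalar or an offset into a cluster pool.
struct Slot {
    private let bits: UInt32
    private static let tag: UInt32 = 1 << 31    // high bit marks a pool offset

    // Scalars are at most 0x10FFFF, so the tag bit is always clear here.
    init(scalar: Unicode.Scalar) { bits = scalar.value }
    // Assumes 0 <= poolOffset < 2^31; larger values would corrupt the tag.
    init(poolOffset: Int) { bits = Slot.tag | UInt32(poolOffset) }

    var scalar: Unicode.Scalar? {
        return (bits & Slot.tag) == 0 ? Unicode.Scalar(bits) : nil
    }
    var poolOffset: Int? {
        return (bits & Slot.tag) != 0 ? Int(bits & ~Slot.tag) : nil
    }
}

// Hypothetical pool holding one multi-scalar grapheme cluster.
let pool = ["e\u{301}"]
let slots = [Slot(scalar: "a"), Slot(poolOffset: 0), Slot(scalar: "b")]

// Every slot has the same size, so slots[n] is a constant-time lookup.
let slotSize = MemoryLayout<Slot>.size
```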
  
> Today when I need to do random access into a string, I convert it to an 
> Array. Hardly efficient memory-wise, but efficient enough for 
> random access.
> 
As a programmer, I just want to use String as-is, but with direct subscripting
like str[12..<34]
and, if possible, also with an open range like str[12…],
implemented natively in Swift.

Kind Regards
TedvG
www.tedvg.com 
www.ravelnotes.com 
 
> -DW



Re: [swift-evolution] Strings in Swift 4

2017-02-06 Thread David Waite via swift-evolution

> On Feb 6, 2017, at 10:26 AM, Ted F.A. van Gaalen via swift-evolution 
>  wrote:
> 
> Hi Dave,
> Oops! yes, you’re right!
> I did read again more thoroughly about Unicode 
> and how Unicode is handled within Swift...
> -should have done that before I write something- sorry.  
> 
> Nevertheless: 
> 
> How about this solution:  (if I am not making other omissions in my thinking 
> again) 
> -Store the string as a collection of fixed-width 32 bit UTF-32 characters 
> anyway.
> -however, if the Unicode character is a grapheme cluster (2..n Unicode 
> characters),then 
> store a pointer to a hidden child string containing the actual grapheme 
> cluster, like so:
> 
> 1: [UTF32, UTF32, UTF32, 1pointer,  UTF32, UTF32, 1pointer, UTF32, UTF32]
>                             |                        |
> 2:                   [UTF32, UTF32]     [UTF32, UTF32, UTF32, ...]
> 
> whereby (1) is a String as seen by the programmer.
> and (2)  are hidden child strings, each containing a grapheme cluster. 

The random access would require a uniform layout, so a pointer and scalar would 
need to be the same size. The above would work with a 32 bit platform with a 
tagged pointer, but would require a 64-bit slot for pointers on 64-bit systems 
like macOS and iOS.

Today when I need to do random access into a string, I convert it to an 
Array. Hardly efficient memory-wise, but efficient enough for random 
access.
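
A minimal sketch of the Array conversion described above, assuming nothing beyond the standard library (the sample text is made up for illustration). The conversion costs O(n) time and memory once; after that, each lookup by integer offset is O(1):

```swift
// Sample text containing a combining sequence: "e" + U+0301 forms one Character.
let text = "cafe\u{301} au lait"

// One-time O(n) conversion; each element is a full Character (grapheme cluster).
let chars = Array(text)

// Integer subscripting and slicing now work directly on the array.
let fourth = chars[3]               // the accented "e"
let word = String(chars[0..<4])     // the first four Characters as a String
```

Because the elements are Characters rather than bytes or scalars, the accented letter stays a single element.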

-DW


Re: [swift-evolution] Strings in Swift 4

2017-02-06 Thread Ted F.A. van Gaalen via swift-evolution

We know that:
The cumbersome complexity of current Swift String handling 
and programming is caused by the fact that Unicode characters 
are stored and processed as  streams/arrays with elements 
of variable-width (1...4 bytes for each character) Unicode characters.

Because of that, direct subscripting of string elements e.g. str[2..<18] 
is not possible. Therefore it was, and still is, not implemented in Swift,
much to the unpleasant surprise of many new Swift programmers 
coming from many other PLs like me. They did miss plain direct subscripting 
so much that the first thing ever they do before using Swift intensively is 
implementing the following or similar dreadful code (at least for direct 
subscripting),  and bury it deep into a String extension, once written, 
hopefully never to be seen again, like in this example: 

extension String
{
    // Returns the character at integer offset i, or "" when i is out of range.
    subscript(i: Int) -> String
    {
        guard i >= 0 && i < count else { return "" }
        return String(self[index(startIndex, offsetBy: i)])
    }

    // Half-open integer range; offsets past the end are clamped to endIndex.
    subscript(range: Range<Int>) -> String
    {
        let lowerIndex = index(startIndex, offsetBy: max(0, range.lowerBound),
                               limitedBy: endIndex) ?? endIndex
        let upperIndex = index(lowerIndex, offsetBy: range.upperBound - range.lowerBound,
                               limitedBy: endIndex) ?? endIndex
        return String(self[lowerIndex..<upperIndex])
    }

    // Closed integer range; same clamping, but includes the upper bound.
    subscript(range: ClosedRange<Int>) -> String
    {
        let lowerIndex = index(startIndex, offsetBy: max(0, range.lowerBound),
                               limitedBy: endIndex) ?? endIndex
        let upperIndex = index(lowerIndex, offsetBy: range.upperBound - range.lowerBound + 1,
                               limitedBy: endIndex) ?? endIndex
        return String(self[lowerIndex..<upperIndex])
    }
}

[splendid jolly good Earl Grey tea is now being served to help those 
flabbergasted to recover as quickly as possible.] 

This rather indirect and clumsy way of working with string data is because 
(with the exception of UTF-32 characters) Unicode characters come in 
variable-width encoding (1 to 4 bytes for each char), which as we know 
makes string handling for UTF-8, UTF-16 very complex and inefficient.
E.g. to isolate a substring it is necessary to sequentially 
traverse the string instead of direct access. 
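
To make the sequential traversal concrete, here is a small sketch using the index API of current Swift (the sample string is invented for illustration). Each call to index(_:offsetBy:) walks the string from the given position, which is where the linear cost comes from:

```swift
let str = "Hello, Swift strings"

// index(_:offsetBy:) steps through the string one Character at a time,
// so its cost grows with the offset.
let lower = str.index(str.startIndex, offsetBy: 7)
let upper = str.index(lower, offsetBy: 5)

// Slicing with String.Index values gives a Substring sharing the storage;
// String(...) makes an independent copy.
let slice = str[lower..<upper]
let copy = String(slice)
```

Once the two indices exist, the slice itself is cheap; the traversal is paid when computing the indices.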

However, that is not the case with UTF-32, because with UTF-32 encoding
each character has a fixed-width and always occupies exactly 4 bytes, 32 bit. 
Ergo: the problem can be easily solved: The simple solution is to always 
and without exception use UTF-32 encoding as Swift's internal 
string format because it only contains fixed width Unicode characters. 

Unicode strings with whatever UTF encoding as read into the program would 
be automatically converted to 32-bit UTF32 format. Note that explicit 
conversion 
e.g. back to UTF-8, can be specified or defaulted when writing Strings to a 
storage medium or URL etc. 

Possible but imho not recommended: The current String system could be pushed
down and kept alive (e.g. as Type StringUTF8?) as a secondary alternative to 
accommodate those that need to process very large quantities of text in core.


What y'all think?
Kind regards
TedvG
www.tedvg.com 
www.ravelnotes.com  
 


Re: [swift-evolution] Strings in Swift 4

2017-02-06 Thread Ted F.A. van Gaalen via swift-evolution
Hi Dave,
Oops! yes, you’re right!
I did read again more thoroughly about Unicode 
and how Unicode is handled within Swift...
-should have done that before I write something- sorry.  

Nevertheless: 

How about this solution:  (if I am not making other omissions in my thinking 
again) 
-Store the string as a collection of fixed-width 32 bit UTF-32 characters 
anyway.
-however, if the Unicode character is a grapheme cluster (2..n Unicode 
characters),then 
store a pointer to a hidden child string containing the actual grapheme 
cluster, like so:

1: [UTF32, UTF32, UTF32, 1pointer,  UTF32, UTF32, 1pointer, UTF32, UTF32]
                            |                        |
2:                   [UTF32, UTF32]     [UTF32, UTF32, UTF32, ...]

whereby (1) is a String as seen by the programmer.
and (2)  are hidden child strings, each containing a grapheme cluster. 

To make the distinction between a “plain” single UTF-32 char and a grapheme 
cluster, set the most significant bit of the 32-bit value to 1 and use the other 
31 bits as a pointer to another (hidden) String instance containing the grapheme 
cluster. 
In this way, one could then also make graphemes within graphemes, 
but that is probably not desired? Another solution is to store the grapheme 
clusters in a dedicated “grapheme pool”, containing the (unique, as in a Set) 
grapheme clusters encountered whenever a Unicode string (in whatever format) 
is read in or defined at runtime. 
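
A rough sketch of how such tagged 32-bit slots with a grapheme pool might look. All names here (ClusterPool, tagBit, encode, decode) are hypothetical, and a pool index stands in for the pointer; this only illustrates the idea and is not how Swift stores strings:

```swift
// Hypothetical pool: stores each multi-scalar grapheme cluster once.
struct ClusterPool {
    private(set) var clusters: [String] = []
    private var lookup: [String: UInt32] = [:]

    init() {}

    // Returns the pool index for a cluster, adding it only if it is new.
    mutating func intern(_ cluster: String) -> UInt32 {
        if let existing = lookup[cluster] { return existing }
        let index = UInt32(clusters.count)
        clusters.append(cluster)
        lookup[cluster] = index
        return index
    }
}

// Most significant bit set: the low 31 bits are a pool index, not a scalar.
let tagBit: UInt32 = 0x8000_0000

// Produces one fixed-width 32-bit slot per Character.
func encode(_ text: String, into pool: inout ClusterPool) -> [UInt32] {
    return text.map { character -> UInt32 in
        let scalars = character.unicodeScalars
        if scalars.count == 1, let only = scalars.first {
            return only.value                          // plain scalar, tag clear
        }
        return tagBit | pool.intern(String(character)) // tagged pool index
    }
}

// O(1) lookup of the Character stored at an integer offset.
func decode(_ offset: Int, in slots: [UInt32], pool: ClusterPool) -> Character {
    let slot = slots[offset]
    if slot & tagBit != 0 {
        return Character(pool.clusters[Int(slot & ~tagBit)])
    }
    // Untagged slots were written from valid scalars in encode.
    return Character(Unicode.Scalar(slot)!)
}
```

Note that encode still has to find the grapheme cluster boundaries first, which is the hard part mentioned just below.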

but then again.. seeing how hard it is to recognise Grapheme clusters in the 
first place.. 
? I don’t know. Unicode is complicated..  

Kind regards 
TedvG. 

www.tedvg.com 
www.ravelnotes.com 

> On 6 Feb 2017, at 05:15, Dave Abrahams  wrote:
> 
> 
> 
>> On Feb 5, 2017, at 2:57 PM, Ted F.A. van Gaalen  
>> wrote:
>> 
>> However, that is not the case with UTF-32, because with UTF-32 encoding
>> each character has a fixed-width and always occupies exactly 4 bytes, 32 
>> bit. 
>> Ergo: the problem can be easily solved: The simple solution is to always 
>> and without exception use UTF-32 encoding as Swift's internal 
>> string format because it only contains fixed width Unicode characters. 
> 
> Those are not (user-perceived) Characters; they are Unicode Scalar Values 
> (often called "characters" by the Unicode standard).  Characters as defined in 
> Swift (a.k.a. extended grapheme clusters) have no fixed-width encoding, and 
> Unicode scalar values are an inappropriate unit for most string processing. 
> Please read the manifesto for details.
> 
> Sent from my iPad



Re: [swift-evolution] Strings in Swift 4

2017-02-05 Thread Dave Abrahams via swift-evolution


> On Feb 5, 2017, at 2:57 PM, Ted F.A. van Gaalen  wrote:
> 
> However, that is not the case with UTF-32, because with UTF-32 encoding
> each character has a fixed-width and always occupies exactly 4 bytes, 32 bit. 
> Ergo: the problem can be easily solved: The simple solution is to always 
> and without exception use UTF-32 encoding as Swift's internal 
> string format because it only contains fixed width Unicode characters. 

Those are not (user-perceived) Characters; they are Unicode Scalar Values 
(often called "characters" by the Unicode standard).  Characters as defined in 
Swift (a.k.a. extended grapheme clusters) have no fixed-width encoding, and 
Unicode scalar values are an inappropriate unit for most string processing. 
Please read the manifesto for details.
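
The difference between Unicode scalar values and Characters can be checked directly. A small sketch using only standard library properties (the two sample strings are chosen for illustration):

```swift
// "e" followed by U+0301 COMBINING ACUTE ACCENT: two scalars, one Character.
let accented = "e\u{301}"

// Two regional-indicator scalars that render as a single flag.
let flag = "\u{1F1F3}\u{1F1F1}"

let accentedCharacters = accented.count               // 1
let accentedScalars = accented.unicodeScalars.count   // 2
let flagCharacters = flag.count                       // 1
let flagScalars = flag.unicodeScalars.count           // 2

// In UTF-32 every scalar is one 32-bit unit, so the flag needs two units.
let flagUTF32Units = flag.unicodeScalars.map { $0.value }
```

So even a fixed-width UTF-32 buffer does not give one slot per user-perceived character.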

Sent from my iPad


Re: [swift-evolution] Strings in Swift 4

2017-02-05 Thread Ben Cohen via swift-evolution


> On Feb 5, 2017, at 02:22, Jonathan Hull via swift-evolution 
>  wrote:
> 
>>>>> Just out of curiosity, what are the use-cases for an infinite sequence
>>>>> (as opposed to a sequence which is bounded to the type’s representable
>>>>> values)?
>>>> 
>>>> 1. The type may not have an inherent expressible bound (see BigInt,
>>>>   UnsafePointer, and *many* real-life Index types).
>>> 
>>> If I understand what you are saying, this is why I was arguing that
>>> these partial ranges should not, by themselves, conform to Sequence.
>>> In cases where you do have a natural expressible bound, then it should
>>> conditionally conform to sequence.  
>> 
>> I'm sorry, I don't understand that last sentence.  The only way to
>> conditionally conform in one particular case is to make the conformance
>> condition include detection of that case. I could guess at what you mean
>> but I'd rather you explain yourself.  Specifically, which partial ranges
>> should conform to Sequence, and which shouldn't, in your view?
> 
> Oh, sorry, I was referring to an earlier post I made in this thread, but 
> didn’t make that clear.  I’ll give a longer explanation of my reasoning 
> behind the statement that partial ranges shouldn’t conform to Sequence (on 
> their own) below... but the short version is that the conformance, like cake, 
> is a lie... and we are using the trap to lie to ourselves about it.
> 
> 
>>> In other cases, you should have to write that conformance yourself if
>>> you want it.
>> 
>> You've asserted that things should be a certain way (which I don't fully
>> understand but I hope you'll explain), but you haven't said anything to
>> clarify *why* you think they should be that way.
> 
> I believe that the ill-definedness/fuzziness of this issue that others have 
> alluded to and the desire to trap is telling us that we are fighting the 
> underlying structure of the types.  I get that trapping can save us from an 
> entire class of nasty bugs, but I also believe that it should be a last 
> resort in design because it has a tendency to let us hide subtle design 
> mistakes from ourselves.
> 
> If we continue down this path, we will find more and more edge cases that 
> need to be papered over, until the design eventually needs to be scrapped or 
> buttressed with compiler magic.  For example, let’s say we define a 
> PartialRange as a simple struct with optional endpoints, then we conform it 
> to Sequence.  Now we can say ’n…’ and get a sequence.  But what happens when 
> we say ‘…n’?  

My opinion: ...n should definitely not be a Sequence, and I think any design 
for one-sided ranges needs to ensure this in the type system not at runtime. 
There might be practical benefits for not doing this, but I think the bar for 
accepting that trade-off should be high.

> Where does that sequence start if it has an implicit start point of negative 
> infinity?  Well, we promised a Sequence, so our only real option is to trap 
> (for some types anyway).  For UInt, ‘…n’ is a perfectly valid sequence

...n is not a valid sequence regardless of the type for n. Just because some 
types might have a "well, I guess that makes sense" value for the lower bound 
doesn't mean that's the appropriate choice for all uses. I'm not sure why you 
think UInt has a valid value for ...n but Int doesn't. Why is 0 any more 
special than Int.min?

> (though we could trap on that as well for consistency’s sake and say that you 
> have to say '0…n’).  We are using trapping to let us ignore the fact that we 
> are promising something (conformance to Sequence) that isn’t really true...

In what way is it not true? Things conform to Sequence when they have an 
implementation of makeIterator() that gives you a value (in O(1) time) that 
yields values via next() based on some sensible documented behavior (though 
some protocols like Collection impose some further rules on their Sequence 
conformance). We are not seeking some higher truth here, we are building useful 
tools.
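
As a minimal sketch of what that conformance amounts to, here is a hypothetical CountingUp type (not the standard library's range type) whose iterator starts at a lower bound and increments indefinitely:

```swift
// Hypothetical type for illustration only.
struct CountingUp: Sequence {
    let lowerBound: Int

    struct Iterator: IteratorProtocol {
        var current: Int

        // Yields the current value, then advances. Advancing past Int.max
        // traps, because Int's += traps on overflow.
        mutating func next() -> Int? {
            defer { current += 1 }
            return current
        }
    }

    // O(1): only captures the starting value.
    func makeIterator() -> Iterator {
        return Iterator(current: lowerBound)
    }
}

// The caller decides when to stop; prefix(_:) takes a finite number of values.
let firstThree = Array(CountingUp(lowerBound: 5).prefix(3))
```

The sequence itself has no opinion on trapping; the trap comes from the arithmetic of the bound type.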

> In this case, I believe we are trying to shoehorn a number of different 
> fuzzily defined requirements that almost, but don’t quite fit together:
> 
> 1) RangeExpressions should work where Ranges work now
> 2) We want to be able to slice a collection using ’n…’ and ‘…n’, and the 
> implicit boundary should be filled in by the collection
> 3) We want to be able to create a sequence from ’n…’ that can be zipped and 
> iterated over

These definitions don't seem fuzzy to me, they seem sufficiently precise, 
especially when you put back the part you dropped from my original definition 
for 3: "The behavior of that sequence’s iterator is that it starts at the lower 
bound, and increments indefinitely." 1 is an implementation detail IMO rather 
than a functional goal.

> Looking at our requirements above, let’s start with #1.  Ranges (which are 
> countable) conform to Collection.  We keep saying we need these to conform to 
> Sequence (because of 

Re: [swift-evolution] Strings in Swift 4

2017-02-05 Thread Jonathan Hull via swift-evolution
>>>> Just out of curiosity, what are the use-cases for an infinite sequence
>>>> (as opposed to a sequence which is bounded to the type’s representable
>>>> values)?
>>> 
>>> 1. The type may not have an inherent expressible bound (see BigInt,
>>>  UnsafePointer, and *many* real-life Index types).
>> 
>> If I understand what you are saying, this is why I was arguing that
>> these partial ranges should not, by themselves, conform to Sequence.
>> In cases where you do have a natural expressible bound, then it should
>> conditionally conform to sequence.  
> 
> I'm sorry, I don't understand that last sentence.  The only way to
> conditionally conform in one particular case is to make the conformance
> condition include detection of that case. I could guess at what you mean
> but I'd rather you explain yourself.  Specifically, which partial ranges
> should conform to Sequence, and which shouldn't, in your view?

Oh, sorry, I was referring to an earlier post I made in this thread, but didn’t 
make that clear.  I’ll give a longer explanation of my reasoning behind the 
statement that partial ranges shouldn’t conform to Sequence (on their own) 
below... but the short version is that the conformance, like cake, is a lie... 
and we are using the trap to lie to ourselves about it.


>> In other cases, you should have to write that conformance yourself if
>> you want it.
> 
> You've asserted that things should be a certain way (which I don't fully
> understand but I hope you'll explain), but you haven't said anything to
> clarify *why* you think they should be that way.

I believe that the ill-definedness/fuzziness of this issue that others have 
alluded to and the desire to trap is telling us that we are fighting the 
underlying structure of the types.  I get that trapping can save us from an 
entire class of nasty bugs, but I also believe that it should be a last 
resort in design because it has a tendency to let us hide subtle design 
mistakes from ourselves.

If we continue down this path, we will find more and more edge cases that need 
to be papered over, until the design eventually needs to be scrapped or 
buttressed with compiler magic.  For example, let’s say we define a 
PartialRange as a simple struct with optional endpoints, then we conform it to 
Sequence.  Now we can say ’n…’ and get a sequence.  But what happens when we 
say ‘…n’?  Where does that sequence start if it has an implicit start point of 
negative infinity?  Well, we promised a Sequence, so our only real option is to 
trap (for some types anyway).  For UInt, ‘…n’ is a perfectly valid sequence 
(though we could trap on that as well for consistency’s sake and say that you 
have to say '0…n’).  We are using trapping to let us ignore the fact that we 
are promising something (conformance to Sequence) that isn’t really true...

In this case, I believe we are trying to shoehorn a number of different fuzzily 
defined requirements that almost, but don’t quite fit together:

1) RangeExpressions should work where Ranges work now
2) We want to be able to slice a collection using ’n…’ and ‘…n’, and the 
implicit boundary should be filled in by the collection
3) We want to be able to create a sequence from ’n…’ that can be zipped and 
iterated over

Looking at our requirements above, let’s start with #1.  Ranges (which are 
countable) conform to Collection.  We keep saying we need these to conform to 
Sequence (because of requirement 3), but I think what we really want/mean in a 
lot of cases is conformance to Collection.  The problem is that we don’t quite 
have enough information available to do that, so PartialRanges cannot be used 
in generic functions on Collection.  This is exposing a weakness of our 
Sequence/Collection protocols around infinite Sequences/Collections that we 
talked about a few months back, but ran out of time to solve (and I didn’t have 
bandwidth at the time to work on it).

As others have pointed out, 2 & 3 are similar, but have slightly different 
semantics.  In the case of 2, we have an implicit boundary being filled in: 
’n…’ means a range from ’n’ up to that boundary. We have explicitly stated we 
don’t want it to trap here, because that would make it useless.  In the case of 
3, we are giving it a different meaning: ’n…’ means starting with n, count 
higher until you run out of numbers… then, crash.  Here you are explicitly 
asking for trapping behavior.  These appear very similar at first glance, but 
need to be thought about subtly differently. For example, we have to be 
extremely careful using #3 with functional map/filter/reduce, and things like 
‘count’ would instantly crash (how do you count an infinity?).  We need to be 
really careful with #3 in general, because it breaks so many of the implicit 
guarantees around Collection (thus fulfilling #3, stops us from fulfilling #1). 
As I showed above, it also breaks the guarantees for Sequence when the starting 
bound is missing.


What I am suggesting is a small shift 

Re: [swift-evolution] Strings in Swift 4

2017-02-04 Thread Dave Abrahams via swift-evolution

on Sat Feb 04 2017, Saagar Jha  wrote:

> Sorry, it looks like I left you hanging on this–luckily I found it when I was 
> cleaning my inbox.
>
> Overall, I believe the issue I have with the Swift String indexing
> model is that indices cannot be operated on like an Int can–you can
> multiply, divide, square, whatever you want on integer indices, while
> String.Index only allows for what is essentially addition and
> subtraction. Now, I get that these operations may not make sense on
> most Strings; the existing API covers them well. However, there are
> cases where these operations would be convenient, such as when
> dealing with fixed-length records or tables of data; almost invariably
> these are stored as ASCII. Thus, for these cases, I believe that there
> should be some way to let String know that we are dealing with
> something that is purely ASCII, so that it can allow us to use these
> operations in an efficient manner (for example, having an optional
> .asciiString property that conforms to RandomAccess; since I don’t
> believe that extendedASCII does). 

We could decide to make it random access at the cost of ruling out some
less-used but still-significant encodings of the string's backing store,
such as Shift-JIS.  I personally am unconvinced that the marginal extra
convenience gained by random access to extendedASCII would be worth the
loss of the ability to operate directly on such encodings.

> Such an API would keep the existing String paradigm, which is what is
> needed most of the time, but allowing for random access when the data
> can be guaranteed to support it.

We can easily make an ASCIIString that conforms to Unicode and provides
RandomAccessCollection conformance to all of its views.  That random
access would not be preserved **in the type system** when ASCIIString is
wrapped in a String—the String's ExtendedASCIIView would only conform to
BidirectionalCollection—but the underlying efficiency characteristics
*would* be preserved, dynamically.
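
A rough sketch of what such an ASCII-only type could look like. The name ASCIIText and its field(_:width:) helper are hypothetical, and this is only an illustration of random access over ASCII storage, not the design proposed in the manifesto:

```swift
// Hypothetical wrapper for illustration; not a standard library type.
struct ASCIIText: RandomAccessCollection {
    private let bytes: [UInt8]

    // Fails unless every UTF-8 code unit is in the ASCII range.
    init?(_ text: String) {
        let units = Array(text.utf8)
        guard units.allSatisfy({ $0 < 0x80 }) else { return nil }
        bytes = units
    }

    var startIndex: Int { return 0 }
    var endIndex: Int { return bytes.count }

    // O(1) access by integer position.
    subscript(position: Int) -> UInt8 {
        return bytes[position]
    }

    // Extracts one field from fixed-width records using plain integer
    // arithmetic on offsets.
    func field(_ number: Int, width: Int) -> String {
        let start = number * width
        return String(decoding: bytes[start..<start + width], as: UTF8.self)
    }
}
```

With Int as the index type, multiplication and division on positions work as for any array, which is the fixed-length-record use case mentioned above.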

> I’m not sure if I’m getting my point across, please do let me know if
> you don’t quite get what I mean.

I'm pretty sure I get what you mean.  Let me know if you don't think so.

> Saagar Jha
>
>> On Jan 20, 2017, at 5:55 PM, Ben Cohen  wrote:
>> 
>> 
>>> On Jan 20, 2017, at 2:58 PM, Saagar Jha via swift-evolution wrote:
>>> 
>>> Sorry if I wasn’t clear; I’m looking for indexing using Int, instead of 
>>> using formIndex.
>> 
>> 
>> Question: why do you think integer indices are so desirable?
>> 
>> Integer indexing is simple, but also encourages anti-patterns
>> (tortured open-coded while loops with unexpected fencepost errors,
>> conflation of positions and distances into a single type) and our
>> goal should be to make most everyday higher-level operations, such
>> as finding/tokenizing, so easy that Swift programmers don’t feel
>> they need to resort to loops as often.
>> 
>> Examples where formIndex is so common yet so cumbersome that it
>> would be worth efforts to create integer-indexed versions of string
>> might be indicators of important missing features on our collection
>> or string APIs. So do pass them along.
>> 
>> (There are definitely known gaps in them today – slicing needs
>> improving as the manifesto mentions for things like slices from an
>> index to n elements later. Also, we need support for in-place
>> remove(where:) operations. But the more commonly needed cases we
>> know about that aren’t covered, the better)
>> 
>> 
>

-- 
-Dave



Re: [swift-evolution] Strings in Swift 4

2017-02-04 Thread Saagar Jha via swift-evolution
Sorry, it looks like I left you hanging on this–luckily I found it when I was 
cleaning my inbox.

Overall, I believe the issue I have with the Swift String indexing model is 
that indices cannot be operated on like an Int can–you can multiply, divide, 
square, whatever you want on integer indices, while String.Index only allows 
for what is essentially addition and subtraction. Now, I get that these 
operations may not make sense on most Strings; the existing API covers them 
well. However, there are cases where these operations would be convenient, 
such as when dealing with fixed-length records or tables of data; almost 
invariably these are stored as ASCII. Thus, for these cases, I believe that 
there should be some way to let String know that we are dealing with something 
that is purely ASCII, so that it can allow us to use these operations in an 
efficient manner (for example, having an optional .asciiString property that 
conforms to RandomAccess; since I don’t believe that extendedASCII does). Such 
an API would keep the existing String paradigm, which is what is needed most of 
the time, but allowing for random access when the data can be guaranteed to 
support it.

I’m not sure if I’m getting my point across, please do let me know if you don’t 
quite get what I mean.

Saagar Jha

> On Jan 20, 2017, at 5:55 PM, Ben Cohen  wrote:
> 
> 
>> On Jan 20, 2017, at 2:58 PM, Saagar Jha via swift-evolution wrote:
>> 
>> Sorry if I wasn’t clear; I’m looking for indexing using Int, instead of 
>> using formIndex.
> 
> 
> Question: why do you think integer indices are so desirable?
> 
> Integer indexing is simple, but also encourages anti-patterns (tortured 
> open-coded while loops with unexpected fencepost errors, conflation of 
> positions and distances into a single type) and our goal should be to make 
> most everyday higher-level operations, such as finding/tokenizing, so easy 
> that Swift programmers don’t feel they need to resort to loops as often.
> 
> Examples where formIndex is so common yet so cumbersome that it would be 
> worth efforts to create integer-indexed versions of string might be 
> indicators of important missing features on our collection or string APIs. So 
> do pass them along.
> 
> (There are definitely known gaps in them today – slicing needs improving as 
> the manifesto mentions for things like slices from an index to n elements 
> later. Also, we need support for in-place remove(where:) operations. But the 
> more commonly needed cases we know about that aren’t covered, the better)
> 
> 



Re: [swift-evolution] Strings in Swift 4

2017-02-04 Thread Jonathan Hull via swift-evolution

> On Feb 2, 2017, at 2:19 PM, Dave Abrahams  wrote:
> 
> 
> on Thu Feb 02 2017, Jonathan Hull  wrote:
> 
>> Just out of curiosity, what are the use-cases for an infinite sequence
>> (as opposed to a sequence which is bounded to the type’s representable
>> values)?
> 
> 1. The type may not have an inherent expressible bound (see BigInt,
>   UnsafePointer, and *many* real-life Index types).

If I understand what you are saying, this is why I was arguing that these 
partial ranges should not, by themselves, conform to Sequence.  In cases where 
you do have a natural expressible bound, then it should conditionally conform 
to sequence.  In other cases, you should have to write that conformance 
yourself if you want it.


> 
> 2. I keep repeating variants of this example:
> 
>   func listElements<
>     S: Sequence, N: Number
>   >(of s: S, numberedFrom start: N) {
>     for (n, e) in zip(start..., s) {
>       print("\(n). \(e)")
>     }
>   }
> 
>  which avoids incorrect behavior when N turns out to be a type that
>  can't represent values high enough to list everything in s—**if and
>  only if** `start...` is an unbounded range, rather than one that
>  implicitly gets its upper bound from its type.

I really worry about the possibility of long-term foot-gunning in this case.  I 
showed this to a friend who maintains old/abandoned codebases for a living, and 
he said “Oh god, that isn’t going to actually crash until 5 years in, when it 
gets passed some rare type of file… and by then the original programmer will 
have left.”  He hit on exactly what I had been feeling as well (but hadn’t been 
able to put into words until I heard his).  The thought is that this will help 
the programmer find an error by trapping, but because it is dependent on the 
interactions of two subsystems, it will often create one of those crashes where 
the stars have to align for it to happen (which are my least favorite type of 
bug/crash to debug).
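
For reference, a runnable variant of the quoted function, assuming the one-sided range syntax as it later shipped in Swift 4. The Number protocol from the quoted code is swapped for a Strideable constraint, and the function is renamed numberedLines and returns its lines instead of printing them; both changes are mine, for illustration:

```swift
// `start...` is unbounded: it takes no upper bound from the type N.
// The Stride constraint is what makes the one-sided range iterable.
func numberedLines<S: Sequence, N: Strideable>(
    of s: S, numberedFrom start: N
) -> [String] where N.Stride: SignedInteger {
    return zip(start..., s).map { "\($0.0). \($0.1)" }
}

let lines = numberedLines(of: ["a", "b", "c"], numberedFrom: 1)
```

If N is a small integer type and s is long enough, iteration traps around the point where N runs out of representable values instead of silently truncating the list, which is exactly the behaviour under debate here.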


I think this example also shows how my suggestion of requiring an extra 
protocol for conformance to Sequence would actually be much more effective in 
preventing the programmer error…

You would not have been able to write the function the way you did, because 
writing ‘start…’ as a sequence would have required the addition of ‘where 
N:FiniteComparable’ (or whatever we call the protocol providing natural 
bounds).  In having to write that N needs bounds, you are much more likely to 
be thinking of what those bounds might be.  The problem here is really with the 
design of the listElements function in allowing/encouraging that error, as 
opposed to the caller... and the trap is more likely to make you look at the 
call site because of the context of when it happens. Having to write the 
‘where’ statement makes you reconsider the design of the function, again 
because of the context of when it happens.

Thanks,
Jon



Re: [swift-evolution] Strings in Swift 4

2017-02-03 Thread Dave Abrahams via swift-evolution

on Thu Feb 02 2017, Xiaodi Wu  wrote:

> On Thu, Feb 2, 2017 at 9:45 AM, Dave Abrahams via swift-evolution <
> swift-evolution@swift.org> wrote:
>
> If indeed the desired semantics for ranges is that they should continue to
> lack precise semantics, then an agreement that we are going into this
> deliberately and clear documentation to that effect is the next best thing,
> I suppose.

Practically speaking, using the type system to separate ranges that are
collections from the ones that merely bound some values would not be
helpful to anyone, IMO.  If that's what you mean by saying ranges have
imprecise semantics, then that's what I, at least, desire.

A more powerful way to look at it, IMO, is that ranges unambiguously
represent one or two bounds on comparable values, and there are
operations that combine those bounds with other entities (for loops,
collections whose indices match the ranges) in useful ways.
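
A short sketch of that view, using the one-sided range types and RangeExpression.relative(to:) as they later shipped in Swift 4 (the sample array is made up). The range carries only the bound that was written; the collection supplies the other one when the two are combined:

```swift
let letters = ["a", "b", "c", "d", "e"]

// Each one-sided range stores only the bound that was written.
let from2 = 2...        // lower bound only
let upTo2 = ..<2        // upper bound only

// The collection fills in the missing bound when the two are combined.
let resolvedFrom = from2.relative(to: letters)   // 2..<5
let resolvedUpTo = upTo2.relative(to: letters)   // 0..<2

let tail = Array(letters[from2])    // ["c", "d", "e"]
let head = Array(letters[upTo2])    // ["a", "b"]
```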

-- 
-Dave



Re: [swift-evolution] Strings in Swift 4

2017-02-03 Thread Jordan Rose via swift-evolution
A bit late, but I agree with Ben on all of this. We could have separate 
IncompleteRange and InfiniteRange types, and have a consistent world, and even 
figure out how to get the context-less case to pick InfiniteRange. But we don’t 
have to, and I don’t think it buys us anything because—as was said—the two 
kinds of ranges are never used in the same context. There’s no overload madness.

On the specific point that `for i in 0…` behaves differently than `for elem in 
arr[0…]`, I think that’s just picking one interpretation of “arr[0…]” and then 
(legitimately) claiming it leads to nonsense. I see no problem saying that a 
subscript for an IncompleteRange, or LowerBoundedRange, or whatever gives you 
the elements starting at that index, but that trying to iterate such a range 
(when Countable) gives you unlimited iteration. I think this is more a 
theoretical concern than a practical one.
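
The two interpretations side by side, sketched with the syntax as it later shipped in Swift 4 (array contents and the break condition are arbitrary):

```swift
let arr = [10, 20, 30]

// As a subscript argument, 1... means "from index 1 to the end of arr".
let sliced = Array(arr[1...])

// Iterated on its own, the same kind of range just keeps counting
// until the loop is left.
let naturals = 0...
var visited: [Int] = []
for i in naturals {
    if i == 4 { break }
    visited.append(i)
}
```

The range value is the same kind of thing in both places; only the operation consuming it differs.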

Jordan


> On Feb 1, 2017, at 10:37, Ben Cohen via swift-evolution 
>  wrote:
> 
> 
> I think Dave has already made these points separately but it probably helps 
> to recap more explicitly the behavior we expect from “x…"
> 
> Names are just for discussion purposes, exact naming can be left as a 
> separate discussion from functionality:
> 
> - Informally, one-sided ranges are a thing. Formally, there are lower-bounded 
> one-sided ranges, which are created with a postfix ... operator. For now, 
> let’s just fully understand them and come back to upper-bounded ranges later.
> - Collections will have a subscript that takes a one-sided range and returns 
> a SubSequence. The behavior of that subscript is that the collection "fills 
> in" the “missing” side with it’s upper/lower bound and uses that two-sided 
> range to return a slice.*
> - When the Bound type of a lower-bounded range is countable, it will conform 
> to Sequence.** The behavior of that sequence’s iterator is that it starts at 
> the lower bound, and increments indefinitely.
> - One-sided ranges should have ~= defined for use with switch statements, 
> where any value above the bound for a lower-bounded range would return true.
> 
> * implementation detail: collections would probably have a generic subscript 
> taking a type conforming to RangeExpression, and that protocol would have a 
> helper extension for filling in the missing range given a collection
> **one-sided ranges ought in fact to be infinite Collections… a concept that 
> needs a separate thread
> 
> With that defined:
> 
>> On Feb 1, 2017, at 5:02 AM, Xiaodi Wu via swift-evolution wrote:
>> 
>> I entirely agree with you on the desired behavior of `zip(...)`.
>> 
>> However, if you insist on 0... being notionally an infinite range, then you 
>> would have to insist on `for i in 0...` also trapping. Which is not a big 
>> deal, IMO, but will surely make the anti-trapping crowd upset.
>> 
> 
> Certainly, for i in 0… would trap once the iterator needs to increment past 
> Int.max. If you break out of the loop before this happens, it won’t trap. The 
> statement is the moral equivalent of a C-style for(i = 0; /*nothing*/ ; ++i).
> 
> If the anti-trapping crowd are upset, they should be upset with Int, not this 
> range type. The range has no opinion on trapping – its iterator just 
> increments its Bounds type when asked to.
> 
>> The bigger issue is that either you must have a lenient/clamping subscript 
>> for `arr[0...]` or it too must trap, which is not desired. 
> 
> Based on the definition I give above, arr[0…] means “from 0 up to the 
> endIndex of arr”. This will not trap.
> 
> (there is a legitimate quibble here that this will translate into 
> arr[0.. invalid. But 0..< is ugly so we should ignore this quibble for the sake of 
> aesthetics)
> 
>> However, if `arr[0...]` is clamping, then `[1, 2, 3][100...]` would not trap 
>> and instead give you `[]`.
>> 
> 
> It should trap, because 100.. slicing this array.
> 
>> If 0... is regarded as an incomplete range, your example of `zip(...)` could 
>> still trap as desired. It would trap on the notional attempt to assign 
>> someArray.count to IncompleteRange.inferredUpperBound if count exceeds 
>> T.max. 
> 
> It’s unclear to me whether you are trying to formally define 
> .inferredUpperBound as a property you expect to exist, or if you’re using it 
> as informal shorthand. But there is no implied upper bound to a one-sided 
> lower-bounded range, only an actual lower bound. Operations taking 
> lower-bounded ranges as arguments can infer their own upper bound from 
> context, or not, depending on the functionality they need.
> 
> As an example of an alternative inferred upper bound: suppose you want to 
> define your own ordering for ranges, for sorting/display purposes. You decide 
> on a lexicographic ordering first by lower then upper bound. You want 
> one-sided ranges to 

Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Xiaodi Wu via swift-evolution
On Thu, Feb 2, 2017 at 9:45 AM, Dave Abrahams via swift-evolution <
swift-evolution@swift.org> wrote:

>
>
> Sent from my iPad
>
> On Feb 2, 2017, at 7:11 AM, Matthew Johnson 
> wrote:
>
>
>
> Furthermore, we emphatically do *not* need to make the distinction you
> claim between “infinite” and “incomplete” ranges, which *is* needless
> hairsplitting.
>
>
> Strongly disagree, unless you can describe the semantics of the type
> WITHOUT giving it different semantics depending on how it is used.
>
>
> This is the point that convinced me.  I’m going to take a closer look at
> Brent’s `RangeExpression` design which I must admit I only skimmed in the
> earlier discussion.
>
>
> *We already have exactly this situation* with CountableRange (which will
> merge with Range when conditional conformances land).  When used as a
> Collection, it means "every index value starting with the lowerBound and
> ending just before the upperBound".  When used for slicing, it means,
> roughly, "take every one of the collection's indices that are in bounds."
>  These are *not* the same thing.  A collection's indices* need not
> include every expressible value of the Index type between startIndex and
> endIndex*.
>

Now this is a reasonably convincing argument.

However, the conflation you describe surely comes at a price. I would bet
that if you polled ordinary Swift users, many would assume that being able
to write `myValue[startIndex.. The whole point of the name *RangeExpression* is to acknowledge this
> truth: ranges in Swift are bits of syntax whose meaning is given partly by how
> they are used.
>

If indeed the desired semantics for ranges is that they should continue to
lack precise semantics, then an agreement that we are going into this
deliberately and clear documentation to that effect is the next best thing,
I suppose.


> In fact, now that I say it, in that respect ranges are not all that
> different from any other type: the meaning of a Double or an Array or a
> Bool is also interpreted by the methods to which it is passed, and can have
> completely different results depending on that context.
>

I'm not sure I understand this comment. Surely the semantic meaning of a
Double is not any more or less fluid than the semantics of the number being
modeled (for instance, 42), nor that of a Bool any more or less fluid than
the semantics of truth or falsity. But we are getting far afield here.

What I'm aiming at is that the proposed "one-sided ranges" are fuzzy on
what it is they are modeling. Now, if the community decides that this
ambiguity is a desirable thing, then so be it. I happen to think it isn't
so desirable. But in any case the decision ought to be made on the basis
that the ambiguity is worth it when exchanged for
[intuitiveness/expressiveness/whatever other advantages], not on denying
that there is ambiguity at all.

chillaxing-ly y'rs,
>
> Dave
>
> ___
> swift-evolution mailing list
> swift-evolution@swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution
>
>
___
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Dave Abrahams via swift-evolution

on Thu Feb 02 2017, Jonathan Hull  wrote:

> Just out of curiosity, what are the use-cases for an infinite sequence
> (as opposed to a sequence which is bounded to the type’s representable
> values)?

1. The type may not have an inherent expressible bound (see BigInt,
   UnsafePointer, and *many* real-life Index types).

2. I keep repeating variants of this example:

  func listElements<
    S: Sequence, N: Number
  >(of s: S, numberedFrom start: N) {
    for (n, e) in zip(start..., s) {
      print("\(n). \(e)")
    }
  }

  which avoids incorrect behavior when N turns out to be a type that
  can't represent values high enough to list everything in s—**if and
  only if** `start...` is an unbounded range, rather than one that
  implicitly gets its upper bound from its type.
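
A rough sketch of the example above, written against Swift as it later shipped. Two things here are assumptions rather than part of the original message: the `Number` constraint is swapped for `Strideable` with an integer stride (which is what an unbounded range needs in order to be iterated), and the function returns its lines instead of printing them so the result can be inspected.

```swift
// Sketch only: `N: Strideable` stands in for the `Number` protocol
// mentioned in the message above.
func listElements<S: Sequence, N: Strideable>(
    of s: S, numberedFrom start: N
) -> [String] where N.Stride: SignedInteger {
    var lines: [String] = []
    // zip stops as soon as `s` runs out, so the unbounded side
    // is never asked for more values than there are elements.
    for (n, e) in zip(start..., s) {
        lines.append("\(n). \(e)")
    }
    return lines
}

let lines = listElements(of: ["a", "b", "c"], numberedFrom: 1)
```

The point of the design is visible in the signature: nothing about `N` supplies an upper bound, so the numbering is limited only by how many elements `s` produces.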

-- 
-Dave


Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Erica Sadun via swift-evolution

> On Feb 2, 2017, at 12:35 PM, Erica Sadun via swift-evolution 
>  wrote:
> 
> 
>> On Feb 2, 2017, at 8:58 AM, Jonathan Hull via swift-evolution 
>>  wrote:
>> 
>> Just out of curiosity, what are the use-cases for an infinite sequence (as 
>> opposed to a sequence which is bounded to the type’s representable values)?
>> 
>> Thanks,
>> Jon
> 
> Now that drop(while:) and prefix(while:) have dropped, they're a lot nicer 
> than having to do sequence(first: 1, next: { $0 + 1 })
> 
> -- E

Actually after thinking about that, I withdraw that suggestion.

-- E



Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Matthew Johnson via swift-evolution

> On Feb 2, 2017, at 1:19 PM, Dave Abrahams via swift-evolution 
>  wrote:
> 
> 
> on Thu Feb 02 2017, Matthew Johnson  wrote:
> 
>>> On Feb 2, 2017, at 9:45 AM, Dave Abrahams 
>> wrote:
>>> 
>>> 
>>> 
>>> 
>> 
>>> 
>>> On Feb 2, 2017, at 7:11 AM, Matthew Johnson wrote:
>>> 
>>>>>> Furthermore, we emphatically do *not* need to make the
>>>>>> distinction you claim between “infinite” and “incomplete” ranges,
>>>>>> which *is* needless hairsplitting.
>>>>> 
>>>>> Strongly disagree, unless you can describe the semantics of the type 
>>>>> WITHOUT giving it different semantics depending on how it is used.
>>>> 
>>>> This is the point that convinced me.  I’m going to take a closer
>>>> look at Brent’s `RangeExpression` design which I must admit I only
>>>> skimmed in the earlier discussion.
>>> 
>>> We already have exactly this situation with CountableRange (which
>>> will merge with Range when conditional conformances land).  When
>>> used as a Collection, it means "every index value starting with the
>>> lowerBound and ending just before the upperBound".  When used for
>>> slicing, it means, roughly, "take every one of the collection's
>>> indices that are in bounds.”
>> 
>> I don’t see how the behavior of the following code means roughly “take
>> every one of the collection’s indices that are in bounds”.  Can you
>> elaborate?
>> 
>> let range = 0..<20
>> let array = [1, 2, 3]
>> let slice = array[range] // trap on index out of bounds
> 
> “Roughly” means “I'm leaving out the part that it's a precondition that
> any (explicit) bound must be a valid index in the collection.”  
> 
>>> These are not the same thing.  A collection's indices need not
>>> include every expressible value of the Index type between startIndex
>>> and endIndex.
>> 
>> Sure, but it does appear as if the behavior of slicing assumes that
>> the upper and lower bounds of the range provided are valid indices.
> 
> Properly stated, it's a precondition.  This is a design decision.  We could
> have made slicing lenient, like Python's:
> 
>  >>> [1, 2][50]
> 
>  >>> [1, 2][5:10]
>  []
> 
> We thought that in Swift it was more important to catch potential errors
> than to silently accept
> 
>[1, 2][5..<10]
> 
> we still have the option to loosen slicing so it works that way, but I
> am not at all convinced it would be an improvement.

I agree with the current behavior as it’s consistent with indexing.

Lenient slicing is in the same space as lenient indexing.  It’s easy enough to 
add lenient wrappers around the standard library’s trapping implementation.
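
To make the "lenient wrapper" idea concrete, here is a minimal sketch. The `lenient:` label and the clamping rule are hypothetical choices for illustration; nothing like this is part of the standard library.

```swift
extension Array {
    /// Hypothetical lenient slicing: out-of-range bounds are clamped to
    /// the array's valid index range instead of trapping.
    subscript(lenient range: Range<Int>) -> ArraySlice<Element> {
        // Swift.min/Swift.max are spelled out so the free functions are
        // used rather than Array's own min()/max() methods.
        let lower = Swift.min(Swift.max(range.lowerBound, startIndex), endIndex)
        let upper = Swift.min(Swift.max(range.upperBound, lower), endIndex)
        return self[lower..<upper]
    }
}

let empty = [1, 2][lenient: 5..<10]      // clamps to an empty slice
let tail = [1, 2, 3][lenient: 1..<10]    // clamps the upper bound
```

The labeled subscript sits on top of the trapping one, which matches the suggestion that the lenient variant, not the default, should carry the label.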

If we ever decided to have both variants one would have to include a label and 
I don’t think anybody would support labeling the trapping variant (for many 
reasons).

> 
>> Xiaodi had convinced me that a one-sided range must take a position on
>> whether or not it has an infinite upper bound in order to make sense,
>> but now you’ve changed my mind back.
> 
> Phew!  To be clear:
> 
>  A one-sided range is *partial*.  It *has no* upper bound.

Ha!  I actually used the term “partial range" earlier.  :)

The distinction between having an infinite upper bound and having no upper 
bound, but still conforming to `Sequence` is pretty subtle.  

Ultimately I think having a really good name (`RangeExpression`) that implies 
that the usage site determines the meaning is super important to making this 
work.  Kudos to whoever came up with that name!
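
A short sketch of "the usage site determines the meaning", using the `RangeExpression` API as it later shipped in the standard library (`relative(to:)` is the real method; the variable names are just for illustration):

```swift
let letters = ["a", "b", "c", "d"]

// Resolved against a collection, `1...` borrows the collection's endIndex.
let resolved = (1...).relative(to: letters)

// Used as a sequence, the same expression simply keeps counting;
// zip stops when `letters` is exhausted.
let numbered = zip(1..., letters).map { "\($0). \($1)" }

// An upper-bounded expression borrows startIndex instead.
let head = (..<2).relative(to: letters)
```

Neither use requires the range itself to take a position on what its upper bound "really" is.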

> 
>>> The whole point of the name RangeExpression is to acknowledge this
>>> truth: ranges in Swift are bits of syntax whose meaning is given partly
>>> by how they are used.
>> 
>> This makes sense and is roughly the position I started with.  I should
>> have read through the archives of this thread more before jumping into
>> the middle of the discussion - I was missing some important context.
>> I apologize for not doing so.
> 
> No apology needed.  I really appreciate how everyone on this list works
> hard to discover the underlying truth in Swift's design.  Without that
> kind of engagement, Swift would be far worse off.
> 
> Thanks, everybody

Agree.  And I really appreciate the willingness of the core team to spend 
valuable time engaging with the community and helping to illuminate the design. 
 You have all spent significantly more time thinking about the “underlying 
truth” of the design than the rest of us!  

> 
> -- 
> -Dave
> 


Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Dave Abrahams via swift-evolution

on Thu Feb 02 2017, Jonathan Hull  wrote:

> I really like the IncompleteRange/RangeExpression idea!
>
> I don’t think that IncompleteRange/RangeExpression should, by themselves,
> conform to Sequence. It seems like necessary information is missing there.
> Instead, there needs to be a conditional conformance to Sequence based on
> another protocol that provides the natural bounds for the Bound type.
>
> For example, what if we have another protocol:
>
>   protocol FiniteComparable : Comparable {
>       // Any finite set which is comparable will have a lowest value
>       // and a highest value...
>       static var lowestValue: Self { get }
>       static var highestValue: Self { get }
>   }
>
> Something like UInt would have a lowestValue of 0 and highestValue of
> (UInt.max -1).  Then you could conditionally conform a RangeExpression
> where the Bounds are FiniteComparable to Sequence.
>
> Now the behavior is consistent.  In the case of ‘array[0…]’ the array is
> providing the missing upper bound.  In the case of ‘for n in 0…’ the
> conformance to FiniteComparable is providing the bound (and it doesn’t
> trap, because it is enumerating all values IN that type above the lower
> bound).
>
>   for n in UInt8(0)… {/* Will get called for every possible value of 
> UInt8 */}
>
> I agree that trapping when an infinite sequence of integers goes past the
> max value is the only reasonable thing to do in that situation... but since
> we get to define the bounds (and we have defined them in other cases to be
> the largest usable value), why not define them the same way here (i.e. not
> infinite).  In this case, I don’t see the added value in making the
> sequence infinite instead of just bounded by what the type can represent.
> The only thing it seems it adds is the trapping behavior.  With the natural
> bound, you can use things like filter on partially defined ranges (which
> would trap if they are defined as infinite):
>
>   // returns an array of all the odd UInt8
>   let odd:[UInt8] = (0…).filter({$0 & 1 != 0})

Why store them?  This is all the odd integers, and it traps only when
you exceed the machine's ability to express the values.

let odd = (0...).lazy.filter({$0 & 1 != 0}) 

If you really want a range that is limited to the Int8s, just use
Int8.max as your upper bound on a closed range.  That's exactly what
closed ranges and .max static properties are for. It's much better to
explicitly say, “I expect this range to stop” than for us to potentially
hide bugs by silently bounding the range at a value that depends on type
context.
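
A small sketch of the two alternatives described above, using ranges as they later shipped. The names `oddBytes` and `firstFiveOdd` are illustrative only.

```swift
// Explicitly bounded: a closed range up to UInt8.max visits every
// UInt8 value and stops, with no trap at the top of the type.
let oddBytes = (UInt8(0)...UInt8.max).filter { $0 & 1 != 0 }

// Unbounded and lazy: nothing is stored, and only as many values are
// produced as the consumer asks for.
let firstFiveOdd = Array((0...).lazy.filter { $0 & 1 != 0 }.prefix(5))
```

The closed range states the stopping point in the code itself, which is the "I expect this range to stop" signal argued for above.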

-- 
-Dave



Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Dave Abrahams via swift-evolution

on Thu Feb 02 2017, Matthew Johnson  wrote:

>> On Feb 2, 2017, at 9:45 AM, Dave Abrahams 
> wrote:
>> 
>> 
>> 
>> 
>
>> 
>> On Feb 2, 2017, at 7:11 AM, Matthew Johnson wrote:
>> 
>>>>> 
>>>>> Furthermore, we emphatically do *not* need to make the
>>>>> distinction you claim between “infinite” and “incomplete” ranges,
>>>>> which *is* needless hairsplitting.
>>>> 
>>>> Strongly disagree, unless you can describe the semantics of the type 
>>>> WITHOUT giving it different semantics depending on how it is used.
>>> 
>>> This is the point that convinced me.  I’m going to take a closer
>>> look at Brent’s `RangeExpression` design which I must admit I only
>>> skimmed in the earlier discussion.
>> 
>> We already have exactly this situation with CountableRange (which
>> will merge with Range when conditional conformances land).  When
>> used as a Collection, it means "every index value starting with the
>> lowerBound and ending just before the upperBound".  When used for
>> slicing, it means, roughly, "take every one of the collection's
>> indices that are in bounds.”
>
> I don’t see how the behavior of the following code means roughly “take
> every one of the collection’s indices that are in bounds”.  Can you
> elaborate?
>
> let range = 0..<20
> let array = [1, 2, 3]
> let slice = array[range] // trap on index out of bounds

“Roughly” means “I'm leaving out the part that it's a precondition that
any (explicit) bound must be a valid index in the collection.”  

>> These are not the same thing.  A collection's indices need not
>> include every expressible value of the Index type between startIndex
>> and endIndex.
>
> Sure, but it does appear as if the behavior of slicing assumes that
> the upper and lower bounds of the range provided are valid indices.

Properly stated, it's a precondition.  This is a design decision.  We could
have made slicing lenient, like Python's:

 >>> [1, 2][50]
 
 >>> [1, 2][5:10]
 []

We thought that in Swift it was more important to catch potential errors
than to silently accept

[1, 2][5..<10]

we still have the option to loosen slicing so it works that way, but I
am not at all convinced it would be an improvement.

> Xiaodi had convinced me that a one-sided range must take a position on
> whether or not it has an infinite upper bound in order to make sense,
> but now you’ve changed my mind back.

Phew!  To be clear:

  A one-sided range is *partial*.  It *has no* upper bound.

>> The whole point of the name RangeExpression is to acknowledge this
>> truth: ranges in Swift are bits of syntax whose meaning is given partly
>> by how they are used.
>
> This makes sense and is roughly the position I started with.  I should
> have read through the archives of this thread more before jumping into
> the middle of the discussion - I was missing some important context.
> I apologize for not doing so.

No apology needed.  I really appreciate how everyone on this list works
hard to discover the underlying truth in Swift's design.  Without that
kind of engagement, Swift would be far worse off.

Thanks, everybody

-- 
-Dave



Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Erica Sadun via swift-evolution

> On Feb 2, 2017, at 8:58 AM, Jonathan Hull via swift-evolution 
>  wrote:
> 
> Just out of curiosity, what are the use-cases for an infinite sequence (as 
> opposed to a sequence which is bounded to the type’s representable values)?
> 
> Thanks,
> Jon

Now that drop(while:) and prefix(while:) have dropped, they're a lot nicer than 
having to do sequence(first: 1, next: { $0 + 1 })
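
For comparison, a sketch of the two spellings side by side. The one-sided range form uses the syntax as it later shipped; the cutoff of 5 is arbitrary.

```swift
// Generating successive integers with sequence(first:next:)...
let viaSequence = Array(
    sequence(first: 1, next: { $0 + 1 }).prefix(while: { $0 <= 5 })
)

// ...versus a one-sided range, which reads the same way with less ceremony.
let viaRange = Array((1...).prefix(while: { $0 <= 5 }))
```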

-- E




Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Dave Abrahams via swift-evolution

on Thu Feb 02 2017, Nevin Brackett-Rozinsky  wrote:

> So, not to be “that guy”, but does anyone recall what the status of the
> actual *String* revamp for Swift 4 currently is?
>
> There was a lot of good discussion on the matter, and I want to make sure
> it hasn’t gotten lost or dropped.

Thanks for bringing the thread back to its focus!  Nothing's been
dropped.  Part of the reason we published the manifesto was so that we
could gather feedback from the community.  The whole discussion is
archived, and we've been taking notes.

I would be very interested in any further discussion of Strings, if
there's anything left to discuss.

-- 
-Dave



Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Matthew Johnson via swift-evolution

> On Feb 2, 2017, at 9:45 AM, Dave Abrahams  wrote:
> 
> 
> 
> Sent from my iPad
> 
> On Feb 2, 2017, at 7:11 AM, Matthew Johnson wrote:
> 
>>>> 
>>>> Furthermore, we emphatically do *not* need to make the distinction you 
>>>> claim between “infinite” and “incomplete” ranges, which *is* needless 
>>>> hairsplitting.
>>> 
>>> Strongly disagree, unless you can describe the semantics of the type 
>>> WITHOUT giving it different semantics depending on how it is used.
>> 
>> This is the point that convinced me.  I’m going to take a closer look at 
>> Brent’s `RangeExpression` design which I must admit I only skimmed in the 
>> earlier discussion.
> 
> We already have exactly this situation with CountableRange (which will merge 
> with Range when conditional conformances land).  When used as a Collection, 
> it means "every index value starting with the lowerBound and ending just 
> before the upperBound".  When used for slicing, it means, roughly, "take 
> every one of the collection's indices that are in bounds.”  

I don’t see how the behavior of the following code means roughly “take every 
one of the collection’s indices that are in bounds”.  Can you elaborate?

let range = 0..<20
let array = [1, 2, 3]
let slice = array[range] // trap on index out of bounds

> These are not the same thing.  A collection's indices need not include every 
> expressible value of the Index type between startIndex and endIndex.

Sure, but it does appear as if the behavior of slicing assumes that the upper 
and lower bounds of the range provided are valid indices.  

Xiaodi had convinced me that a one-sided range must take a position on whether 
or not it has an infinite upper bound in order to make sense, but now you’ve 
changed my mind back.

> 
> The whole point of the name RangeExpression is to acknowledge this truth: 
> ranges in Swift are bits of syntax whose meaning is given partly by how they are 
> used.  

This makes sense and is roughly the position I started with.  I should have 
read through the archives of this thread more before jumping into the middle of 
the discussion - I was missing some important context.  I apologize for not 
doing so.

The name `RangeExpression` does a good job of indicating why it’s ok for 
instances to sometimes behave as if they have an infinite upper bound and other 
times not depending on context.


> In fact, now that I say it, in that respect ranges are not all that different 
> from any other type: the meaning of a Double or an Array or a Bool is also 
> interpreted by the methods to which it is passed, and can have completely 
> different results depending on that context.
> 
> chillaxing-ly y'rs,
> 
> Dave



Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Nevin Brackett-Rozinsky via swift-evolution
So, not to be “that guy”, but does anyone recall what the status of the
actual *String* revamp for Swift 4 currently is?

There was a lot of good discussion on the matter, and I want to make sure
it hasn’t gotten lost or dropped.

Nevin


Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Jonathan Hull via swift-evolution
Just out of curiosity, what are the use-cases for an infinite sequence (as 
opposed to a sequence which is bounded to the type’s representable values)?

Thanks,
Jon

> On Feb 2, 2017, at 7:52 AM, Dave Abrahams via swift-evolution 
>  wrote:
> 
> 
> 
> Sent from my iPad
> 
> On Feb 2, 2017, at 7:32 AM, Matthew Johnson wrote:
> 
>> 
>>> On Feb 2, 2017, at 6:07 AM, Brent Royal-Gordon via swift-evolution wrote:
>>> 
>>>> On Feb 2, 2017, at 3:06 AM, Jaden Geller via swift-evolution wrote:
>>>> 
>>>> It's not infinite (else subscript would trap)
>>> 
>>> I'm not necessarily a fan of the idea of allowing you to iterate over an 
>>> `IncompleteRange`, but I have to ask: What do you imagine an infinite 
>>> sequence of integers would do when it tried to go past the `max` value? As 
>>> far as I can tell, trapping is the *only* sensible possibility.
>> 
>> I don’t think anyone is disputing this right now.  The discussion is whether 
>> `IncompleteRange` and `InfiniteRange` are distinct concepts which should be 
>> modeled or whether they can be adequately represented by a single type.
>> 
>> In order to iterate a range you must know both bounds (even if one is 
>> infinite).  When we have a one-sided range with a bound that is countable 
>> and we allow it to conform to Sequence we are implicitly acknowledging it is 
>> an infinite range rather than an “incomplete” range.
>> 
>> If you have a range with an infinite upper bound (i.e. a one-sided range 
>> with a countable Bound) and apply the usual semantics of a collection 
>> subscript operation the result would necessarily trap because the upper 
>> bound is out of bounds.
>> 
>> We obviously don’t want this behavior.  Instead we want the upper bound to 
>> be clamped to the index preceding `endIndex` as a part of the subscript 
>> operation. For an infinite range this is equivalent to a very special case 
>> clamping behavior.  Special case clamping behavior like this is questionable 
>> on its own, and especially questionable if we ever add `InfiniteCollection` 
>> (which Ben mentioned in a footnote to his post) where subscripting with an 
>> infinite range would be a perfectly valid operation that produces an 
>> infinite slice.
>> 
>> If instead, we have a distinct type for `IncompleteRange` we don’t need a 
>> subscript overload with this kind of special case behavior.  There would not 
>> be a subscript that accepts `InfiniteRange` at all (for now - if we add 
>> `InfiniteCollection` later it probably *would* have one).  Instead, we would 
>> have a subscript that accepts `IncompleteRange` with the obvious semantics 
>> of filling in the missing bound with the last valid index (or `startIndex` 
>> if we also support incomplete ranges that only specify an upper bound).
> 
> The difference between Range and CountableRange (which I'm desperate to 
> eliminate using conditional conformances) has already been a source of deep 
> frustration for many users.  From a pure usability standpoint the idea of 
> creating more distinctions in the type system between similar ranges is 
> unfathomable to me.  Doing so on grounds like those described above seems 
> like it would represent a blatant case of theoretical purity winning out over 
> practical considerations, which runs counter to the spirit of Swift.
> 
> --Dave


Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Dave Abrahams via swift-evolution


Sent from my iPad

> On Feb 2, 2017, at 7:32 AM, Matthew Johnson  wrote:
> 
> 
>>> On Feb 2, 2017, at 6:07 AM, Brent Royal-Gordon via swift-evolution 
>>>  wrote:
>>> 
>>> On Feb 2, 2017, at 3:06 AM, Jaden Geller via swift-evolution 
>>>  wrote:
>>> 
>>> It's not infinite (else subscript would trap)
>> 
>> I'm not necessarily a fan of the idea of allowing you to iterate over an 
>> `IncompleteRange`, but I have to ask: What do you imagine an infinite 
>> sequence of integers would do when it tried to go past the `max` value? As 
>> far as I can tell, trapping is the *only* sensible possibility.
> 
> I don’t think anyone is disputing this right now.  The discussion is whether 
> `IncompleteRange` and `InfiniteRange` are distinct concepts which should be 
> modeled or whether they can be adequately represented by a single type.
> 
> In order to iterate a range you must know both bounds (even if one is 
> infinite).  When we have a one-sided range with a bound that is countable and 
> we allow it to conform to Sequence we are implicitly acknowledging it is an 
> infinite range rather than an “incomplete” range.
> 
> If you have a range with an infinite upper bound (i.e. a one-sided range with 
> a countable Bound) and apply the usual semantics of a collection subscript 
> operation the result would necessarily trap because the upper bound is out of 
> bounds.
> 
> We obviously don’t want this behavior.  Instead we want the upper bound to be 
> clamped to the index preceding `endIndex` as a part of the subscript 
> operation. For an infinite range this is equivalent to a very special case 
> clamping behavior.  Special case clamping behavior like this is questionable 
> on its own, and especially questionable if we ever add `InfiniteCollection` 
> (which Ben mentioned in a footnote to his post) where subscripting with an 
> infinite range would be a perfectly valid operation that produces an infinite 
> slice.
> 
> If instead, we have a distinct type for `IncompleteRange` we don’t need a 
> subscript overload with this kind of special case behavior.  There would not 
> be a subscript that accepts `InfiniteRange` at all (for now - if we add 
> `InfiniteCollection` later it probably *would* have one).  Instead, we would 
> have a subscript that accepts `IncompleteRange` with the obvious semantics of 
> filling in the missing bound with the last valid index (or `startIndex` if we 
> also support incomplete ranges that only specify an upper bound).

The difference between Range and CountableRange (which I'm desperate to 
eliminate using conditional conformances) has already been a source of deep 
frustration for many users.  From a pure usability standpoint the idea of 
creating more distinctions in the type system between similar ranges is 
unfathomable to me.  Doing so on grounds like those described above seems like 
it would represent a blatant case of theoretical purity winning out over 
practical considerations, which runs counter to the spirit of Swift.

--Dave


Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Dave Abrahams via swift-evolution


Sent from my iPad

> On Feb 2, 2017, at 7:11 AM, Matthew Johnson  wrote:
> 
>>> 
>>> 
>>> Furthermore, we emphatically do *not* need to make the distinction you 
>>> claim between “infinite” and “incomplete” ranges, which *is* needless 
>>> hairsplitting.
>> 
>> Strongly disagree, unless you can describe the semantics of the type WITHOUT 
>> giving it different semantics depending on how it is used.
> 
> This is the point that convinced me.  I’m going to take a closer look at 
> Brent’s `RangeExpression` design which I must admit I only skimmed in the 
> earlier discussion.

We already have exactly this situation with CountableRange (which will merge 
with Range when conditional conformances land).  When used as a Collection, it 
means "every index value starting with the lowerBound and ending just before 
the upperBound".  When used for slicing, it means, roughly, "take every one of 
the collection's indices that are in bounds."  These are not the same thing.  A 
collection's indices need not include every expressible value of the Index type 
between startIndex and endIndex.
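
One way to see the difference, sketched with a lazily filtered range (my choice of example, not one from the thread). Its `Index` type is `Int`, but only some of the `Int` values between `startIndex` and `endIndex` are valid indices:

```swift
// As a Collection, the range 0..<10 contains every Int from 0 through 9.
let everyValue = Array(0..<10)

// A lazily filtered view shares the same Index type (Int), but its
// valid indices are only the positions that pass the predicate.
let evens = (0..<10).lazy.filter { $0 % 2 == 0 }
let validIndices = Array(evens.indices)
```

So a `Range<Int>` handed to `evens` for slicing cannot mean "every Int between the bounds"; it can only pick out whichever of the collection's own indices fall inside it.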

The whole point of the name RangeExpression is to acknowledge this truth: 
ranges in Swift are bits of syntax whose meaning is given partly by how they are 
used.  In fact, now that I say it, in that respect ranges are not all that 
different from any other type: the meaning of a Double or an Array or a Bool 
is also interpreted by the methods to which it is passed, and can have 
completely different results depending on that context.

chillaxing-ly y'rs,

Dave


Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Matthew Johnson via swift-evolution

> On Feb 2, 2017, at 6:07 AM, Brent Royal-Gordon via swift-evolution 
>  wrote:
> 
>> On Feb 2, 2017, at 3:06 AM, Jaden Geller via swift-evolution 
>>  wrote:
>> 
>> It's not infinite (else subscript would trap)
> 
> I'm not necessarily a fan of the idea of allowing you to iterate over an 
> `IncompleteRange`, but I have to ask: What do you imagine an infinite 
> sequence of integers would do when it tried to go past the `max` value? As 
> far as I can tell, trapping is the *only* sensible possibility.

I don’t think anyone is disputing this right now.  The discussion is whether 
`IncompleteRange` and `InfiniteRange` are distinct concepts which should be 
modeled or whether they can be adequately represented by a single type.

In order to iterate a range you must know both bounds (even if one is 
infinite).  When we have a one-sided range with a bound that is countable and 
we allow it to conform to Sequence we are implicitly acknowledging it is an 
infinite range rather than an “incomplete” range.

If you have a range with an infinite upper bound (i.e. a one-sided range with a 
countable Bound) and apply the usual semantics of a collection subscript 
operation the result would necessarily trap because the upper bound is out of 
bounds.

We obviously don’t want this behavior.  Instead we want the upper bound to be 
clamped to the index preceding `endIndex` as a part of the subscript operation. 
For an infinite range this is equivalent to a very special case clamping 
behavior.  Special case clamping behavior like this is questionable on its own, 
and especially questionable if we ever add `InfiniteCollection` (which Ben 
mentioned in a footnote to his post) where subscripting with an infinite range 
would be a perfectly valid operation that produces an infinite slice.

If instead, we have a distinct type for `IncompleteRange` we don’t need a 
subscript overload with this kind of special case behavior.  There would not be 
a subscript that accepts `InfiniteRange` at all (for now - if we add 
`InfiniteCollection` later it probably *would* have one).  Instead, we would 
have a subscript that accepts `IncompleteRange` with the obvious semantics of 
filling in the missing bound with the last valid index (or `startIndex` if we 
also support incomplete ranges that only specify an upper bound).
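
To make the distinction concrete, here is a minimal sketch of the two concepts 
as separate types. Both type names and all of their members are hypothetical, 
chosen only for illustration; neither is standard library API. The infinite 
range is a Sequence, while the incomplete range only acquires meaning when a 
collection fills in its missing bound.

```swift
// Hypothetical: a range that really is unbounded above, so it can be iterated.
struct InfiniteRange<Bound: Strideable>: Sequence, IteratorProtocol
    where Bound.Stride: SignedInteger {
    var current: Bound

    mutating func next() -> Bound? {
        // Hand back the current value and step forward by one.
        // For a fixed-width integer, advancing past `max` traps.
        defer { current = current.advanced(by: 1) }
        return current
    }
}

// Hypothetical: a range whose upper bound is missing. It is deliberately
// not a Sequence; a collection has to complete it.
struct IncompleteRange<Bound: Comparable> {
    let lowerBound: Bound
}

extension Collection {
    // The collection supplies the missing bound from its own `endIndex`.
    subscript(range: IncompleteRange<Index>) -> SubSequence {
        return self[range.lowerBound..<endIndex]
    }
}

let firstThree = Array(InfiniteRange(current: 5).prefix(3))
let tail = ["a", "b", "c"][IncompleteRange(lowerBound: 1)]
```

With this split, no subscript needs special-case clamping: the subscript simply 
does not accept `InfiniteRange`.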

> 
> (If you used a `BigInt` type, the sequence could of course then be infinite, 
> or as infinite as memory allows.)
> 
> -- 
> Brent Royal-Gordon
> Architechies
> 


Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Jonathan Hull via swift-evolution
I really like the IncompleteRange/RangeExpression idea!

I don’t think that IncompleteRange/RangeExpression should, by themselves, 
conform to Sequence. It seems like necessary information is missing there.  
Instead, there needs to be a conditional conformance to Sequence based on 
another protocol that provides the natural bounds for the Bound type.

For example, what if we have another protocol:

    protocol FiniteComparable : Comparable {
        // Any finite set which is comparable will have a lowest value
        // and a highest value...
        static var lowestValue: Self { get }
        static var highestValue: Self { get }
    }

Something like UInt would have a lowestValue of 0 and highestValue of UInt.max.  
Then you could conditionally conform a RangeExpression where the Bounds 
are FiniteComparable to Sequence.

Now the behavior is consistent.  In the case of ‘array[0…]’ the array is 
providing the missing upper bound.  In the case of ‘for n in 0…’ the 
conformance to FiniteComparable is providing the bound (and it doesn’t trap, 
because it is enumerating all values IN that type above the lower bound).

    for n in UInt8(0)... { /* Will get called for every possible value of UInt8 */ }


I agree that trapping when an infinite sequence of integers goes past the max 
value is the only reasonable thing to do in that situation... but since we get 
to define the bounds (and we have defined them in other cases to be the largest 
usable value), why not define them the same way here (i.e. not infinite).  In 
this case, I don’t see the added value in making the sequence infinite instead 
of just bounded by what the type can represent.  The only thing it seems it 
adds is the trapping behavior.  With the natural bound, you can use things like 
filter on partially defined ranges (which would trap if they are defined as 
infinite):

    let odd: [UInt8] = (0...).filter({ $0 & 1 != 0 }) // returns an array of all the odd UInt8

In cases where the bound type doesn’t conform to FiniteComparable, it could 
still be used as a RangeExpression, but not as a sequence.
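
As a rough sketch of how that could fit together: the protocol below is the one 
proposed above, while `LowerBoundedRange` is a hypothetical stand-in for 
whatever type `0...` would produce. The iterator stops after `highestValue` 
instead of advancing past it, so nothing traps.

```swift
protocol FiniteComparable: Comparable {
    static var lowestValue: Self { get }
    static var highestValue: Self { get }
}

extension UInt8: FiniteComparable {
    static var lowestValue: UInt8 { return .min }
    static var highestValue: UInt8 { return .max }
}

// Hypothetical stand-in for a one-sided range such as `0...`.
struct LowerBoundedRange<Bound: FiniteComparable & Strideable>: Sequence
    where Bound.Stride: SignedInteger {
    let lowerBound: Bound

    func makeIterator() -> AnyIterator<Bound> {
        var upcoming: Bound? = lowerBound
        return AnyIterator {
            guard let value = upcoming else { return nil }
            // Stop after the highest value rather than stepping beyond it.
            upcoming = (value == Bound.highestValue) ? nil : value.advanced(by: 1)
            return value
        }
    }
}

// Every odd UInt8, collected without overflow.
let odd = LowerBoundedRange(lowerBound: UInt8(0)).filter { $0 & 1 != 0 }
```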

Thanks,
Jon


> On Feb 2, 2017, at 4:07 AM, Brent Royal-Gordon via swift-evolution 
>  wrote:
> 
>> On Feb 2, 2017, at 3:06 AM, Jaden Geller via swift-evolution 
>>  wrote:
>> 
>> It's not infinite (else subscript would trap)
> 
> I'm not necessarily a fan of the idea of allowing you to iterate over an 
> `IncompleteRange`, but I have to ask: What do you imagine an infinite 
> sequence of integers would do when it tried to go past the `max` value? As 
> far as I can tell, trapping is the *only* sensible possibility.
> 
> (If you used a `BigInt` type, the sequence could of course then be infinite, 
> or as infinite as memory allows.)
> 
> -- 
> Brent Royal-Gordon
> Architechies
> 


Re: [swift-evolution] Strings in Swift 4

2017-02-02 Thread Brent Royal-Gordon via swift-evolution
> On Feb 2, 2017, at 3:06 AM, Jaden Geller via swift-evolution 
>  wrote:
> 
> It's not infinite (else subscript would trap)

I'm not necessarily a fan of the idea of allowing you to iterate over an 
`IncompleteRange`, but I have to ask: What do you imagine an infinite sequence 
of integers would do when it tried to go past the `max` value? As far as I can 
tell, trapping is the *only* sensible possibility.

(If you used a `BigInt` type, the sequence could of course then be infinite, or 
as infinite as memory allows.)

-- 
Brent Royal-Gordon
Architechies



Re: [swift-evolution] Strings in Swift 4

2017-02-01 Thread Brent Royal-Gordon via swift-evolution
> On Feb 1, 2017, at 2:30 PM, Nate Cook via swift-evolution 
>  wrote:
> 
> With a lot of these new features, it helps me greatly to see them in action. 
> I've built a poor man's version of these incomplete ranges in a Swift Sandbox 
> here:
>   http://swiftlang.ng.bluemix.net/#/repl/58925f5d42b65e6dce9a5bea
> 
> This implementation suffers greatly from a lack of generic subscripts, and 
> the type names are terrible and not at all suggestions. Otherwise, as far as 
> I can tell, the behavior of the one-sided ranges correctly matches what Ben 
> is describing here — if you're unsure about how this will look and behave in 
> practice, please take a look.

I actually have a pull request with a partially finished implementation: 


Unfortunately, that pull request is full of GYB garbage, rips out a lot of 
code, and doesn't merge cleanly into the current master, so I've extracted much 
of it and adapted it into a playground that people can try out. 

 

The actual implementation is in the "Sources" folder. The most important 
difference from your prototype is that the types are designed quite 
differently. The operators create instances of types which, at their core, look 
like these:

    public struct IncompleteRange<Bound: Comparable> {
        public let lowerBound: Bound?
        public let upperBound: Bound?
    }
    public struct IncompleteClosedRange<Bound: Comparable> {
        public let lowerBound: Bound?
        public let upperBound: Bound?
    }

Thus, while the normal case is that only one bound will be filled in, you could 
create an `IncompleteRange` with both bounds or neither bound filled in.

This prototype is based on a slightly earlier design, so it has a few 
differences from the most popular designs we've discussed:

1.  An `IncompleteRange` with a missing upper bound is spelled `i..<`, an 
`IncompleteClosedRange` with a missing upper bound is `i...`. That ends up 
meaning that `ary[i...]` will almost certainly crash, and you need to use 
`ary[i..<]`.

2.  This also includes infix `..<` and `...` operators, which vary from the 
conventional ones because they accept optional bounds. That allows for dynamic 
construction.

3.  The types include `completed(by:)` methods which take ranges to fill 
their bounds from. In other words, these are truly modeled as *incomplete* 
ranges that you're expected to fill in, not *unbounded* ranges that are 
infinite.

4.  This includes a number of workarounds and hacks we'd like to ultimately 
get rid of.

5.  This does *not* include `contains(_:)` or `~=` implementations.

Obviously these points can be debated and corrected; you could even pop into 
the "Sources" directory and fix them yourselves if you want to try them out.
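
For readers who want a feel for point 3 without opening the playground, here is 
a minimal sketch. The stored properties follow the struct shown earlier; the 
exact signature of `completed(by:)` is an assumption made for this example and 
may differ from the prototype.

```swift
struct IncompleteRange<Bound: Comparable> {
    let lowerBound: Bound?
    let upperBound: Bound?

    // Assumed signature: borrow whichever bounds are missing from `range`.
    func completed(by range: Range<Bound>) -> Range<Bound> {
        let lower = lowerBound ?? range.lowerBound
        let upper = upperBound ?? range.upperBound
        return lower..<upper
    }
}

let letters = ["a", "b", "c", "d", "e"]
let whole = letters.startIndex..<letters.endIndex

// Only the lower bound is given; the upper bound comes from `whole`.
let fromTwo = IncompleteRange<Int>(lowerBound: 2, upperBound: nil)
    .completed(by: whole)

// Neither bound is given; the result is simply `whole`.
let neither = IncompleteRange<Int>(lowerBound: nil, upperBound: nil)
    .completed(by: whole)
```

Because both bounds are optional, the same type covers the one-sided cases and 
the "both" or "neither" cases mentioned above.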

Hope this helps,
-- 
Brent Royal-Gordon
Architechies



Re: [swift-evolution] Strings in Swift 4

2017-02-01 Thread Nate Cook via swift-evolution
With a lot of these new features, it helps me greatly to see them in action. 
I've built a poor man's version of these incomplete ranges in a Swift Sandbox 
here:
http://swiftlang.ng.bluemix.net/#/repl/58925f5d42b65e6dce9a5bea

This implementation suffers greatly from a lack of generic subscripts, and the 
type names are terrible and not at all suggestions. Otherwise, as far as I can 
tell, the behavior of the one-sided ranges correctly matches what Ben is 
describing here — if you're unsure about how this will look and behave in 
practice, please take a look.

Nate


> On Feb 1, 2017, at 10:37 AM, Ben Cohen via swift-evolution 
>  wrote:
> 
> 
> I think Dave has already made these points separately but it probably helps 
> to recap more explicitly the behavior we expect from “x…"
> 
> Names are just for discussion purposes, exact naming can be left as a 
> separate discussion from functionality:
> 
> - Informally, one-sided ranges are a thing. Formally, there are lower-bounded 
> one-sided ranges, which are created with a postfix ... operator. For now, 
> let’s just fully understand them and come back to upper-bounded ranges later.
> - Collections will have a subscript that takes a one-sided range and returns 
> a SubSequence. The behavior of that subscript is that the collection "fills 
> in" the “missing” side with its upper/lower bound and uses that two-sided 
> range to return a slice.*
> - When the Bound type of a lower-bounded range is countable, it will conform 
> to Sequence.** The behavior of that sequence’s iterator is that it starts at 
> the lower bound, and increments indefinitely.
> - One-sided ranges should have ~= defined for use with switch statements, 
> where any value above the bound for a lower-bounded range would return true.
> 
> * implementation detail: collections would probably have a generic subscript 
> taking a type conforming to RangeExpression, and that protocol would have a 
> helper extension for filling in the missing range given a collection
> **one-sided ranges ought in fact to be infinite Collections… a concept that 
> needs a separate thread
> 
> With that defined:
> 
>> On Feb 1, 2017, at 5:02 AM, Xiaodi Wu via swift-evolution 
>> > wrote:
>> 
>> I entirely agree with you on the desired behavior of `zip(...)`.
>> 
>> However, if you insist on 0... being notionally an infinite range, then you 
>> would have to insist on `for i in 0...` also trapping. Which is not a big 
>> deal, IMO, but will surely make the anti-trapping crowd upset.
>> 
> 
> Certainly, for i in 0… would trap once the iterator needs to increment past 
> Int.max. If you break out of the loop before this happens, it won’t trap. The 
> statement is the moral equivalent of a C-style for(i = 0; /*nothing*/ ; ++i).
> 
> If the anti-trapping crowd are upset, they should be upset with Int, not this 
> range type. The range has no opinion on trapping – its iterator just 
> increments its Bounds type when asked to.
> 
>> The bigger issue is that either you must have a lenient/clamping subscript 
>> for `arr[0...]` or it too must trap, which is not desired. 
> 
> Based on the definition I give above, arr[0…] means “from 0 up to the 
> endIndex of arr”. This will not trap.
> 
> (there is a legitimate quibble here that this will translate into 
> arr[0.. invalid. But 0..< is ugly so we should ignore this quibble for the sake of 
> aesthetics)
> 
>> However, if `arr[0...]` is clamping, then `[1, 2, 3][100...]` would not trap 
>> and instead give you `[]`.
>> 
> 
> It should trap, because 100.. slicing this array.
> 
>> If 0... is regarded as an incomplete range, your example of `zip(...)` could 
>> still trap as desired. It would trap on the notional attempt to assign 
>> someArray.count to IncompleteRange.inferredUpperBound if count exceeds 
>> T.max. 
> 
> It’s unclear to me whether you are trying to formally define 
> .inferredUpperBound as a property you expect to exist, or if you’re using it 
> as informal shorthand. But there is no implied upper bound to a one-sided 
> lower-bounded range, only an actual lower bound. Operations taking 
> lower-bounded ranges as arguments can infer their own upper bound from 
> context, or not, depending on the functionality they need.
> 
> As an example of an alternative inferred upper bound: suppose you want to 
> define your own ordering for ranges, for sorting/display purposes. You decide 
> on a lexicographic ordering first by lower then upper bound. You want 
> one-sided ranges to fit into this ordering. So you infer the upper bound to 
> be infinite, for sorting purposes, so: 0…5 < 0… < 1…4 < 1…5 < 1…. This does 
> not mean the upper bound is implied to be infinite, just that the sorting 
> predicate infers it to be, for the purpose of sorting.
> 
>> With such semantics for 0..., [1, 2, 3][0...] would 

Re: [swift-evolution] Strings in Swift 4

2017-02-01 Thread Ben Cohen via swift-evolution

I think Dave has already made these points separately but it probably helps to 
recap more explicitly the behavior we expect from “x…"

Names are just for discussion purposes, exact naming can be left as a separate 
discussion from functionality:

- Informally, one-sided ranges are a thing. Formally, there are lower-bounded 
one-sided ranges, which are created with a postfix ... operator. For now, let’s 
just fully understand them and come back to upper-bounded ranges later.
- Collections will have a subscript that takes a one-sided range and returns a 
SubSequence. The behavior of that subscript is that the collection "fills in" 
the “missing” side with its upper/lower bound and uses that two-sided range to 
return a slice.*
- When the Bound type of a lower-bounded range is countable, it will conform to 
Sequence.** The behavior of that sequence’s iterator is that it starts at the 
lower bound, and increments indefinitely.
- One-sided ranges should have ~= defined for use with switch statements, where 
any value above the bound for a lower-bounded range would return true.

* implementation detail: collections would probably have a generic subscript 
taking a type conforming to RangeExpression, and that protocol would have a 
helper extension for filling in the missing range given a collection
**one-sided ranges ought in fact to be infinite Collections… a concept that 
needs a separate thread
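
On Swift 4 or later, where one-sided ranges are part of the standard library, 
the `~=` behavior from the last bullet can be tried directly. The function name 
`classify` is made up for this example.

```swift
func classify(_ n: Int) -> String {
    switch n {
    case 100...:
        // Matches any value at or above the lower bound.
        return "large"
    case 0...:
        return "small"
    default:
        return "negative"
    }
}

// The same test written as a plain pattern match.
let matches = (100...) ~= 150
```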

With that defined:

> On Feb 1, 2017, at 5:02 AM, Xiaodi Wu via swift-evolution 
>  wrote:
> 
> I entirely agree with you on the desired behavior of `zip(...)`.
> 
> However, if you insist on 0... being notionally an infinite range, then you 
> would have to insist on `for i in 0...` also trapping. Which is not a big 
> deal, IMO, but will surely make the anti-trapping crowd upset.
> 

Certainly, for i in 0… would trap once the iterator needs to increment past 
Int.max. If you break out of the loop before this happens, it won’t trap. The 
statement is the moral equivalent of a C-style for(i = 0; /*nothing*/ ; ++i).

If the anti-trapping crowd are upset, they should be upset with Int, not this 
range type. The range has no opinion on trapping – its iterator just increments 
its Bounds type when asked to.
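
A small sketch of that behavior, assuming Swift 4 or later where the postfix 
`...` operator is available. Neither loop comes anywhere near Int.max, so 
neither traps.

```swift
// Leaves the loop explicitly, like a C-style loop with no condition.
var visited: [Int] = []
for i in 0... {
    if i == 3 { break }
    visited.append(i)
}

// zip stops as soon as the finite sequence runs out.
let labelled = Array(zip(1..., ["a", "b", "c"]))
```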

> The bigger issue is that either you must have a lenient/clamping subscript 
> for `arr[0...]` or it too must trap, which is not desired.

Based on the definition I give above, arr[0…] means “from 0 up to the endIndex 
of arr”. This will not trap.

(there is a legitimate quibble here that this will translate into 
arr[0.. invalid. But 0..< is ugly so we should ignore this quibble for the sake 
of aesthetics)

> However, if `arr[0...]` is clamping, then `[1, 2, 3][100...]` would not trap 
> and instead give you `[]`.
> 

It should trap, because 100.. slicing this array.

> If 0... is regarded as an incomplete range, your example of `zip(...)` could 
> still trap as desired. It would trap on the notional attempt to assign 
> someArray.count to IncompleteRange.inferredUpperBound if count exceeds 
> T.max.

It’s unclear to me whether you are trying to formally define 
.inferredUpperBound as a property you expect to exist, or if you’re using it as 
informal shorthand. But there is no implied upper bound to a one-sided 
lower-bounded range, only an actual lower bound. Operations taking 
lower-bounded ranges as arguments can infer their own upper bound from context, 
or not, depending on the functionality they need.

As an example of an alternative inferred upper bound: suppose you want to 
define your own ordering for ranges, for sorting/display purposes. You decide 
on a lexicographic ordering first by lower then upper bound. You want one-sided 
ranges to fit into this ordering. So you infer the upper bound to be infinite, 
for sorting purposes, so: 0…5 < 0… < 1…4 < 1…5 < 1…. This does not mean the 
upper bound is implied to be infinite, just that the sorting predicate infers 
it to be, for the purpose of sorting.
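
A sketch of such a predicate follows. `SortableRange` and `precedes` are 
hypothetical names; a missing upper bound is modeled as `nil` and treated as 
larger than any finite bound only inside the comparison.

```swift
// Hypothetical model: `upper == nil` stands for a one-sided range like `0...`.
struct SortableRange: Equatable {
    let lower: Int
    let upper: Int?
}

// Lexicographic order: lower bound first, then upper bound.
func precedes(_ a: SortableRange, _ b: SortableRange) -> Bool {
    if a.lower != b.lower { return a.lower < b.lower }
    switch (a.upper, b.upper) {
    case let (x?, y?): return x < y   // both finite
    case (_?, nil):    return true    // finite sorts before "infinite"
    default:           return false   // missing vs finite, or both missing
    }
}

let shuffled = [
    SortableRange(lower: 1, upper: nil),
    SortableRange(lower: 0, upper: nil),
    SortableRange(lower: 1, upper: 5),
    SortableRange(lower: 0, upper: 5),
    SortableRange(lower: 1, upper: 4),
]
let ordered = shuffled.sorted(by: precedes)
```

The type itself stores no upper bound; only the predicate decides to read 
"missing" as "infinite".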

> With such semantics for 0..., [1, 2, 3][0...] would behave as expected 
> without the need for leniency, but [1, 2, 3][100...] would trap as I assume 
> you'd expect.

[1,2,3][0…] is valid because the array has an index of 0, but [100…] isn’t 
because it doesn’t.
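
To illustrate where the line falls, here is a sketch assuming Swift 4 or later. 
The helper `slice(from:)` is hypothetical; it returns nil in the cases where the 
plain one-sided subscript would trap. In the standard library as it later 
shipped, a lower bound equal to `endIndex` is still accepted and yields an 
empty slice.

```swift
extension Array {
    // Hypothetical helper: nil instead of a trap for an out-of-range bound.
    func slice(from lower: Int) -> ArraySlice<Element>? {
        guard lower >= startIndex && lower <= endIndex else { return nil }
        return self[lower...]
    }
}

let numbers = [1, 2, 3]
let all = numbers.slice(from: 0)          // the whole array as a slice
let atEnd = numbers.slice(from: 3)        // empty, but valid
let outOfRange = numbers.slice(from: 100) // nil; numbers[100...] would trap
```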

It’s worth noting the difference with Python here. In Python, [][2:] is valid 
because [][2:n] is valid (both return []). In Swift, [][2…] should be invalid 
(trap) because [][2..

> However, it would make no sense to write `for i in 0...`.
> 

I don’t see how. But regardless, the definition of how countable lower-bounded 
ranges conform to Sequence is just something to be defined in some way that is 
useful and intuitive to users. 0… being an indefinitely increasing sequence 
seems like it’s intuitive to me, as does a[n…] being “a slice from n to the 
end”.

That definition does not need to be unearthed based on some underlying 
principles. Defining 

Re: [swift-evolution] Strings in Swift 4

2017-02-01 Thread Nevin Brackett-Rozinsky via swift-evolution
I had read the discussion live as it happened. And just now I went back to
see Jaden’s posts again: there are only a handful, and all are quite brief.
To the extent that they “sketched a direction”, I would say no. We should
instead base the proposal on ideas laid out by Dave Abrahams, Brent
Royal-Gordon, Ben Cohen and several others.

Furthermore, we emphatically do *not* need to make the distinction you
claim between “infinite” and “incomplete” ranges, which *is* needless
hairsplitting.

We (meaning Swift Evolution) can define any semantics we like for any
operator we like. The simple, elegant, intuitive behavior for one-sided
ranges is exactly the “do what I mean” approach as described by many people
including Dave Abrahams.

Want to subscript a collection using a one-sided range? Great! If the fixed
bound is inside the collection then go as far as possible (ie. from start
or to end), and if it is outside then trap for index out of bounds.

Want to zip together integers and sequence elements? Great! If the sequence
eventually ends then stop when it does, and if not then trap when the
integer overflows.

Want to iterate over a one-sided range? Well if the upper end is open then
great! “for i in 20...” will loop until it hits a “break” or trap on
overflow. We could probably even make “for i in (...20).reversed” work and
count down, though we don’t have to.

In any case, the point remains: if we do add one-sided ranges, we can
define any behavior we want for them. And given the opinionated nature of
Swift, it follows that we should choose to make them expressive, useful,
and enjoyable.

Nevin


On Wed, Feb 1, 2017 at 10:58 AM, Matthew Johnson 
wrote:

>
> > On Feb 1, 2017, at 9:52 AM, Nevin Brackett-Rozinsky <
> nevin.brackettrozin...@gmail.com> wrote:
> >
> > Drafting a proposal sounds like a good idea, to establish all the
> relevant information in one place. I don’t recall off the top of my head
> what directions Jaden sketched out, but as long as the proposal hits the
> high points of the uses and benefits, and summarizes the discussion and
> alternatives, it should be fine.
> >
> > I might suggest using Chris’s terminology of “one-sided range”, because
> that is both more precise and it renders moot all the “incomplete” vs
> “infinite” hairsplitting.
>
> I recommend reading through the discussion Xiaodi and I had yesterday,
> which Jaden chimed in on.  We really do need to make a distinction between
> incomplete and infinite ranges if we want to support all of the use cases
> with clean semantics.  This isn’t hair splitting.
>
> >
> > Nevin
>
>


Re: [swift-evolution] Strings in Swift 4

2017-02-01 Thread Nevin Brackett-Rozinsky via swift-evolution
Drafting a proposal sounds like a good idea, to establish all the relevant
information in one place. I don’t recall off the top of my head what
directions Jaden sketched out, but as long as the proposal hits the high
points of the uses and benefits, and summarizes the discussion and
alternatives, it should be fine.

I might suggest using Chris’s terminology of “one-sided range”, because
that is both more precise and it renders moot all the “incomplete” vs
“infinite” hairsplitting.

Nevin


Re: [swift-evolution] Strings in Swift 4

2017-02-01 Thread Matthew Johnson via swift-evolution

> On Feb 1, 2017, at 9:52 AM, Nevin Brackett-Rozinsky 
>  wrote:
> 
> Drafting a proposal sounds like a good idea, to establish all the relevant 
> information in one place. I don’t recall off the top of my head what 
> directions Jaden sketched out, but as long as the proposal hits the high 
> points of the uses and benefits, and summarizes the discussion and 
> alternatives, it should be fine.
> 
> I might suggest using Chris’s terminology of “one-sided range”, because that 
> is both more precise and it renders moot all the “incomplete” vs “infinite” 
> hairsplitting.

I recommend reading through the discussion Xiaodi and I had yesterday, which 
Jaden chimed in on.  We really do need to make a distinction between incomplete 
and infinite ranges if we want to support all of the use cases with clean 
semantics.  This isn’t hair splitting.

> 
> Nevin



Re: [swift-evolution] Strings in Swift 4

2017-02-01 Thread Matthew Johnson via swift-evolution

> On Feb 1, 2017, at 9:15 AM, Nevin Brackett-Rozinsky 
>  wrote:
> 
> I am also +1.
> 
> 
> On Wed, Feb 1, 2017 at 9:29 AM, Matthew Johnson via swift-evolution 
> > wrote:
> 
> I’m still curious how postfix `…` would impact our options for variadic 
> generics and tuple unpacking in the future.
> 
> 
> Somebody who happens to have originally created Swift addressed this point 
> last week:
> 
> 
> On Wed, Jan 25, 2017 at 8:49 PM, Chris Lattner via swift-evolution 
> > wrote:
> 
> In any case, it seems like an obviously good tradeoff to make the syntax for 
> variadic generics more complicated if it makes one sided ranges more 
> beautiful.
> 
> -Chris

Thanks for reminding me of this.  I generally agree with Chris, but have no 
idea what the more complicated syntax for variadic generics might look like.  I 
guess what I’m looking for is some indication of what ideas (if any) there are 
about what this might look like.  A sketch of possible directions would be 
sufficient to answer the questions lurking in the back of my mind.

> 
> 
> I think we should start a new thread for the discussion of incomplete ranges 
> though.

Yes, I agree.  They have been discussed in two different threads now and it 
really feels like incomplete (and infinite) ranges deserve discussion in their 
own right.  Would it make sense to draft a proposal based on the direction 
Jaden sketched as a point of focus for the thread (especially to help newcomers 
get up to speed)?

> 
> Nevin
> 
> 
> On Wed, Feb 1, 2017 at 9:29 AM, Matthew Johnson via swift-evolution 
> > wrote:
> 
> > On Feb 1, 2017, at 6:58 AM, Brent Royal-Gordon via swift-evolution 
> > > wrote:
> >
> >> On Jan 31, 2017, at 2:04 PM, Xiaodi Wu via swift-evolution 
> >> > wrote:
> >>
> >> Therefore I'd conclude that `arr[upTo: i]` is the most consistent 
> >> spelling. It also yields the sensible result that `arr[from: i][upTo: j] 
> >> == arr[upTo: j][from: i] == arr[i..<j]`.
> >
> > There's a lot I dislike about `subscript(upTo/through/from:)`:
> >
> > 1. We have not previously been very satisfied with how understandable these 
> > labels are—for instance, we fiddled around with them a lot when we were 
> > looking at `stride(from:to/through:by:)` in Swift 3, and eventually settled 
> > on the originals because we couldn't find anything better. I don't think 
> > entrenching them further makes very much sense.
> >
> > 2. The fact that you *can* write `arr[from: i][upTo: j]`, and that this is 
> > equivalent to both `arr[upTo: j][from: i]` and `arr[i..<j]`, is 
> > weird. We aren't typically in the habit of providing redundant APIs like 
> > this.
> >
> > 3. Neither Stdlib nor the Apple frameworks currently contain *any* labeled 
> > subscripts, so this design would be unprecedented in the core language.
> >
> > 4. After a new programmer learns about subscripting with two-sided ranges, 
> > removing one of the bounds is a straightforward extension of what they 
> > already know. The argument label solution is more ad-hoc.
> >
> > 5. The argument label solution solves the immediate problem, but doesn't 
> > give us anything else.
> >
> > To understand what I mean by #5, consider the implementation. The plan is 
> > to introduce a `RangeExpression` protocol:
> >
> >   protocol RangeExpression {
> >       associatedtype Bound: Comparable
> >       func relative<C: Collection>(to collection: C) -> Range<Bound>
> >           where C.Index == Bound
> >   }
> >
> > And then reduce the many manually-generated variants of `subscript(_: 
> > Range<Index>)` in `Collection` to just two:
> >
> >   protocol Collection {
> >       ...
> >       subscript(bounds: Range<Index>) -> SubSequence { get }
> >       ...
> >   }
> >
> >   extension Collection {
> >       ...
> >       subscript<Bounds: RangeExpression>(bounds: Bounds) -> SubSequence
> >           where Bounds.Bound == Index {
> >           return self[bounds.relative(to: self)]
> >       }
> >       ...
> >   }
> >
> > This design would automatically, source-compatibly, handle several 
> > different existing types you can slice with:
> >
> > * ClosedRange
> > * CountableRange
> > * CountableClosedRange
> >
> > Plus the new types associated with incomplete ranges:
> >
> > * IncompleteRange
> > * IncompleteClosedRange
> >
> > Plus anything else we, or users, might want to add. For instance, I have a 
> > prototype built on `RangeExpression` which lets you write things like:
> >
> >   myString[.startIndex + 1 ..< .endIndex - 1]
> >
> > This strikes me as a pretty cool thing that some people might want.
> >
> > Similarly, IncompleteRange and IncompleteClosedRange can most likely be put 
> > 

Re: [swift-evolution] Strings in Swift 4

2017-02-01 Thread Nevin Brackett-Rozinsky via swift-evolution
I am also +1.


On Wed, Feb 1, 2017 at 9:29 AM, Matthew Johnson via swift-evolution <
swift-evolution@swift.org> wrote:

>
> I’m still curious how postfix `…` would impact our options for variadic
> generics and tuple unpacking in the future.



Somebody who happens to have originally created Swift addressed this point
last week:


On Wed, Jan 25, 2017 at 8:49 PM, Chris Lattner via swift-evolution <
swift-evolution@swift.org> wrote:

>
> In any case, it seems like an obviously good tradeoff to make the syntax
> for variadic generics more complicated if it makes one sided ranges more
> beautiful.
>
> -Chris
>


I think we should start a new thread for the discussion of incomplete
ranges though.

Nevin


On Wed, Feb 1, 2017 at 9:29 AM, Matthew Johnson via swift-evolution <
swift-evolution@swift.org> wrote:

>
> > On Feb 1, 2017, at 6:58 AM, Brent Royal-Gordon via swift-evolution <
> swift-evolution@swift.org> wrote:
> >
> >> On Jan 31, 2017, at 2:04 PM, Xiaodi Wu via swift-evolution <
> swift-evolution@swift.org> wrote:
> >>
> >> Therefore I'd conclude that `arr[upTo: i]` is the most consistent
> spelling. It also yields the sensible result that `arr[from: i][upTo: j] ==
> arr[upTo: j][from: i] == arr[i..<j]`.
> >
> > There's a lot I dislike about `subscript(upTo/through/from:)`:
> >
> > 1. We have not previously been very satisfied with how understandable
> these labels are—for instance, we fiddled around with them a lot when we
> were looking at `stride(from:to/through:by:)` in Swift 3, and eventually
> settled on the originals because we couldn't find anything better. I don't
> think entrenching them further makes very much sense.
> >
> > 2. The fact that you *can* write `arr[from: i][upTo: j]`, and that this
> is equivalent to both `arr[upTo: j][from: i]` and `arr[i..<j]`, is
> weird. We aren't typically in the habit of providing redundant APIs like
> this.
> >
> > 3. Neither Stdlib nor the Apple frameworks currently contain *any*
> labeled subscripts, so this design would be unprecedented in the core
> language.
> >
> > 4. After a new programmer learns about subscripting with two-sided
> ranges, removing one of the bounds is a straightforward extension of what
> they already know. The argument label solution is more ad-hoc.
> >
> > 5. The argument label solution solves the immediate problem, but doesn't
> give us anything else.
> >
> > To understand what I mean by #5, consider the implementation. The plan
> is to introduce a `RangeExpression` protocol:
> >
> >   protocol RangeExpression {
> >       associatedtype Bound: Comparable
> >       func relative<C: Collection>(to collection: C) -> Range<Bound>
> >           where C.Index == Bound
> >   }
> >
> > And then reduce the many manually-generated variants of `subscript(_:
> Range<Index>)` in `Collection` to just two:
> >
> >   protocol Collection {
> >       ...
> >       subscript(bounds: Range<Index>) -> SubSequence { get }
> >       ...
> >   }
> >
> >   extension Collection {
> >       ...
> >       subscript<Bounds: RangeExpression>(bounds: Bounds) -> SubSequence
> >           where Bounds.Bound == Index {
> >           return self[bounds.relative(to: self)]
> >       }
> >       ...
> >   }
> >
> > This design would automatically, source-compatibly, handle several
> different existing types you can slice with:
> >
> > * ClosedRange
> > * CountableRange
> > * CountableClosedRange
> >
> > Plus the new types associated with incomplete ranges:
> >
> > * IncompleteRange
> > * IncompleteClosedRange
> >
> > Plus anything else we, or users, might want to add. For instance, I have
> a prototype built on `RangeExpression` which lets you write things like:
> >
> >   myString[.startIndex + 1 ..< .endIndex - 1]
> >
> > This strikes me as a pretty cool thing that some people might want.
> >
> > Similarly, IncompleteRange and IncompleteClosedRange can most likely be
> put to other uses. They could easily fill a gap in `switch` statements,
> which don't have a good way to express open-ended comparisons except with a
> `where` clause. As some have mentioned, when applied to a `Strideable` type
> they *could* be treated as infinite sequences, although it's not clear if
> we really want to do that. And, you know, sometimes you really *do* have a
> case where one or both bounds of a range may be missing in some cases;
> incomplete ranges are a built-in, batteries-included way to model that.
> >
> > To put it simply, slicing with incomplete ranges gives us several
> valuable tools we can apply to other problems. Labeled subscripts, on the
> other hand, are just another weird little thing that you have to memorize,
> and probably won’t.
>
> +1 in general.  But I’m still curious how postfix `…` would impact our
> options for variadic generics and tuple unpacking in the future.
>
> >
> > --
> > Brent Royal-Gordon
> > Architechies
> >
> > ___
> > swift-evolution mailing list
> > swift-evolution@swift.org
> > 

Re: [swift-evolution] Strings in Swift 4

2017-02-01 Thread Xiaodi Wu via swift-evolution
Brent--

In general, I agree with the direction of your thinking. Time constraints
prevent me from responding as thoroughly as I'd like, but briefly, our
current disagreements boil down to a few points:

* Agree that "to" vs "through" is not ideal. However, having spent a lot of
time on this issue, I'm quite convinced there is no single word in English
that will accurately describe the distinction. Certainly, we can mimic the
English language by doing something like `stride(from: a, to: b,
.exclusive, by: c)`. However, one other solution to provide clarity is to
pick a convention and solidify user understanding by frequent and
consistent use, which is the diametric opposite of your stance that we
shouldn't use the convention elsewhere.

In general, I think we contort ourselves in the wrong way when the first
listed motivation for a new feature is to work around a naming difficulty:
the solution to not having found the right name is to find the right name.

* It is not redundant API simply because the same thing can be achieved by
composing two other functions. One must assess ergonomics, footgun
likelihood, etc. But in any case, the point you make here is equally
applicable to your proposed feature, i.e. arr[0...][..<2] == arr[0..<2].

* I fully expect subscript labels to start making an appearance in stdlib
APIs sooner or later. I don't think their absence is indicative of a
conscious rejection of them, just that they haven't been needed. (Lenient
subscripts will one day make an appearance, given how often they're
requested, and I sure hope they are labeled.)

* Disagree that the labeled subscript is ad-hoc. On the contrary, I think
"incomplete ranges" are the more ad-hoc choice for array subscripting. My
point is that an "incomplete range" is, by definition, not a range; it is a
bound, which is an Index. We don't need a wrapper around the Index in the
form of IncompleteRange because it provides little if any semantic
value: the bounds of such a range are not knowable divorced from the
sequence being indexed, and so the upperboundedness or lowerboundedness of
IncompleteRange is not a useful piece of information without the
sequence. Thus, it is more appropriately a label when you invoke a
subscript on a sequence, not a property of the argument, which is a bound.

* Removing one of the bounds _is_ a natural impulse, but the natural result
one would expect to get is a half-unbounded range (i.e. infinite range),
not an "incomplete range." As I argued above, one must distinguish these
semantics. You are proposing something with the semantics of an "incomplete
range."

* Your example of switch statements is (a) mostly sugar (the exhaustiveness
checking made possible could also be proposed independent of syntax
changes); and (b) arguably an infinite range in semantics. I'll have to
think more on this, though.
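
To make the composition point concrete, here is a minimal sketch comparing the two spellings. The `subscript(from:)` and `subscript(upTo:)` members are hypothetical and written only for this illustration; they are not standard library API. The one-sided forms `arr[i...]` and `arr[..<j]` assume Swift 4 or later, where postfix `...` and prefix `..<` exist.

```swift
// Hypothetical labeled subscripts, written only to compare the two spellings.
// They are NOT part of the standard library.
extension Collection {
    // Everything from `lower` to the end of the collection.
    subscript(from lower: Index) -> SubSequence {
        return self[lower..<endIndex]
    }
    // Everything from the start of the collection up to, but excluding, `upper`.
    subscript(upTo upper: Index) -> SubSequence {
        return self[startIndex..<upper]
    }
}

let arr = [10, 20, 30, 40, 50]
let i = 1
let j = 4

// Labeled spelling, composed in both orders.
let labeledFromFirst = Array(arr[from: i][upTo: j])
let labeledUpToFirst = Array(arr[upTo: j][from: i])

// Range spelling: two-sided, and two one-sided slices composed.
let twoSided = Array(arr[i..<j])
let oneSidedComposed = Array(arr[i...][..<j])
```

All four expressions select the same elements, because a slice keeps the indices of its base collection. Either design therefore allows the same slice to be written in more than one way.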

Bottom line, I can support an infinite range, but I don't think I like the
idea of an "incomplete range." An "incomplete range" is an Index used as a
lower or upper bound and should be expressed as a plain Index, because it
is not a range. An infinite range, OTOH, is a range. However, the array
subscripting use case is not a use case for infinite ranges, as I've argued
above.
On Wed, Feb 1, 2017 at 08:29 Matthew Johnson  wrote:

>
> > On Feb 1, 2017, at 6:58 AM, Brent Royal-Gordon via swift-evolution <
> swift-evolution@swift.org> wrote:
> >
> >> On Jan 31, 2017, at 2:04 PM, Xiaodi Wu via swift-evolution <
> swift-evolution@swift.org> wrote:
> >>
> >> Therefore I'd conclude that `arr[upTo: i]` is the most consistent
> spelling. It also yields the sensible result that `arr[from: i][upTo: j] ==
> arr[upTo: j][from: i] == arr[i..<j]`.
> >
> > There's a lot I dislike about `subscript(upTo/through/from:)`:
> >
> > 1. We have not previously been very satisfied with how understandable
> these labels are—for instance, we fiddled around with them a lot when we
> were looking at `stride(from:to/through:by:)` in Swift 3, and eventually
> settled on the originals because we couldn't find anything better. I don't
> think entrenching them further makes very much sense.
> >
> > 2. The fact that you *can* write `arr[from: i][upTo: j]`, and that this
> is equivalent to both `arr[upTo: j][from: i]` and `arr[i..<j]`, is
> weird. We aren't typically in the habit of providing redundant APIs like
> this.
> >
> > 3. Neither Stdlib nor the Apple frameworks currently contain *any*
> labeled subscripts, so this design would be unprecedented in the core
> language.
> >
> > 4. After a new programmer learns about subscripting with two-sided
> ranges, removing one of the bounds is a straightforward extension of what
> they already know. The argument label solution is more ad-hoc.
> >
> > 5. The argument label solution solves the immediate problem, but doesn't
> give us anything else.
> >
> > To understand what I mean by #5, consider the implementation. The plan
> is to introduce a 

Re: [swift-evolution] Strings in Swift 4

2017-02-01 Thread Matthew Johnson via swift-evolution

> On Feb 1, 2017, at 6:58 AM, Brent Royal-Gordon via swift-evolution 
>  wrote:
> 
>> On Jan 31, 2017, at 2:04 PM, Xiaodi Wu via swift-evolution 
>>  wrote:
>> 
>> Therefore I'd conclude that `arr[upTo: i]` is the most consistent spelling. 
>> It also yields the sensible result that `arr[from: i][upTo: j] == arr[upTo: 
>> j][from: i] == arr[i..<j]`.
> 
> There's a lot I dislike about `subscript(upTo/through/from:)`:
> 
> 1. We have not previously been very satisfied with how understandable these 
> labels are—for instance, we fiddled around with them a lot when we were 
> looking at `stride(from:to/through:by:)` in Swift 3, and eventually settled 
> on the originals because we couldn't find anything better. I don't think 
> entrenching them further makes very much sense.
> 
> 2. The fact that you *can* write `arr[from: i][upTo: j]`, and that this is 
> equivalent to both `arr[upTo: j][from: i]` and `arr[i..<j]`, is weird. We 
> aren't typically in the habit of providing redundant APIs like this.
> 
> 3. Neither Stdlib nor the Apple frameworks currently contain *any* labeled 
> subscripts, so this design would be unprecedented in the core language.
> 
> 4. After a new programmer learns about subscripting with two-sided ranges, 
> removing one of the bounds is a straightforward extension of what they 
> already know. The argument label solution is more ad-hoc.
> 
> 5. The argument label solution solves the immediate problem, but doesn't give 
> us anything else.
> 
> To understand what I mean by #5, consider the implementation. The plan is to 
> introduce a `RangeExpression` protocol:
> 
>   protocol RangeExpression {
>   associatedtype Bound: Comparable
>   func relative<C: Collection>(to collection: C) -> Range<Bound>
>   where C.Index == Bound
>   }
> 
> And then reduce the many manually-generated variants of `subscript(_: 
> Range<Index>)` in `Collection` to just two:
> 
>   protocol Collection {
>   ...
>   subscript(bounds: Range<Index>) -> SubSequence { get }
>   ...
>   }
>   
>   extension Collection {
>   ...
>   subscript<Bounds: RangeExpression>(bounds: Bounds) -> SubSequence
>   where Bounds.Bound == Index {
>   return self[bounds.relative(to: self)]
>   }
>   ...
>   }
> 
> This design would automatically, source-compatibly, handle several different 
> existing types you can slice with:
> 
> * ClosedRange
> * CountableRange
> * CountableClosedRange
> 
> Plus the new types associated with incomplete ranges:
> 
> * IncompleteRange
> * IncompleteClosedRange
> 
> Plus anything else we, or users, might want to add. For instance, I have a 
> prototype built on `RangeExpression` which lets you write things like:
> 
>   myString[.startIndex + 1 ..< .endIndex - 1]
> 
> This strikes me as a pretty cool thing that some people might want. 
> 
> Similarly, IncompleteRange and IncompleteClosedRange can most likely be put 
> to other uses. They could easily fill a gap in `switch` statements, which 
> don't have a good way to express open-ended comparisons except with a `where` 
> clause. As some have mentioned, when applied to a `Strideable` type they 
> *could* be treated as infinite sequences, although it's not clear if we 
> really want to do that. And, you know, sometimes you really *do* have a case 
> where one or both bounds of a range may be missing in some cases; incomplete 
> ranges are a built-in, batteries-included way to model that.
> 
> To put it simply, slicing with incomplete ranges gives us several valuable 
> tools we can apply to other problems. Labeled subscripts, on the other hand, 
> are just another weird little thing that you have to memorize, and probably 
> won’t.

+1 in general.  But I’m still curious how postfix `…` would impact our options 
for variadic generics and tuple unpacking in the future.  

> 
> -- 
> Brent Royal-Gordon
> Architechies
> 
> ___
> swift-evolution mailing list
> swift-evolution@swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution



Re: [swift-evolution] Strings in Swift 4

2017-02-01 Thread Xiaodi Wu via swift-evolution
I entirely agree with you on the desired behavior of `zip(...)`.

However, if you insist on 0... being notionally an infinite range, then you
would have to insist on `for i in 0...` also trapping. Which is not a big
deal, IMO, but will surely make the anti-trapping crowd upset.

The bigger issue is that either you must have a lenient/clamping subscript
for `arr[0...]` or it too must trap, which is not desired. However, if
`arr[0...]` is clamping, then `[1, 2, 3][100...]` would not trap and
instead give you `[]`.

If 0... is regarded as an incomplete range, your example of `zip(...)`
could still trap as desired. It would trap on the notional attempt to
assign someArray.count to IncompleteRange<T>.inferredUpperBound if count
exceeds T.max. With such semantics for 0..., [1, 2, 3][0...] would behave
as expected without the need for leniency, but [1, 2, 3][100...] would trap
as I assume you'd expect. However, it would make no sense to write `for i
in 0...`.
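
The difference between a trapping and a clamping subscript can be sketched as follows. The `subscript(lenient:)` member is hypothetical and exists only for this illustration; `PartialRangeFrom` is the standard library type that postfix `...` produces in Swift 4 and later, which postdates this thread.

```swift
extension Array {
    // Hypothetical clamping ("lenient") subscript, an assumption for illustration.
    // An out-of-range lower bound is clamped into startIndex...endIndex, so the
    // result is an empty slice rather than a trap.
    subscript(lenient range: PartialRangeFrom<Int>) -> ArraySlice<Element> {
        let lower = Swift.min(Swift.max(range.lowerBound, startIndex), endIndex)
        return self[lower..<endIndex]
    }
}

let numbers = [1, 2, 3]

// The ordinary one-sided slice: valid lower bound, returns the rest of the array.
let all = Array(numbers[0...])

// The clamping variant accepts a lower bound past the end and yields an empty slice.
let clamped = Array(numbers[lenient: 100...])

// By contrast, `numbers[100...]` traps at run time, because 100 is not a
// valid index for a three-element array. It is left commented out here.
// let trapped = numbers[100...]
```

The sketch keeps the two behaviors under different spellings, which is the labeled form hoped for earlier in this message.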

On Tue, Jan 31, 2017 at 21:39 Dave Abrahams  wrote:

>
> on Tue Jan 31 2017, Xiaodi Wu  wrote:
>
> > But that's not getting to the biggest hitch with your proposal. If
> > subscript were lenient, then `arr[lenient: 42...]` would also have to
> give
> > you a result even if `arr.count == 21`.
> >
> > This is not at all what Dave Abrahams was proposing, though (unless I
> > totally misunderstand). He truly doesn't want an infinite range. He wants
> > to use a terser notation for saying: I want x to be the lower bound of a
> > range for which I don't yet know (or haven't bothered to find out) the
> > finite upper bound. It would be plainly clear, if spelled as `arr[from:
> > 42]`, that if `arr.count < 43` then this expression will trap, but if
> > `arr.count >= 43` then this expression will give you the rest of the
> > elements.
>
> I think you do misunderstand.  Notionally, 0... is an infinite range.
> The basic programming model for numbers in swift is (to a first
> approximation), program as if there's no overflow, and we'll catch you
> by trapping if your assumption is wrong.  It doesn't make sense for the
> semantics of 0... to depend on the deduced type of 0 or the
> representable range of Int
>
> for example,
>
> for x in zip(n..., someArray) {
>
> }
>
> How many iterations should this give you?  If it doesn't process all of
> someArray, I want a trap.
>
> --
> -Dave
>


Re: [swift-evolution] Strings in Swift 4

2017-02-01 Thread Brent Royal-Gordon via swift-evolution
> On Jan 31, 2017, at 2:04 PM, Xiaodi Wu via swift-evolution 
>  wrote:
> 
> Therefore I'd conclude that `arr[upTo: i]` is the most consistent spelling. 
> It also yields the sensible result that `arr[from: i][upTo: j] == arr[upTo: 
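> j][from: i] == arr[i..<j]`.

There's a lot I dislike about `subscript(upTo/through/from:)`:

1. We have not previously been very satisfied with how understandable these 
labels are; for instance, we fiddled around with them a lot when we were 
looking at `stride(from:to/through:by:)` in Swift 3, and eventually settled 
on the originals because we couldn't find anything better. I don't think 
entrenching them further makes very much sense.

2. The fact that you *can* write `arr[from: i][upTo: j]`, and that this is 
equivalent to both `arr[upTo: j][from: i]` and `arr[i..<j]`, is weird. We 
aren't typically in the habit of providing redundant APIs like this.

3. Neither Stdlib nor the Apple frameworks currently contain *any* labeled 
subscripts, so this design would be unprecedented in the core language.

4. After a new programmer learns about subscripting with two-sided ranges, 
removing one of the bounds is a straightforward extension of what they 
already know. The argument label solution is more ad-hoc.

5. The argument label solution solves the immediate problem, but doesn't give 
us anything else.

To understand what I mean by #5, consider the implementation. The plan is to 
introduce a `RangeExpression` protocol:

    protocol RangeExpression {
        associatedtype Bound: Comparable
        func relative<C: Collection>(to collection: C) -> Range<Bound>
            where C.Index == Bound
    }

And then reduce the many manually-generated variants of `subscript(_: 
Range<Index>)` in `Collection` to just two:

    protocol Collection {
        ...
        subscript(bounds: Range<Index>) -> SubSequence { get }
        ...
    }

    extension Collection {
        ...
        subscript<Bounds: RangeExpression>(bounds: Bounds) -> SubSequence
            where Bounds.Bound == Index {
            return self[bounds.relative(to: self)]
        }
        ...
    }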

This design would automatically, source-compatibly, handle several different 
existing types you can slice with:

* ClosedRange
* CountableRange
* CountableClosedRange

Plus the new types associated with incomplete ranges:

* IncompleteRange
* IncompleteClosedRange

Plus anything else we, or users, might want to add. For instance, I have a 
prototype built on `RangeExpression` which lets you write things like:

myString[.startIndex + 1 ..< .endIndex - 1]

This strikes me as a pretty cool thing that some people might want. 
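
As a rough illustration of the mechanism, here is a self-contained sketch of the same shape. The names `SliceBounds`, `resolved(in:)`, `LowerBounded`, and `UpperBounded` are hypothetical stand-ins chosen to avoid clashing with anything in the standard library; this is a sketch under those assumptions, not the proposed implementation.

```swift
// Hypothetical stand-in for the `RangeExpression` idea: anything that can be
// turned into a concrete Range once it knows which collection it is slicing.
protocol SliceBounds {
    associatedtype Bound: Comparable
    func resolved<C: Collection>(in collection: C) -> Range<Bound>
        where C.Index == Bound
}

// A lower bound only; the upper bound is supplied by the collection.
struct LowerBounded<Bound: Comparable>: SliceBounds {
    let lower: Bound
    func resolved<C: Collection>(in collection: C) -> Range<Bound>
        where C.Index == Bound {
        return lower..<collection.endIndex
    }
}

// An upper bound only; the lower bound is supplied by the collection.
struct UpperBounded<Bound: Comparable>: SliceBounds {
    let upper: Bound
    func resolved<C: Collection>(in collection: C) -> Range<Bound>
        where C.Index == Bound {
        return collection.startIndex..<upper
    }
}

extension Collection {
    // One generic subscript covers every conforming bounds type by
    // forwarding to the existing Range<Index> subscript.
    subscript<B: SliceBounds>(bounds expression: B) -> SubSequence
        where B.Bound == Index {
        return self[expression.resolved(in: self)]
    }
}

let letters = ["a", "b", "c", "d"]
let tail = Array(letters[bounds: LowerBounded(lower: 2)])
let head = Array(letters[bounds: UpperBounded(upper: 2)])
```

A new bounds type only has to say how it resolves against a collection. The collection itself needs no additional subscript for it.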

Similarly, IncompleteRange and IncompleteClosedRange can most likely be put to 
other uses. They could easily fill a gap in `switch` statements, which don't 
have a good way to express open-ended comparisons except with a `where` clause. 
As some have mentioned, when applied to a `Strideable` type they *could* be 
treated as infinite sequences, although it's not clear if we really want to do 
that. And, you know, sometimes you really *do* have a case where one or both 
bounds of a range may be missing in some cases; incomplete ranges are a 
built-in, batteries-included way to model that.
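
For reference, the `switch` and infinite-sequence uses can be sketched with the one-sided range operators as they exist in Swift 4 and later, which postdate this message. The function name `describe` and the temperature thresholds are made up for the example.

```swift
// Open-ended comparisons in a `switch`, written with one-sided ranges
// instead of `where` clauses.
func describe(_ temperature: Int) -> String {
    switch temperature {
    case ..<0:
        return "freezing"
    case 0..<25:
        return "mild"
    case 25...:
        return "hot"
    default:
        // The three cases above cover every Int, but the compiler does not
        // prove exhaustiveness for range patterns, so a default is required.
        return "unreachable"
    }
}

// A lower-bounded range over a Strideable type used as a sequence.
// `prefix(3)` stops the otherwise unbounded iteration after three values.
let firstThree = Array((1...).prefix(3))
```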

To put it simply, slicing with incomplete ranges gives us several valuable 
tools we can apply to other problems. Labeled subscripts, on the other hand, 
are just another weird little thing that you have to memorize, and probably 
won't.

-- 
Brent Royal-Gordon
Architechies



Re: [swift-evolution] Strings in Swift 4

2017-01-31 Thread Dave Abrahams via swift-evolution

on Tue Jan 31 2017, Xiaodi Wu  wrote:

> But that's not getting to the biggest hitch with your proposal. If
> subscript were lenient, then `arr[lenient: 42...]` would also have to give
> you a result even if `arr.count == 21`.
>
> This is not at all what Dave Abrahams was proposing, though (unless I
> totally misunderstand). He truly doesn't want an infinite range. He wants
> to use a terser notation for saying: I want x to be the lower bound of a
> range for which I don't yet know (or haven't bothered to find out) the
> finite upper bound. It would be plainly clear, if spelled as `arr[from:
> 42]`, that if `arr.count < 43` then this expression will trap, but if
> `arr.count >= 43` then this expression will give you the rest of the
> elements.

I think you do misunderstand.  Notionally, 0... is an infinite range.
The basic programming model for numbers in Swift is (to a first
approximation), program as if there's no overflow, and we'll catch you
by trapping if your assumption is wrong.  It doesn't make sense for the
semantics of 0... to depend on the deduced type of 0 or the
representable range of Int.

For example, 

for x in zip(n..., someArray) {

}

How many iterations should this give you?  If it doesn't process all of
someArray, I want a trap.
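
A small runnable version of that loop, assuming Swift 4 or later where `n...` is a `PartialRangeFrom` that can be iterated. The values of `n` and `someArray` are made up for the example.

```swift
let someArray = ["a", "b", "c"]
let n = 5

var iterations = 0
var counters: [Int] = []

// `zip` stops as soon as either sequence runs out. The range never runs out,
// so the loop body runs once per array element.
for x in zip(n..., someArray) {
    counters.append(x.0)
    iterations += 1
}
```

Here the loop processes all of `someArray`. With a counter type too small to number every element, the increment would overflow, and my understanding is that the iteration traps at that point rather than silently ending early.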

-- 
-Dave


Re: [swift-evolution] Strings in Swift 4

2017-01-31 Thread Matthew Johnson via swift-evolution



> On Jan 31, 2017, at 7:28 PM, Xiaodi Wu  wrote:
> 
>> On Tue, Jan 31, 2017 at 7:08 PM, Matthew Johnson  
>> wrote:
>> 
>>> On Jan 31, 2017, at 6:54 PM, Xiaodi Wu  wrote:
>>> 
 On Tue, Jan 31, 2017 at 6:40 PM, Matthew Johnson  
 wrote:
 
> On Jan 31, 2017, at 6:15 PM, Xiaodi Wu  wrote:
> 
>> On Tue, Jan 31, 2017 at 6:09 PM, Matthew Johnson 
>>  wrote:
>> 
>>> On Jan 31, 2017, at 5:35 PM, Xiaodi Wu via swift-evolution 
>>>  wrote:
>>> 
 On Tue, Jan 31, 2017 at 5:28 PM, David Sweeris  
 wrote:
 
> On Jan 31, 2017, at 2:04 PM, Xiaodi Wu  wrote:
> 
>> On Tue, Jan 31, 2017 at 3:36 PM, David Sweeris via swift-evolution 
>>  wrote:
>> 
>>> On Jan 31, 2017, at 11:32, Jaden Geller via swift-evolution 
>>>  wrote:
>>> 
>>> I think that is perfectly reasonable, but then it seems weird to be 
>>> able to iterate over it (with no upper bound) independently of a 
>>> collection. It would surprise me if
>>> ```
>>> for x in arr[arr.startIndex...] { print(x) }
>>> ```
>>> yielded different results than
>>> ```
>>> for i in arr.startIndex... { print(arr[i]) } // CRASH
>>> ```
>>> which it does under this model.
>> 
>> (I think this is how it works... semantically, anyway) Since the upper 
>> bound isn't specified, it's inferred from the context.
>> 
>> In the first case, the context is as an index into an array, so the 
>> upper bound is inferred to be the last valid index.
>> 
>> In the second case, there is no context, so it goes to Int.max. 
>> Then, after the "wrong" context has been established, you try to 
>> index an array with numbers from the too-large range.
>> 
>> Semantically speaking, they're pretty different operations. Why is 
>> it surprising that they have different results?
> 
> I must say, I was originally rather fond of `0...` as a spelling, but 
> IMO, Jaden and others have pointed out a real semantic issue.
> 
> A range is, to put it simply, the "stuff" between two end points. A 
> "range with no upper bound" _has to be_ one that continues forever. 
> The upper bound _must_ be infinity.
 
 Depends… Swift doesn’t allow partial initializations, and neither the 
 `.endIndex` nor the `.upperBound` properties of a `Range` are 
 optional. From a strictly syntactic PoV, a "Range without an 
 upperBound” can’t exist without getting into undefined behavior 
 territory.
 
 Plus, mathematically speaking, an infinite range would be written "[x, 
 ∞)", with an open upper bracket. If you write “[x, ∞]”, with a closed 
 upper bracket, that’s kind of a meaningless statement. I would argue 
 that if we’re going to represent that “infinite” range, the closest 
 Swift spelling would be “x..<“. That leaves the mathematically 
 undefined notation of “[x, ∞]”, spelled as "x…” in Swift, free to let 
 us have “x…” or “…x” (which by similar reasoning can’t mean "(∞, x]”) 
 return one of these:
 enum IncompleteRange<T> {
 case upperValue(T)
 case lowerValue(T)
 }
 which we could then pass to the subscript function of a collection to 
 create the actual Range like this:
 extension Collection {
 subscript(_ ir: IncompleteRange<Index>) -> SubSequence {
 switch ir {
 case .lowerValue(let lower): return self[lower ..< self.endIndex]
 case .upperValue(let upper): return self[self.startIndex ..< upper]
 }
 }
 }
>>> 
>>> I understand that you can do this from a technical perspective. But I'm 
>>> arguing it's devoid of semantics.  That is, it's a spelling to dress up 
>>> a number.
>> 
>> It’s not any more devoid of semantics than a partially applied function.
> 
> Yes, but this here is not a partially applied type.
> 
> Nor does it square with your proposal that you should be able to use `for 
> i in 0...` to mean something different from `array[0...]`. We don't have 
> partially applied functions doubling as function calls with default 
> arguments.
 
 I’m not trying to say it’s *exactly* like a partially applied function.
>>> 
>>> I'm not saying you're arguing that point. I'm saying that there is a 
>>> semantic distinction between (1) a range 

Re: [swift-evolution] Strings in Swift 4

2017-01-31 Thread Matthew Johnson via swift-evolution

> On Jan 31, 2017, at 6:54 PM, Xiaodi Wu  wrote:
> 
> On Tue, Jan 31, 2017 at 6:40 PM, Matthew Johnson  > wrote:
> 
>> On Jan 31, 2017, at 6:15 PM, Xiaodi Wu > > wrote:
>> 
>> On Tue, Jan 31, 2017 at 6:09 PM, Matthew Johnson > > wrote:
>> 
>>> On Jan 31, 2017, at 5:35 PM, Xiaodi Wu via swift-evolution 
>>> > wrote:
>>> 
>>> On Tue, Jan 31, 2017 at 5:28 PM, David Sweeris >> > wrote:
>>> 
 On Jan 31, 2017, at 2:04 PM, Xiaodi Wu > wrote:
 
 On Tue, Jan 31, 2017 at 3:36 PM, David Sweeris via swift-evolution 
 > wrote:
 
 On Jan 31, 2017, at 11:32, Jaden Geller via swift-evolution 
 > wrote:
 
> I think that is perfectly reasonable, but then it seems weird to be able 
> to iterate over it (with no upper bound) independently of a collection. 
> It would surprise me if
> ```
> for x in arr[arr.startIndex...] { print(x) }
> ```
> yielded different results than
> ```
> for i in arr.startIndex... { print(arr[i]) } // CRASH
> ```
> which it does under this model.
 
 (I think this is how it works... semantically, anyway) Since the upper bound 
 isn't specified, it's inferred from the context.
 
 In the first case, the context is as an index into an array, so the upper 
 bound is inferred to be the last valid index.
 
 In the second case, there is no context, so it goes to Int.max. Then, 
 after the "wrong" context has been established, you try to index an array 
 with numbers from the too-large range.
 
 Semantically speaking, they're pretty different operations. Why is it 
 surprising that they have different results?
 
 I must say, I was originally rather fond of `0...` as a spelling, but IMO, 
 Jaden and others have pointed out a real semantic issue.
 
 A range is, to put it simply, the "stuff" between two end points. A "range 
 with no upper bound" _has to be_ one that continues forever. The upper 
 bound _must_ be infinity.
>>> 
>>> Depends… Swift doesn’t allow partial initializations, and neither the 
>>> `.endIndex` nor the `.upperBound` properties of a `Range` are optional. 
>>> From a strictly syntactic PoV, a "Range without an upperBound” can’t exist 
>>> without getting into undefined behavior territory.
>>> 
>>> Plus, mathematically speaking, an infinite range would be written "[x, ∞)", 
>>> with an open upper bracket. If you write “[x, ∞]”, with a closed upper 
>>> bracket, that’s kind of a meaningless statement. I would argue that if 
>>> we’re going to represent that “infinite” range, the closest Swift spelling 
>>> would be “x..<“. That leaves the mathematically undefined notation of “[x, 
>>> ∞]”, spelled as "x…” in Swift, free to let us have “x…” or “…x” (which by 
>>> similar reasoning can’t mean "(∞, x]”) return one of these:
>>> enum IncompleteRange<T> {
>>> case upperValue(T)
>>> case lowerValue(T)
>>> }
>>> which we could then pass to the subscript function of a collection to 
>>> create the actual Range like this:
>>> extension Collection {
>>> subscript(_ ir: IncompleteRange<Index>) -> SubSequence {
>>> switch ir {
>>> case .lowerValue(let lower): return self[lower ..< self.endIndex]
>>> case .upperValue(let upper): return self[self.startIndex ..< upper]
>>> }
>>> }
>>> }
>>> 
>>> I understand that you can do this from a technical perspective. But I'm 
>>> arguing it's devoid of semantics.  That is, it's a spelling to dress up a 
>>> number.
>> 
>> It’s not any more devoid of semantics than a partially applied function.
>> 
>> Yes, but this here is not a partially applied type.
>> 
>> Nor does it square with your proposal that you should be able to use `for i 
>> in 0...` to mean something different from `array[0...]`. We don't have 
>> partially applied functions doubling as function calls with default 
>> arguments.
> 
> I’m not trying to say it’s *exactly* like a partially applied function.
> 
> I'm not saying you're arguing that point. I'm saying that there is a semantic 
> distinction between (1) a range with two bounds where you've only specified 
> the one, and (2) a range with one bound. There must be an answer to the 
> question: what is the nature of the upper bound of `0...`? Either it exists 
> but is not yet known, or it is known that it does not exist (or, it is not 
> yet known whether or not it exists). But these are not the same thing!
> 
>> It is a number or index with added semantics that it provides a lower (or 
>> upper) 

Re: [swift-evolution] Strings in Swift 4

2017-01-31 Thread Matthew Johnson via swift-evolution

> On Jan 31, 2017, at 6:15 PM, Jaden Geller  wrote:
> 
> 
>> On Jan 31, 2017, at 4:09 PM, Matthew Johnson via swift-evolution 
>> > wrote:
>> 
>> 
>>> On Jan 31, 2017, at 5:35 PM, Xiaodi Wu via swift-evolution 
>>> > wrote:
>>> 
>>> On Tue, Jan 31, 2017 at 5:28 PM, David Sweeris >> > wrote:
>>> 
 On Jan 31, 2017, at 2:04 PM, Xiaodi Wu > wrote:
 
 On Tue, Jan 31, 2017 at 3:36 PM, David Sweeris via swift-evolution 
 > wrote:
 
 On Jan 31, 2017, at 11:32, Jaden Geller via swift-evolution 
 > wrote:
 
> I think that is perfectly reasonable, but then it seems weird to be able 
> to iterate over it (with no upper bound) independently of a collection. 
> It would surprise me if
> ```
> for x in arr[arr.startIndex...] { print(x) }
> ```
> yielded different results than
> ```
> for i in arr.startIndex... { print(arr[i]) } // CRASH
> ```
> which it does under this model.
 
 (I think this is how it works... semantically, anyway) Since the upper bound 
 isn't specified, it's inferred from the context.
 
 In the first case, the context is as an index into an array, so the upper 
 bound is inferred to be the last valid index.
 
 In the second case, there is no context, so it goes to Int.max. Then, 
 after the "wrong" context has been established, you try to index an array 
 with numbers from the too-large range.
 
 Semantically speaking, they're pretty different operations. Why is it 
 surprising that they have different results?
 
 I must say, I was originally rather fond of `0...` as a spelling, but IMO, 
 Jaden and others have pointed out a real semantic issue.
 
 A range is, to put it simply, the "stuff" between two end points. A "range 
 with no upper bound" _has to be_ one that continues forever. The upper 
 bound _must_ be infinity.
>>> 
>>> Depends… Swift doesn’t allow partial initializations, and neither the 
>>> `.endIndex` nor the `.upperBound` properties of a `Range` are optional. 
>>> From a strictly syntactic PoV, a "Range without an upperBound” can’t exist 
>>> without getting into undefined behavior territory.
>>> 
>>> Plus, mathematically speaking, an infinite range would be written "[x, ∞)", 
>>> with an open upper bracket. If you write “[x, ∞]”, with a closed upper 
>>> bracket, that’s kind of a meaningless statement. I would argue that if 
>>> we’re going to represent that “infinite” range, the closest Swift spelling 
>>> would be “x..<“. That leaves the mathematically undefined notation of “[x, 
>>> ∞]”, spelled as "x…” in Swift, free to let us have “x…” or “…x” (which by 
>>> similar reasoning can’t mean "(∞, x]”) return one of these:
>>> enum IncompleteRange<T> {
>>> case upperValue(T)
>>> case lowerValue(T)
>>> }
>>> which we could then pass to the subscript function of a collection to 
>>> create the actual Range like this:
>>> extension Collection {
>>> subscript(_ ir: IncompleteRange<Index>) -> SubSequence {
>>> switch ir {
>>> case .lowerValue(let lower): return self[lower ..< self.endIndex]
>>> case .upperValue(let upper): return self[self.startIndex ..< upper]
>>> }
>>> }
>>> }
>>> 
>>> I understand that you can do this from a technical perspective. But I'm 
>>> arguing it's devoid of semantics.  That is, it's a spelling to dress up a 
>>> number.
>> 
>> It’s not any more devoid of semantics than a partially applied function.  It 
>> is a number or index with added semantics that it provides a lower (or 
>> upper) bound on the possible value specified by its type.
> 
> If we treat it as such, we shouldn’t allow users to iterate over it directly:
> ```
> for x in 0... { // <- doesn’t make sense; only partially specified
>   print("hi")
> }
> ```
> 
> We __could__ introduce 2 types, `IncompleteRange` and `InfiniteRange`, 
> providing an overload that constructs each. It would never be ambiguous 
> because `InfiniteRange` would be the only `Sequence` and `IncompleteRange` 
> would be the only one of these two that is accepted as a collection’s 
> subscript.
> 
> This *isn’t* that crazy either. There’s precedent for this too. The `..<` 
> operator used to create both ranges and intervals (though it seems those 
> types have started to merge).
> 
> ¯\_(ツ)_/¯

This is what I was getting at, but hadn’t thought the details all the way 
through to realize that we really would need two distinct types here.  I 
apologize for being a little bit sloppy in both my reasoning and my 
terminology.  Thanks for clearing up the details!  :)
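
A minimal sketch of the two-type split, to make the distinction concrete. Both `IncompleteRangeFrom` and `InfiniteRangeFrom`, and the `subscript(from:)` that accepts the former, are hypothetical names invented for this illustration. The operator overloads that would construct each type are left out, since the standard library now defines postfix `...` itself.

```swift
// Hypothetical: a bound waiting for a collection to complete it.
// It is deliberately NOT a Sequence, so it cannot be iterated on its own.
struct IncompleteRangeFrom<Bound: Comparable> {
    let lowerBound: Bound
}

// Hypothetical: a genuinely unbounded range. It IS a Sequence,
// and it is not accepted by the collection subscript below.
struct InfiniteRangeFrom<Bound: Strideable>: Sequence, IteratorProtocol {
    var current: Bound
    mutating func next() -> Bound? {
        // Return the current value, then step forward by one.
        defer { current = current.advanced(by: 1) }
        return current
    }
}

extension Collection {
    // Only the incomplete range can slice; the collection supplies the upper bound.
    subscript(from range: IncompleteRangeFrom<Index>) -> SubSequence {
        return self[range.lowerBound..<endIndex]
    }
}

let arr = [10, 20, 30]
let rest = Array(arr[from: IncompleteRangeFrom(lowerBound: 1)])
let counted = Array(InfiniteRangeFrom(current: 1).prefix(3))
```

Because each type supports exactly one of the two uses, the type checker can pick the right one from context, which is the disambiguation described above.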

> 
>> 
>>> 
>>> 

Re: [swift-evolution] Strings in Swift 4

2017-01-31 Thread Xiaodi Wu via swift-evolution
On Tue, Jan 31, 2017 at 6:40 PM, Matthew Johnson 
wrote:

>
> On Jan 31, 2017, at 6:15 PM, Xiaodi Wu  wrote:
>
> On Tue, Jan 31, 2017 at 6:09 PM, Matthew Johnson 
> wrote:
>
>>
>> On Jan 31, 2017, at 5:35 PM, Xiaodi Wu via swift-evolution <
>> swift-evolution@swift.org> wrote:
>>
>> On Tue, Jan 31, 2017 at 5:28 PM, David Sweeris 
>> wrote:
>>
>>>
>>> On Jan 31, 2017, at 2:04 PM, Xiaodi Wu  wrote:
>>>
>>> On Tue, Jan 31, 2017 at 3:36 PM, David Sweeris via swift-evolution <
>>> swift-evolution@swift.org> wrote:
>>>

 On Jan 31, 2017, at 11:32, Jaden Geller via swift-evolution <
 swift-evolution@swift.org> wrote:

 I think that is perfectly reasonable, but then it seems weird to be
 able to iterate over it (with no upper bound) independently of a
 collection. It would surprise me if
 ```
 for x in arr[arr.startIndex...] { print(x) }
 ```
 yielded different results than
 ```
 for i in arr.startIndex... { print(arr[i]) } // CRASH
 ```
 which it does under this model.


 (I *think* this is how it works... semantically, anyway) Since the upper
 bound isn't specified, it's inferred from the context.

 In the first case, the context is as an index into an array, so the
 upper bound is inferred to be the last valid index.

 In the second case, there is no context, so it goes to Int.max. Then,
 *after* the "wrong" context has been established, you try to index an
 array with numbers from the too-large range.

 Semantically speaking, they're pretty different operations. Why is it
 surprising that they have different results?

>>>
>>> I must say, I was originally rather fond of `0...` as a spelling, but
>>> IMO, Jaden and others have pointed out a real semantic issue.
>>>
>>> A range is, to put it simply, the "stuff" between two end points. A
>>> "range with no upper bound" _has to be_ one that continues forever. The
>>> upper bound _must_ be infinity.
>>>
>>>
>>> Depends… Swift doesn’t allow partial initializations, and neither the
>>> `.endIndex` nor the `.upperBound` properties of a `Range` are optional.
>>> From a strictly syntactic PoV, a "Range without an upperBound” can’t exist
>>> without getting into undefined behavior territory.
>>>
>>> Plus, mathematically speaking, an infinite range would be written "[x,
>>> ∞)", with an open upper bracket. If you write “[x, ∞]”, with a *closed*
>>> upper bracket, that’s kind of a meaningless statement. I would argue that
>>> if we’re going to represent that “infinite” range, the closest Swift
>>> spelling would be “x..<“. That leaves the mathematically undefined notation
>>> of “[x, ∞]”, spelled as "x…” in Swift, free to let us have “x…” or “…x”
>>> (which by similar reasoning can’t mean "(∞, x]”) return one of these:
>>>
>>> enum IncompleteRange<T> {
>>> case upperValue(T)
>>> case lowerValue(T)
>>> }
>>>
>>> which we could then pass to the subscript function of a collection to
>>> create the actual Range like this:
>>>
>>> extension Collection {
>>> subscript(_ ir: IncompleteRange<Index>) -> SubSequence {
>>> switch ir {
>>> case .lowerValue(let lower): return self[lower ..< self.endIndex]
>>> case .upperValue(let upper): return self[self.startIndex ..< upper]
>>> }
>>> }
>>> }
>>>
>>>
>> I understand that you can do this from a technical perspective. But I'm
>> arguing it's devoid of semantics.  That is, it's a spelling to dress up a
>> number.
>>
>>
>> It’s not any more devoid of semantics than a partially applied function.
>>
>
> Yes, but this here is not a partially applied type.
>
> Nor does it square with your proposal that you should be able to use `for
> i in 0...` to mean something different from `array[0...]`. We don't have
> partially applied functions doubling as function calls with default
> arguments.
>
>
> I’m not trying to say it’s *exactly* like a partially applied function.
>

I'm not saying you're arguing that point. I'm saying that there is a
semantic distinction between (1) a range with two bounds where you've only
specified the one, and (2) a range with one bound. There must be an answer
to the question: what is the nature of the upper bound of `0...`? Either it
exists but is not yet known, or it is known that it does not exist (or, it
is not yet known whether or not it exists). But these are not the same
thing!
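
To make the three possibilities concrete, here is a minimal sketch. The names `UpperBound` and `LowerBounded` are hypothetical and appear nowhere in the thread or the standard library; the only point is that "pending" and "unbounded" answer a simple containment question differently.

```swift
// Hypothetical model of the possible states of an upper bound.
enum UpperBound<Bound: Comparable> {
    case pending        // a bound exists but has not been supplied yet
    case unbounded      // it is known that no upper bound exists
    case known(Bound)   // an ordinary, fully specified bound
}

// Hypothetical one-sided range built on top of that enum.
struct LowerBounded<Bound: Comparable> {
    let lowerBound: Bound
    let upperBound: UpperBound<Bound>

    /// Returns nil when the question cannot be answered yet.
    func contains(_ value: Bound) -> Bool? {
        guard value >= lowerBound else { return false }
        switch upperBound {
        case .pending: return nil
        case .unbounded: return true
        case .known(let upper): return value <= upper
        }
    }
}

let waiting = LowerBounded<Int>(lowerBound: 0, upperBound: .pending)
let forever = LowerBounded<Int>(lowerBound: 0, upperBound: .unbounded)

let answerWhilePending = waiting.contains(1_000)   // nil: depends on a bound not yet known
let answerWhenUnbounded = forever.contains(1_000)  // true: nothing can exclude the value
```

Under these assumptions the two readings of `0...` are distinguishable by behaviour, not only by name.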

>> It is a number or index with added semantics that it provides a lower (or
>> upper) bound on the possible value specified by its type.
>>
>>
>> What is such an `IncompleteRange` other than a value of type T? It's
>> not an upper bound or lower bound of anything until it's used to index a
>> collection. Why have a new type (IncompleteRange), a new set of
>> operators (prefix and postfix range operators), and these muddied semantics
>> for something that can be written 

Re: [swift-evolution] Strings in Swift 4

2017-01-31 Thread Matthew Johnson via swift-evolution

> On Jan 31, 2017, at 6:15 PM, Xiaodi Wu wrote:
> 
> On Tue, Jan 31, 2017 at 6:09 PM, Matthew Johnson wrote:
> 
>> On Jan 31, 2017, at 5:35 PM, Xiaodi Wu via swift-evolution wrote:
>> 
>> On Tue, Jan 31, 2017 at 5:28 PM, David Sweeris wrote:
>> 
>>> On Jan 31, 2017, at 2:04 PM, Xiaodi Wu wrote:
>>> 
>>> On Tue, Jan 31, 2017 at 3:36 PM, David Sweeris via swift-evolution wrote:
>>> 
>>> On Jan 31, 2017, at 11:32, Jaden Geller via swift-evolution wrote:
>>> 
>>>> I think that is perfectly reasonable, but then it seems weird to be able
>>>> to iterate over it (with no upper bound) independently of a collection.
>>>> It would surprise me if
>>>> ```
>>>> for x in arr[arr.startIndex…] { print(x) }
>>>> ```
>>>> yielded different results than
>>>> ```
>>>> for i in arr.startIndex… { print(arr[i]) } // CRASH
>>>> ```
>>>> which it does under this model.
>>> 
>>> (I think this is how it works... semantically, anyway) Since the upper bound
>>> isn't specified, it's inferred from the context.
>>> 
>>> In the first case, the context is as an index into an array, so the upper 
>>> bound is inferred to be the last valid index.
>>> 
>>> In the second case, there is no context, so it goes to Int.max. Then, after 
>>> the "wrong" context has been established, you try to index an array with 
>>> numbers from the too-large range.
>>> 
>>> Semantically speaking, they're pretty different operations. Why is it 
>>> surprising that they have different results?
>>> 
>>> I must say, I was originally rather fond of `0...` as a spelling, but IMO, 
>>> Jaden and others have pointed out a real semantic issue.
>>> 
>>> A range is, to put it simply, the "stuff" between two end points. A "range 
>>> with no upper bound" _has to be_ one that continues forever. The upper 
>>> bound _must_ be infinity.
>> 
>> Depends… Swift doesn’t allow partial initializations, and neither the 
>> `.endIndex` nor the `.upperBound` properties of a `Range` are optional. From 
>> a strictly syntactic PoV, a "Range without an upperBound” can’t exist 
>> without getting into undefined behavior territory.
>> 
>> Plus, mathematically speaking, an infinite range would be written "[x, ∞)", 
>> with an open upper bracket. If you write “[x, ∞]”, with a closed upper 
>> bracket, that’s kind of a meaningless statement. I would argue that if we’re 
>> going to represent that “infinite” range, the closest Swift spelling would 
>> be “x..<“. That leaves the mathematically undefined notation of “[x, ∞]”, 
>> spelled as "x…” in Swift, free to let us have “x…” or “…x” (which by similar 
>> reasoning can’t mean "(∞, x]”) return one of these:
>> enum IncompleteRange<T> {
>>     case upperValue(T)
>>     case lowerValue(T)
>> }
>> which we could then pass to the subscript function of a collection to create
>> the actual Range like this:
>> extension Collection {
>>     subscript(_ ir: IncompleteRange<Index>) -> SubSequence {
>>         switch ir {
>>         case .lowerValue(let lower): return self[lower ..< self.endIndex]
>>         case .upperValue(let upper): return self[self.startIndex ..< upper]
>>         }
>>     }
>> }
>> 
>> I understand that you can do this from a technical perspective. But I'm 
>> arguing it's devoid of semantics.  That is, it's a spelling to dress up a 
>> number.
> 
> It’s not any more devoid of semantics than a partially applied function.
> 
> Yes, but this here is not a partially applied type.
> 
> Nor does it square with your proposal that you should be able to use `for i 
> in 0...` to mean something different from `array[0...]`. We don't have 
> partially applied functions doubling as function calls with default arguments.

I’m not trying to say it’s *exactly* like a partially applied function.

>  
> It is a number or index with added semantics that it provides a lower (or 
> upper) bound on the possible value specified by its type.
> 
>> 
>> What is such an `IncompleteRange` other than a value of type T? It's not 
>> an upper bound or lower bound of anything until it's used to index a 
>> collection. Why have a new type (IncompleteRange), a new set of operators 
>> (prefix and postfix range operators), and these muddied semantics for 
>> something that can be written `subscript(upTo upperBound: Index) -> 
>> SubSequence { ... }`? _That_ has unmistakable semantics and requires no new 
>> syntax.
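
For reference, the labeled-subscript alternative mentioned in the quoted paragraph could look roughly like the sketch below. Only the `upTo:` label comes from the quoted text; the `from:` counterpart is an assumption added for symmetry, and neither subscript is standard library API (the standard library spells the same operations as the methods `prefix(upTo:)` and `suffix(from:)`).

```swift
// Hypothetical labeled subscripts, written as a plain Collection extension.
extension Collection {
    /// Elements from `startIndex` up to, but not including, `upperBound`.
    subscript(upTo upperBound: Index) -> SubSequence {
        return self[startIndex ..< upperBound]
    }

    /// Elements from `lowerBound` through the end of the collection.
    /// (Assumed counterpart; not named in the thread.)
    subscript(from lowerBound: Index) -> SubSequence {
        return self[lowerBound ..< endIndex]
    }
}

let letters = ["a", "b", "c", "d"]
let head = letters[upTo: 2]   // slice holding "a", "b"
let tail = letters[from: 2]   // slice holding "c", "d"
```

No new type or operator is involved here; the argument label carries the meaning that the `IncompleteRange` wrapper would otherwise carry.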
> 
> Arguing that it adds too much complexity relative to the value it provides is 
> reasonable.  The value in this use case is mostly syntactic sugar so it’s 
> relatively easy to make the case that it doesn’t carry its weight here.
> 
> The value in Ben’s use case is 

Re: [swift-evolution] Strings in Swift 4

2017-01-31 Thread Xiaodi Wu via swift-evolution
On Tue, Jan 31, 2017 at 6:27 PM, Jaden Geller 
wrote:

>
> On Jan 31, 2017, at 4:20 PM, Xiaodi Wu  wrote:
>
> On Tue, Jan 31, 2017 at 6:15 PM, Jaden Geller 
> wrote:
>
>>
>> On Jan 31, 2017, at 4:09 PM, Matthew Johnson via swift-evolution <
>> swift-evolution@swift.org> wrote:
>>
>>
>> On Jan 31, 2017, at 5:35 PM, Xiaodi Wu via swift-evolution <
>> swift-evolution@swift.org> wrote:
>>
>> On Tue, Jan 31, 2017 at 5:28 PM, David Sweeris 
>> wrote:
>>
>>>
>>> On Jan 31, 2017, at 2:04 PM, Xiaodi Wu  wrote:
>>>
>>> On Tue, Jan 31, 2017 at 3:36 PM, David Sweeris via swift-evolution <
>>> swift-evolution@swift.org> wrote:
>>>

>>>> On Jan 31, 2017, at 11:32, Jaden Geller via swift-evolution <
>>>> swift-evolution@swift.org> wrote:
>>>>
>>>> I think that is perfectly reasonable, but then it seems weird to be
>>>> able to iterate over it (with no upper bound) independently of a
>>>> collection. It would surprise me if
>>>> ```
>>>> for x in arr[arr.startIndex…] { print(x) }
>>>> ```
>>>> yielded different results than
>>>> ```
>>>> for i in arr.startIndex… { print(arr[i]) } // CRASH
>>>> ```
>>>> which it does under this model.
>>>>
>>>>
>>>> (I *think* this is how it works... semantically, anyway) Since the upper
>>>> bound isn't specified, it's inferred from the context.
>>>>
>>>> In the first case, the context is as an index into an array, so the
>>>> upper bound is inferred to be the last valid index.
>>>>
>>>> In the second case, there is no context, so it goes to Int.max. Then,
>>>> *after* the "wrong" context has been established, you try to index an
>>>> array with numbers from the too-large range.
>>>>
>>>> Semantically speaking, they're pretty different operations. Why is it
>>>> surprising that they have different results?

>>>
>>> I must say, I was originally rather fond of `0...` as a spelling, but
>>> IMO, Jaden and others have pointed out a real semantic issue.
>>>
>>> A range is, to put it simply, the "stuff" between two end points. A
>>> "range with no upper bound" _has to be_ one that continues forever. The
>>> upper bound _must_ be infinity.
>>>
>>>
>>> Depends… Swift doesn’t allow partial initializations, and neither the
>>> `.endIndex` nor the `.upperBound` properties of a `Range` are optional.
>>> From a strictly syntactic PoV, a "Range without an upperBound” can’t exist
>>> without getting into undefined behavior territory.
>>>
>>> Plus, mathematically speaking, an infinite range would be written "[x,
>>> ∞)", with an open upper bracket. If you write “[x, ∞]”, with a *closed*
>>> upper bracket, that’s kind of a meaningless statement. I would argue that if
>>> we’re going to represent that “infinite” range, the closest Swift spelling
>>> would be “x..<“. That leaves the mathematically undefined notation of “[x,
>>> ∞]”, spelled as "x…” in Swift, free to let us have “x…” or “…x” (which by
>>> similar reasoning can’t mean "(∞, x]”) return one of these:
>>>
>>> enum IncompleteRange<T> {
>>>     case upperValue(T)
>>>     case lowerValue(T)
>>> }
>>>
>>> which we could then pass to the subscript function of a collection to
>>> create the actual Range like this:
>>>
>>> extension Collection {
>>>     subscript(_ ir: IncompleteRange<Index>) -> SubSequence {
>>>         switch ir {
>>>         case .lowerValue(let lower): return self[lower ..< self.endIndex]
>>>         case .upperValue(let upper): return self[self.startIndex ..< upper]
>>>         }
>>>     }
>>> }
>>>
>>>
>> I understand that you can do this from a technical perspective. But I'm
>> arguing it's devoid of semantics.  That is, it's a spelling to dress up a
>> number.
>>
>>
>> It’s not any more devoid of semantics than a partially applied function.
>> It is a number or index with added semantics that it provides a lower (or
>> upper) bound on the possible value specified by its type.
>>
>>
>> If we treat it as such, we shouldn’t allow users to iterate over it
>> directly:
>> ```
>> for x in 0… { // <- doesn’t make sense; only partially specified
>>   print(“hi”)
>> }
>> ```
>>
>> We __could__ introduce 2 types, `IncompleteRange` and `InfiniteRange`,
>> providing an overload that constructs each. It would never be ambiguous
>> because `InfiniteRange` would be the only `Sequence` and `IncompleteRange`
>> would be the only one of these two that is accepted as a collection’s
>> subscript.
>>
>> This *isn’t* that crazy either. There’s precedent for this too. The `..<`
>> operator used to create both ranges and intervals (though it seems those
>> types have started to merge).
>>
>> ¯\_(ツ)_/¯
>>
>
>
> Mercifully, those types have completely merged AFAIK. IMO, the long-term
> aim should be to have ... and ..< produce only one kind of range.
>
>
> There are still 2 variants (`Range` and `CountableRange`), but I imagine
> conditional conformances will combine those entirely.
>

Per Dave, that's the goal :)

(I hope conditional 

Re: [swift-evolution] Strings in Swift 4

2017-01-31 Thread Xiaodi Wu via swift-evolution
On Tue, Jan 31, 2017 at 5:58 PM, Matthew Johnson 
wrote:

>
> On Jan 31, 2017, at 5:20 PM, Xiaodi Wu  wrote:
>
> On Tue, Jan 31, 2017 at 5:04 PM, Matthew Johnson 
> wrote:
>
>>
>> I think it’s fair to say that we get to decide on the semantics of
>> postfix `…`.  “a range with no upper bound” is very reasonable, but
>> wouldn’t another reasonable semantics be “all the rest”, meaning that there
>> *is* an upper bound (the greatest possible value).
>>
>
> "All the rest" is by itself insufficient so far as semantics: all the rest
> _of what_? Supposing that our supplied lower bound is an integer, it must
> be all the rest of the integers. It cannot be all the rest of whatever,
> where whatever might be a collection that you try to subset with `0...`.
> (Recall that collections move indices, but indices know nothing about the
> collections.) It would be exceedingly fuzzy for postfix `...` to mean "all
> the rest of whatever I want it to mean"--that, almost tautologically, has
> no semantics at all.
>
>
>> Under the latter semantics, a `for i in 0…` loop would terminate after
>> reaching Int.max.  This is probably not what the user intended and would
>> still crash when used in David’s example, but it’s worth considering.
>>
>
> OK, I'm borderline fine with `0... == 0...Int.max`. It at least provides
> some semantics (i.e., we're saying `...` refers to all the rest of the
> values representable by the type used for the lower bound) [**]. But
> Jaden's point still stands, since it would only be consistent if `for i in
> arr[0...]` then traps after `arr.count` just like `for i in
> arr[0...Int.max]` would do. Otherwise, we really are fudging the semantics.
>
>
> If we really want to be honest about the information a value produced
> using postfix `…` carries, it is a partial range with only the lower bound
> specified.  This allows us to assign meaning to that partial range using
> additional context:
>
> * When it is possible to increment Bound directly, it could be interpreted
> as a (near?) infinite sequence that either terminates or traps when it
> reaches an unrepresentable value.
> * When Bound is an index and the partial range is used as a subscript
> argument it can be interpreted to mean “to the end of the collection”.
>

These are two very different semantics. One says, `i...` is a range with no
upper bound; the other says, `i...` is a lower bound of something. The
objection is twofold. In the first place, those shouldn't be spelled the
same way. In the second place, the "lower bound of something" is already
adequately accommodated by using, well, the actual lower bound. Well, also,
the "range with no upper bound" isn't very useful in practice, especially
if the only compelling use case is one for replacing an existing API that,
well, there is no consensus yet to replace.
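
Both readings can already be spelled with existing standard library API, which may help show how different they are. The sketch below is only an illustration: `sequence(first:next:)` stands in for "a range with no upper bound", and `suffix(from:)` stands in for "using the actual lower bound". The array contents and the index `i` are made up for the example.

```swift
let arr = [10, 20, 30, 40]
let i = 1

// Reading 1: "a range with no upper bound" is an open-ended sequence of
// values that knows nothing about any collection. `prefix` keeps it finite.
let unbounded = sequence(first: i) { $0 + 1 }
let firstFew = Array(unbounded.prefix(3))   // [1, 2, 3]

// Reading 2: "a lower bound of something" only becomes meaningful once a
// collection supplies the other end.
let rest = arr.suffix(from: i)              // slice holding 20, 30, 40
```

The first value is a sequence of integers; the second is a slice of the array. Nothing about `i` itself changes between the two uses.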

> This still leaves us with an out of bounds crash in David’s example that
> iterates over a partial range.  This is an artifact of `Array` using Int as
> its `Index` rather than an opaque type that does not allow users to
> increment it directly rather than using a collection.
>
> Is the problem in David’s example really that different than the ability
> to directly index an array with any `Int` we want?  It’s not the kind of
> thing that developers would do frequently.  The first time they try it they
> will get a crash and will learn not to do it again.
>
> I’m not necessarily arguing one way or the other.  I’m simply pointing out
> that “partial range” is a perfectly reasonable semantics to consider.
>
>
> [**] It is not perfectly consistent semantically because, as was discussed
> in threads about our numeric protocols, our integer types are supposed to
> model all integers, not just the ones that happen to be representable. Our
> model is imperfect because not all integers fit into finite memory, but
> that's a modeling artifact and not intentional semantics. IIUC, it would be
> otherwise difficult to give a good accounting of, say, the semantics of
> addition if arithmetic overflow were an intentional part of the semantics
> and not an artifact.
>
>
> I haven’t followed all of the details of the numeric protocol
> discussions.  With this in mind, I agree with your proposed semantics of
> trapping after `Int.max` as it sounds more consistent with this intent.
>
>
>> I’m not sure if you read Ben’s post regarding `enumerated` or not, but he
>> gave the example of `zip(0…, sequence)` as a more general replacement for
>> `enumerated`.  IMO, he makes a pretty strong case for this.
>>
>>
>>
>> - Dave Sweeris
>>>
>>> ___
>>> swift-evolution mailing list
>>> swift-evolution@swift.org
>>> https://lists.swift.org/mailman/listinfo/swift-evolution
>>>
>>>
>>
>>
>>
>
>

Re: [swift-evolution] Strings in Swift 4

2017-01-31 Thread Xiaodi Wu via swift-evolution
On Tue, Jan 31, 2017 at 6:15 PM, Jaden Geller 
wrote:

>
> On Jan 31, 2017, at 4:09 PM, Matthew Johnson via swift-evolution <
> swift-evolution@swift.org> wrote:
>
>
> On Jan 31, 2017, at 5:35 PM, Xiaodi Wu via swift-evolution <
> swift-evolution@swift.org> wrote:
>
> On Tue, Jan 31, 2017 at 5:28 PM, David Sweeris 
> wrote:
>
>>
>> On Jan 31, 2017, at 2:04 PM, Xiaodi Wu  wrote:
>>
>> On Tue, Jan 31, 2017 at 3:36 PM, David Sweeris via swift-evolution <
>> swift-evolution@swift.org> wrote:
>>
>>>
>>> On Jan 31, 2017, at 11:32, Jaden Geller via swift-evolution <
>>> swift-evolution@swift.org> wrote:
>>>
>>> I think that is perfectly reasonable, but then it seems weird to be able
>>> to iterate over it (with no upper bound) independently of a collection. It
>>> would surprise me if
>>> ```
>>> for x in arr[arr.startIndex…] { print(x) }
>>> ```
>>> yielded different results than
>>> ```
>>> for i in arr.startIndex… { print(arr[i]) } // CRASH
>>> ```
>>> which it does under this model.
>>>
>>>
>>> (I *think* this is how it works... semantically, anyway) Since the upper
>>> bound isn't specified, it's inferred from the context.
>>>
>>> In the first case, the context is as an index into an array, so the
>>> upper bound is inferred to be the last valid index.
>>>
>>> In the second case, there is no context, so it goes to Int.max. Then,
>>> *after* the "wrong" context has been established, you try to index an
>>> array with numbers from the too-large range.
>>>
>>> Semantically speaking, they're pretty different operations. Why is it
>>> surprising that they have different results?
>>>
>>
>> I must say, I was originally rather fond of `0...` as a spelling, but
>> IMO, Jaden and others have pointed out a real semantic issue.
>>
>> A range is, to put it simply, the "stuff" between two end points. A
>> "range with no upper bound" _has to be_ one that continues forever. The
>> upper bound _must_ be infinity.
>>
>>
>> Depends… Swift doesn’t allow partial initializations, and neither the
>> `.endIndex` nor the `.upperBound` properties of a `Range` are optional.
>> From a strictly syntactic PoV, a "Range without an upperBound” can’t exist
>> without getting into undefined behavior territory.
>>
>> Plus, mathematically speaking, an infinite range would be written "[x,
>> ∞)", with an open upper bracket. If you write “[x, ∞]”, with a *closed*
>> upper bracket, that’s kind of a meaningless statement. I would argue that
>> if we’re going to represent that “infinite” range, the closest Swift
>> spelling would be “x..<“. That leaves the mathematically undefined notation
>> of “[x, ∞]”, spelled as "x…” in Swift, free to let us have “x…” or “…x”
>> (which by similar reasoning can’t mean "(∞, x]”) return one of these:
>>
>> enum IncompleteRange<T> {
>>     case upperValue(T)
>>     case lowerValue(T)
>> }
>>
>> which we could then pass to the subscript function of a collection to
>> create the actual Range like this:
>>
>> extension Collection {
>>     subscript(_ ir: IncompleteRange<Index>) -> SubSequence {
>>         switch ir {
>>         case .lowerValue(let lower): return self[lower ..< self.endIndex]
>>         case .upperValue(let upper): return self[self.startIndex ..< upper]
>>         }
>>     }
>> }
>>
>>
> I understand that you can do this from a technical perspective. But I'm
> arguing it's devoid of semantics.  That is, it's a spelling to dress up a
> number.
>
>
> It’s not any more devoid of semantics than a partially applied function.
> It is a number or index with added semantics that it provides a lower (or
> upper) bound on the possible value specified by its type.
>
>
> If we treat it as such, we shouldn’t allow users to iterate over it
> directly:
> ```
> for x in 0… { // <- doesn’t make sense; only partially specified
>   print(“hi”)
> }
> ```
>
> We __could__ introduce 2 types, `IncompleteRange` and `InfiniteRange`,
> providing an overload that constructs each. It would never be ambiguous
> because `InfiniteRange` would be the only `Sequence` and `IncompleteRange`
> would be the only one of these two that is accepted as a collection’s
> subscript.
>
> This *isn’t* that crazy either. There’s precedent for this too. The `..<`
> operator used to create both ranges and intervals (though it seems those
> types have started to merge).
>
> ¯\_(ツ)_/¯
>


Mercifully, those types have completely merged AFAIK. IMO, the long-term
aim should be to have ... and ..< produce only one kind of range.

> What is such an `IncompleteRange` other than a value of type T? It's not
> an upper bound or lower bound of anything until it's used to index a
> collection. Why have a new type (IncompleteRange), a new set of
> operators (prefix and postfix range operators), and these muddied semantics
> for something that can be written `subscript(upTo upperBound: Index) ->
> SubSequence { ... }`? _That_ has unmistakable semantics and requires no new
> syntax.
>
>
> Arguing that it adds 

Re: [swift-evolution] Strings in Swift 4

2017-01-31 Thread Jaden Geller via swift-evolution

> On Jan 31, 2017, at 4:20 PM, Xiaodi Wu wrote:
> 
> On Tue, Jan 31, 2017 at 6:15 PM, Jaden Geller wrote:
> 
>> On Jan 31, 2017, at 4:09 PM, Matthew Johnson via swift-evolution wrote:
>> 
>> 
>>> On Jan 31, 2017, at 5:35 PM, Xiaodi Wu via swift-evolution wrote:
>>> 
>>> On Tue, Jan 31, 2017 at 5:28 PM, David Sweeris wrote:
>>> 
>>>> On Jan 31, 2017, at 2:04 PM, Xiaodi Wu wrote:
>>>> 
>>>> On Tue, Jan 31, 2017 at 3:36 PM, David Sweeris via swift-evolution wrote:
>>>> 
>>>> On Jan 31, 2017, at 11:32, Jaden Geller via swift-evolution wrote:
>>>> 
>>>>> I think that is perfectly reasonable, but then it seems weird to be able
>>>>> to iterate over it (with no upper bound) independently of a collection.
>>>>> It would surprise me if
>>>>> ```
>>>>> for x in arr[arr.startIndex…] { print(x) }
>>>>> ```
>>>>> yielded different results than
>>>>> ```
>>>>> for i in arr.startIndex… { print(arr[i]) } // CRASH
>>>>> ```
>>>>> which it does under this model.
>>>> 
>>>> (I think this is how it works... semantically, anyway) Since the upper bound
>>>> isn't specified, it's inferred from the context.
>>>> 
>>>> In the first case, the context is as an index into an array, so the upper
>>>> bound is inferred to be the last valid index.
>>>> 
>>>> In the second case, there is no context, so it goes to Int.max. Then,
>>>> after the "wrong" context has been established, you try to index an array
>>>> with numbers from the too-large range.
>>>> 
>>>> Semantically speaking, they're pretty different operations. Why is it
>>>> surprising that they have different results?
>>>> 
>>>> I must say, I was originally rather fond of `0...` as a spelling, but IMO,
>>>> Jaden and others have pointed out a real semantic issue.
>>>> 
>>>> A range is, to put it simply, the "stuff" between two end points. A "range
>>>> with no upper bound" _has to be_ one that continues forever. The upper
>>>> bound _must_ be infinity.
>>> 
>>> Depends… Swift doesn’t allow partial initializations, and neither the 
>>> `.endIndex` nor the `.upperBound` properties of a `Range` are optional. 
>>> From a strictly syntactic PoV, a "Range without an upperBound” can’t exist 
>>> without getting into undefined behavior territory.
>>> 
>>> Plus, mathematically speaking, an infinite range would be written "[x, ∞)", 
>>> with an open upper bracket. If you write “[x, ∞]”, with a closed upper 
>>> bracket, that’s kind of a meaningless statement. I would argue that if 
>>> we’re going to represent that “infinite” range, the closest Swift spelling 
>>> would be “x..<“. That leaves the mathematically undefined notation of “[x, 
>>> ∞]”, spelled as "x…” in Swift, free to let us have “x…” or “…x” (which by 
>>> similar reasoning can’t mean "(∞, x]”) return one of these:
>>> enum IncompleteRange<T> {
>>>     case upperValue(T)
>>>     case lowerValue(T)
>>> }
>>> which we could then pass to the subscript function of a collection to
>>> create the actual Range like this:
>>> extension Collection {
>>>     subscript(_ ir: IncompleteRange<Index>) -> SubSequence {
>>>         switch ir {
>>>         case .lowerValue(let lower): return self[lower ..< self.endIndex]
>>>         case .upperValue(let upper): return self[self.startIndex ..< upper]
>>>         }
>>>     }
>>> }
>>> 
>>> I understand that you can do this from a technical perspective. But I'm 
>>> arguing it's devoid of semantics.  That is, it's a spelling to dress up a 
>>> number.
>> 
>> It’s not any more devoid of semantics than a partially applied function.  It 
>> is a number or index with added semantics that it provides a lower (or 
>> upper) bound on the possible value specified by its type.
> 
> If we treat it as such, we shouldn’t allow users to iterate over it directly:
> ```
> for x in 0… { // <- doesn’t make sense; only partially specified
>   print(“hi”)
> }
> ```
> 
> We __could__ introduce 2 types, `IncompleteRange` and `InfiniteRange`,
> providing an overload that constructs each. It would never be ambiguous
> because `InfiniteRange` would be the only `Sequence` and `IncompleteRange`
> would be the only one of these two that is accepted as a collection’s
> subscript.
> 
> This *isn’t* that crazy either. There’s precedent for this too. The `..<`
> operator used to create both ranges and intervals (though it seems those types
> have started to merge).
> 
> ¯\_(ツ)_/¯
> 
> 
> Mercifully, those types have completely merged AFAIK. IMO, the long-term aim 
> should be to have ... and ..< produce only one kind of range.

There are still 2 variants 

Re: [swift-evolution] Strings in Swift 4

2017-01-31 Thread Jaden Geller via swift-evolution

> On Jan 31, 2017, at 4:15 PM, Xiaodi Wu via swift-evolution wrote:
> 
> We don't have partially applied functions doubling as function calls with 
> default arguments.

I think this best summarizes my problem with simultaneously treating `0…` as a 
partially specified range (waiting for another “argument” from the collection) 
and as an infinite range (defaulting that argument to Int.max).
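
The analogy can be written out with ordinary functions. This is a sketch under assumptions: `pendingRange(from:)` and `defaultedRange(from:through:)` are hypothetical helpers invented for illustration, and `arr` is sample data.

```swift
// Reading 1: a partially specified range. The result is a function that
// still needs its upper bound before any range exists.
func pendingRange(from lower: Int) -> (Int) -> ClosedRange<Int> {
    return { upper in lower...upper }
}

// Reading 2: an "infinite" range. The missing bound is silently defaulted.
func defaultedRange(from lower: Int, through upper: Int = Int.max) -> ClosedRange<Int> {
    return lower...upper
}

let arr = [10, 20, 30]

let awaitingBound = pendingRange(from: 1)         // a function, not yet a range
let completed = awaitingBound(arr.endIndex - 1)   // 1...2, bound supplied by the collection
let defaulted = defaultedRange(from: 1)           // 1...Int.max, bound filled in by default
```

The two helpers return different kinds of values (a function versus a finished range), which is the sense in which a single spelling would be doing double duty.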


Re: [swift-evolution] Strings in Swift 4

2017-01-31 Thread Jaden Geller via swift-evolution

> On Jan 31, 2017, at 4:09 PM, Matthew Johnson via swift-evolution wrote:
> 
> 
>> On Jan 31, 2017, at 5:35 PM, Xiaodi Wu via swift-evolution wrote:
>> 
>> On Tue, Jan 31, 2017 at 5:28 PM, David Sweeris wrote:
>> 
>>> On Jan 31, 2017, at 2:04 PM, Xiaodi Wu wrote:
>>> 
>>> On Tue, Jan 31, 2017 at 3:36 PM, David Sweeris via swift-evolution wrote:
>>> 
>>> On Jan 31, 2017, at 11:32, Jaden Geller via swift-evolution wrote:
>>> 
>>>> I think that is perfectly reasonable, but then it seems weird to be able
>>>> to iterate over it (with no upper bound) independently of a collection.
>>>> It would surprise me if
>>>> ```
>>>> for x in arr[arr.startIndex…] { print(x) }
>>>> ```
>>>> yielded different results than
>>>> ```
>>>> for i in arr.startIndex… { print(arr[i]) } // CRASH
>>>> ```
>>>> which it does under this model.
>>> 
>>> (I think this is how it works... semantically, anyway) Since the upper bound
>>> isn't specified, it's inferred from the context.
>>> 
>>> In the first case, the context is as an index into an array, so the upper 
>>> bound is inferred to be the last valid index.
>>> 
>>> In the second case, there is no context, so it goes to Int.max. Then, after 
>>> the "wrong" context has been established, you try to index an array with 
>>> numbers from the too-large range.
>>> 
>>> Semantically speaking, they're pretty different operations. Why is it 
>>> surprising that they have different results?
>>> 
>>> I must say, I was originally rather fond of `0...` as a spelling, but IMO, 
>>> Jaden and others have pointed out a real semantic issue.
>>> 
>>> A range is, to put it simply, the "stuff" between two end points. A "range 
>>> with no upper bound" _has to be_ one that continues forever. The upper 
>>> bound _must_ be infinity.
>> 
>> Depends… Swift doesn’t allow partial initializations, and neither the 
>> `.endIndex` nor the `.upperBound` properties of a `Range` are optional. From 
>> a strictly syntactic PoV, a "Range without an upperBound” can’t exist 
>> without getting into undefined behavior territory.
>> 
>> Plus, mathematically speaking, an infinite range would be written "[x, ∞)", 
>> with an open upper bracket. If you write “[x, ∞]”, with a closed upper 
>> bracket, that’s kind of a meaningless statement. I would argue that if we’re 
>> going to represent that “infinite” range, the closest Swift spelling would 
>> be “x..<“. That leaves the mathematically undefined notation of “[x, ∞]”, 
>> spelled as "x…” in Swift, free to let us have “x…” or “…x” (which by similar 
>> reasoning can’t mean "(∞, x]”) return one of these:
>> enum IncompleteRange<T> {
>>     case upperValue(T)
>>     case lowerValue(T)
>> }
>> which we could then pass to the subscript function of a collection to create
>> the actual Range like this:
>> extension Collection {
>>     subscript(_ ir: IncompleteRange<Index>) -> SubSequence {
>>         switch ir {
>>         case .lowerValue(let lower): return self[lower ..< self.endIndex]
>>         case .upperValue(let upper): return self[self.startIndex ..< upper]
>>         }
>>     }
>> }
>> 
>> I understand that you can do this from a technical perspective. But I'm 
>> arguing it's devoid of semantics.  That is, it's a spelling to dress up a 
>> number.
> 
> It’s not any more devoid of semantics than a partially applied function.  It 
> is a number or index with added semantics that it provides a lower (or upper) 
> bound on the possible value specified by its type.

If we treat it as such, we shouldn’t allow users to iterate over it directly:
```
for x in 0… { // <- doesn’t make sense; only partially specified
  print(“hi”)
}
```

We __could__ introduce 2 types, `IncompleteRange` and `InfiniteRange`,
providing an overload that constructs each. It would never be ambiguous because
`InfiniteRange` would be the only `Sequence` and `IncompleteRange` would be
the only one of these two that is accepted as a collection’s subscript.
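
A rough sketch of that two-type split follows. Everything in it is an assumption made for illustration: the struct layouts, the `init(from:)` spelling, and the use of plain initializers instead of an overloaded postfix operator. `IncompleteRange` is simplified here to a lower-bound-only struct rather than the enum quoted earlier.

```swift
// Hypothetical: a lower bound waiting for a collection. Not a Sequence.
struct IncompleteRange<Bound: Comparable> {
    let lowerBound: Bound
}

// Hypothetical: an open-ended run of values. A Sequence, never a subscript argument.
struct InfiniteRange<Bound: Strideable>: Sequence, IteratorProtocol
    where Bound.Stride: SignedInteger {
    var current: Bound

    init(from lowerBound: Bound) { current = lowerBound }

    mutating func next() -> Bound? {
        // Hand back the current value, then step forward by one.
        // For fixed-width integers this traps once the type's maximum is passed.
        defer { current = current.advanced(by: 1) }
        return current
    }
}

extension Collection {
    // Only the incomplete form is accepted here; the collection supplies the end.
    subscript(range: IncompleteRange<Index>) -> SubSequence {
        return self[range.lowerBound ..< endIndex]
    }
}

let arr = [10, 20, 30]
let fromOne = IncompleteRange(lowerBound: 1)
let slice = arr[fromOne]                                   // slice holding 20, 30
let firstThree = Array(InfiniteRange(from: 0).prefix(3))   // [0, 1, 2]
// `for i in fromOne { }` would not compile: IncompleteRange is not a Sequence.
```

With this split, iterating and subscripting can never pick the wrong meaning, at the cost of two types where the proposal's spelling has one.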

This *isn’t* that crazy either. There’s precedent for this too. The `..<`
operator used to create both ranges and intervals (though it seems those types
have started to merge).

¯\_(ツ)_/¯

> 
>> 
>> What is such an `IncompleteRange` other than a value of type T? It's not 
>> an upper bound or lower bound of anything until it's used to index a 
>> collection. Why have a new type (IncompleteRange), a new set of operators 
>> (prefix and postfix range operators), and these muddied semantics for 
>> something that can be written `subscript(upTo upperBound: Index) -> 
>> SubSequence { ... }`? _That_ has unmistakable semantics and requires no new 
>> syntax.
> 
> Arguing that it adds too much complexity relative to 
