Re: [swift-users] Unexpected results when using String.CharacterView.Index

2017-03-09 Thread Zhao Xin via swift-users
Thanks a lot, Ole. I understand now.

Zhaoxin

___
swift-users mailing list
swift-users@swift.org
https://lists.swift.org/mailman/listinfo/swift-users


Re: [swift-users] Unexpected results when using String.CharacterView.Index

2017-03-09 Thread Ole Begemann via swift-users

On 09/03/2017 08:27, Zhao Xin via swift-users wrote:

> When using the subscript of `String.CharacterView`, I got an unexpected error.
>
> fatal error: Can't form a Character from an empty String
>
> func test() {
>     let s = "Original Script:"
>     let cs = s.characters
>     //let startIndex = cs.startIndex
>     let nextIndex = "Original ?".characters.endIndex
>     let nextCharacter = cs[nextIndex] // above error
> }
>
> test()


First of all, it's not guaranteed that an index derived from one string 
can be used to subscript another string. Don't rely on that.


endIndex is also different, and this is why you're seeing a crash here. 
Let's inspect nextIndex with dump(nextIndex):


▿ Swift.String.CharacterView.Index
  ▿ _base: Swift.String.UnicodeScalarView.Index
    - _position: 10
  - _countUTF16: 0

You see that _countUTF16 is 0, i.e. internally, String.CharacterView 
assigns its endIndex a length of 0 (in terms of UTF-16 code units). This 
is why it traps when you use the index for subscripting. The endIndex is 
not a valid index for subscripting, not for the string it was derived 
from and not for any other string.
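
A minimal sketch of that rule follows. It assumes Swift 4 or later, where String is itself a Collection of Character, rather than the Swift 3 `.characters` view used in the report. It only illustrates that endIndex is the "past the end" position and should be checked before subscripting.

```swift
// Sketch only; assumes Swift 4 or later (String is a Collection of Character).
let s = "Original Script:"

// endIndex marks the position after the last character. It is valid in
// comparisons and half-open ranges, but never as a subscript argument.
let end = s.endIndex

// Guard before subscripting: this branch is not taken for endIndex.
if end < s.endIndex {
    print(s[end])
}

// The last character sits one position before endIndex (string must be non-empty).
if !s.isEmpty {
    let lastIndex = s.index(before: s.endIndex)
    print(s[lastIndex])   // prints ":"
}

// Half-open ranges are where endIndex belongs.
print(String(s[s.startIndex..<s.endIndex]) == s)   // prints "true"
```

The `i < s.endIndex` check is cheap and works for any index that was derived from `s` itself.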



> However, if I choose another way to get the nextIndex, it works.
>
> func test() {
>     let s = "Original Script:"
>     let cs = s.characters
>     let startIndex = cs.startIndex
>     //let nextIndex = "Original ?".characters.endIndex
>     let nextIndex01 = cs.index(startIndex, offsetBy: "Original ?".characters.count)
>     let nextCharacter = cs[nextIndex01]
> }
>
> test()


Here, dump(nextIndex01) prints this:

▿ Swift.String.CharacterView.Index
  ▿ _base: Swift.String.UnicodeScalarView.Index
    - _position: 10
  - _countUTF16: 1

Notice that _countUTF16 is 1, so it looks like a valid index from the 
perspective of cs. But again, don't rely on this! The results of 
subscripting a collection with an index derived from another collection 
are undefined unless the collection explicitly documents otherwise.
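
One way to avoid reusing a foreign index is to reduce it to a plain character offset and rebuild an index on the target string. The sketch below assumes Swift 4 or later; `translatedIndex` is a hypothetical helper written for this illustration, not a standard library API.

```swift
// Sketch only; assumes Swift 4 or later. `translatedIndex` is a hypothetical
// helper, not part of the standard library.
func translatedIndex(of index: String.Index, from source: String, to target: String) -> String.Index? {
    // Reduce the foreign index to a character offset within its own string.
    let offset = source.distance(from: source.startIndex, to: index)
    // Rebuild an index that belongs to `target`; nil if the offset runs past its end.
    guard let i = target.index(target.startIndex, offsetBy: offset, limitedBy: target.endIndex),
          i < target.endIndex   // endIndex itself cannot be subscripted
    else { return nil }
    return i
}

let source = "Original ?"
let target = "Original Script:"

if let i = translatedIndex(of: source.endIndex, from: source, to: target) {
    print(target[i])   // prints "c", the character at offset 10 in target
}
```

The resulting index is always derived from `target`, so subscripting with it stays within documented behaviour regardless of how String.Index is implemented.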



> Furthermore, I compared the two `nextIndex` values. They were equal.
>
> func test() {
>     let s = "Original Script:"
>     let cs = s.characters
>     let startIndex = cs.startIndex
>     let nextIndex = "Original ?".characters.endIndex
>     let nextIndex01 = cs.index(startIndex, offsetBy: "Original ?".characters.count)
>     let nextCharacter = cs[nextIndex01]
>     print(nextIndex01 == nextIndex) // true
> }
>
> test()


It looks like String.Index only takes the position into account to 
determine equality, not its _countUTF16. This makes sense for the way 
endIndex and index(_:offsetBy:) are implemented. After all, nextIndex 
and nextIndex01 _should be equal_. It would certainly be possible to 
implement it differently (where endIndex and index(_:offsetBy:) returned 
identical indices, including _countUTF16), and I don't know why the 
stdlib team chose to do it this way (maybe performance?).
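
The observed behaviour can be modelled with a small stand-in type. This is a hypothetical model for illustration only, not the actual standard library source: `ModelIndex` and its field names are made up to mirror the dump output above.

```swift
// Hypothetical model of the observed behaviour, not the real stdlib implementation.
struct ModelIndex: Comparable {
    let position: Int     // offset into the underlying storage
    let countUTF16: Int   // cached length of the character at this position; 0 for endIndex

    // Equality and ordering look at the position only and ignore the cached length.
    static func == (lhs: ModelIndex, rhs: ModelIndex) -> Bool {
        return lhs.position == rhs.position
    }
    static func < (lhs: ModelIndex, rhs: ModelIndex) -> Bool {
        return lhs.position < rhs.position
    }
}

let fromEndIndex = ModelIndex(position: 10, countUTF16: 0)   // like nextIndex
let fromOffset = ModelIndex(position: 10, countUTF16: 1)     // like nextIndex01

print(fromEndIndex == fromOffset)   // prints "true"
```

Under this model, two indices can compare equal while only one of them carries enough cached information to be subscripted safely, which matches what the two dumps show.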


In any case, much of this implementation may change with the work going 
into strings for Swift 4.





[swift-users] Unexpected results when using String.CharacterView.Index

2017-03-08 Thread Zhao Xin via swift-users
When using the subscript of `String.CharacterView`, I got an unexpected error.

fatal error: Can't form a Character from an empty String

func test() {
    let s = "Original Script:"
    let cs = s.characters
    //let startIndex = cs.startIndex
    let nextIndex = "Original ?".characters.endIndex
    let nextCharacter = cs[nextIndex] // above error
}

test()

However, if I choose another way to get the nextIndex, it works.

func test() {
    let s = "Original Script:"
    let cs = s.characters
    let startIndex = cs.startIndex
    //let nextIndex = "Original ?".characters.endIndex
    let nextIndex01 = cs.index(startIndex, offsetBy: "Original ?".characters.count)
    let nextCharacter = cs[nextIndex01]
}

test()

Furthermore, I compared the two `nextIndex` values. They were equal.

func test() {
    let s = "Original Script:"
    let cs = s.characters
    let startIndex = cs.startIndex
    let nextIndex = "Original ?".characters.endIndex
    let nextIndex01 = cs.index(startIndex, offsetBy: "Original ?".characters.count)
    let nextCharacter = cs[nextIndex01]

    print(nextIndex01 == nextIndex) // true
}

test()


So I wonder, is there a bug here?


Xcode 8.2.1 (8C1002), Swift 3.0.2


Zhaoxin