For example, in "caféz" the offset index of z is always 4, meaning z is the element at position 4 of the string, counting from 0. You can always use s[s.index(s.startIndex, offsetBy: 4)] to access the z. But the encodedOffset of z may be 4 or 5, depending on whether é is stored as one UTF-16 code unit (U+00E9) or as e followed by a combining accent (U+0301). This is not the offset concept of the collection, but the encoded offset concept of UTF-16.
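Here is a minimal sketch of the difference. It assumes a recent Swift toolchain where firstIndex(of:) is available, and it measures the UTF-16 code-unit offset through the utf16 view as a stand-in for encodedOffset, since what encodedOffset itself reports depends on the Swift version. The helper names characterOffset and codeUnitOffset are hypothetical, made up for this illustration.

```swift
// Two spellings of the same visible text "caféz".
let precomposed = "caf\u{00E9}z"   // é as the single scalar U+00E9
let decomposed = "cafe\u{0301}z"   // e followed by the combining accent U+0301

// Hypothetical helper: offset of `target` counted in Characters (elements).
func characterOffset(of target: Character, in s: String) -> Int? {
    guard let i = s.firstIndex(of: target) else { return nil }
    return s.distance(from: s.startIndex, to: i)
}

// Hypothetical helper: offset of `target` counted in UTF-16 code units.
// Assumption: this stands in for encodedOffset on UTF-16 storage.
func codeUnitOffset(of target: Character, in s: String) -> Int? {
    guard let i = s.firstIndex(of: target) else { return nil }
    return s.utf16.distance(from: s.utf16.startIndex, to: i)
}

for s in [precomposed, decomposed] {
    let element = characterOffset(of: "z", in: s)!
    let encoded = codeUnitOffset(of: "z", in: s)!
    print("element offset: \(element), UTF-16 offset: \(encoded)")
}
```

The element offset is the same for both spellings, while the UTF-16 offset changes with the encoding of é.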
> On Dec 15, 2017, at 9:25 AM, Cao, Jiannan via swift-dev
> <swift-dev@swift.org> wrote:
>
> This offset is the Unicode offset, not the offset of the element.
> For example: index(startIndex, offsetBy:1) is encodedOffset 4 or 8, not 1.
>
> Offset indexing is based on the offset counted in elements/indices. It gives
> the same result as s.index(s.startIndex, offsetBy:i).
> The encodedOffset is the underlying offset of the Unicode string, not the
> same concept as the offset index of a collection.
>
> The offset index is meaningful for the elements and indices of a collection
> (the i-th element of the collection), and is not related to the Unicode
> offset (which is the underlying data offset, meaningful for the UTF-16
> String).
>
> These two offsets are totally different.
>
> Best,
> Jiannan
>
>> On Dec 15, 2017, at 9:17 AM, Michael Ilseman <milse...@apple.com> wrote:
>>
>>> On Dec 14, 2017, at 4:49 PM, Cao, Jiannan via swift-dev
>>> <swift-dev@swift.org> wrote:
>>>
>>> People are used to the offset index system rather than String.Index. Using
>>> offset indices to name and count the elements is normal and natural.
>>>
>>
>> The offset system that you’re referring to is totally available in String
>> today, if you’re willing for it to be the offset into the encoding. That’s
>> the offset “people” you’re referring to are likely used to and consider
>> normal and natural. On String.Index, there is the following:
>>
>>     init(encodedOffset offset: Int)
>>
>> and
>>
>>     var encodedOffset: Int { get }
>>
>> [1] https://developer.apple.com/documentation/swift/string.index
>>
>>> This offset index system has a long history and a real meaning to the
>>> collection. The subscript s[i] has a fixed meaning of "getting the i-th
>>> element in this collection", which is normal and direct. Getting a range
>>> with offset indices is also direct: it means the substring runs from the
>>> i-th character up to the j-th character of the original string.
>>>
>>> People are used to working with subscripts and ranges through offset
>>> indices. Using string[string.index(i, offsetBy: 5)] is not as direct or
>>> easy as string[i + 5]. Likewise, Range<String.Index> is not as direct as
>>> Range<Offset>. Developers need to convert the Range<String.Index> result
>>> of string.range(of:) to Range<OffsetIndex> to know the exact range of the
>>> substring. Range<String.Index> has a real meaning to the machine and to
>>> the underlying data location of the substring, but Range<OffsetIndex> also
>>> carries direct location information for human beings, and represents the
>>> abstract location concept of the collection (This is the most UNIMPEACHABLE
>>> REASON I could provide).
>>>
>>> The offset index system is based on the nature of collections. Each
>>> element of a collection can be located by offset, which is a direct and
>>> simple concept for any collection. Right? Even String with String.Index
>>> has some offset index properties within it. For example: the `count` of
>>> the String is the offset index of the endIndex. enumerated() generates a
>>> sequence whose elements contain the same offsets as the offset index
>>> system provides. And when we apply Array(string), the string is divided
>>> into characters and offset indices become available for the new array.
>>>
>>> The offset index system is just an assistant for collections, not a
>>> replacement for String.Index. We use String.Index to represent the normal
>>> underlying storage of the String. We could also use offset indices to
>>> represent the nature of the Collection and its elements. Providing the
>>> offset index as a second choice for accessing elements in collections is
>>> not only for the String struct but for all collections, since it is the
>>> nature of the collection concept, and developers could choose whether to
>>> use it.
>>>
>>> We don't make String.Index O(1), but translate the offset indices to the
>>> underlying String.Index. Each time a subscript is used with an offset
>>> index, we just need to translate offset indices to underlying indices
>>> using c.index(startIndex, offsetBy:i), c.distance(from: startIndex, to:i)
>>>
>>> We can make the offset indices available through an extension to
>>> Collection (as in my GitHub repo demo:
>>> https://github.com/frogcjn/OffsetIndexableCollection-String-Int-Indexable-).
>>>
>>> Or we could do it at compile time, for example:
>>>
>>>     c[1...]
>>> compiles to
>>>     c[c.index(c.startIndex, offsetBy: 1)...]
>>>
>>>     let index: Int = s.index(of: "a")
>>> compiles to
>>>     let index: Int = s.distance(from: s.startIndex, to: s.index(of: "a"))
>>>
>>>     let index = 1 // if used in s only
>>>     s[index..<index+2]
>>> compiles to
>>>     let index = s.index(s.startIndex, offsetBy: 1)
>>>     s[index..<s.index(index, offsetBy: 2)]
>>>
>>>     let index = 1 // if used both in s1, s2
>>>     s1[index..<index+2]
>>>     s2[index..<index+2]
>>> compiles to
>>>     let index = 1
>>>     let index1 = s1.index(s1.startIndex, offsetBy: index)
>>>     let index2 = s2.index(s2.startIndex, offsetBy: index)
>>>     s1[index1..<s1.index(index1, offsetBy: 2)]
>>>     s2[index2..<s2.index(index2, offsetBy: 2)]
>>>
>>> I really want the team to consider providing the offset index system as an
>>> assistant to the collection. It is a very necessary basic concept of
>>> Collection.
>>>
>>> Thanks!
>>> Jiannan
>>>
>>>> On Dec 15, 2017, at 2:13 AM, Jordan Rose <jordan_r...@apple.com> wrote:
>>>>
>>>> We really don't want to make subscripting a non-O(1) operation. That just
>>>> provides false convenience and encourages people to do the wrong thing
>>>> with Strings anyway.
>>>>
>>>> I'm always interested in why people want this kind of ability. Yes, it's
>>>> nice for teaching programming to be able to split strings on character
>>>> boundaries indexed by integers, but where does it come up in real life?
>>>> The most common cases I see are trying to strip off the first or last
>>>> character, or a known prefix or suffix, and I feel like we should have
>>>> better answers for those than "use integer indexes" anyway.
>>>>
>>>> Jordan
>>>>
>>>>> On Dec 13, 2017, at 22:30, Cao, Jiannan via swift-dev
>>>>> <swift-dev@swift.org> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I would like to discuss the String.Index problem within Swift. I know the
>>>>> current situation of String.Index is based on the nature of the
>>>>> underlying data structure of the string.
>>>>>
>>>>> But could we just make String.Index contain offset information? Or make
>>>>> an offset index subscript available for accessing characters in String?
>>>>>
>>>>> for example:
>>>>>     let a = "01234"
>>>>>     print(a[0]) // 0
>>>>>     print(a[0...4]) // 01234
>>>>>     print(a[...]) // 01234
>>>>>     print(a[..<2]) // 01
>>>>>     print(a[...2]) // 012
>>>>>     print(a[2...]) // 234
>>>>>     print(a[2...3]) // 23
>>>>>     print(a[2...2]) // 2
>>>>>     if let number = a.index(of: "1") {
>>>>>         print(number) // 1
>>>>>         print(a[number...]) // 1234
>>>>>     }
>>>>>
>>>>> 0 equals the Collection.Index of collection.index(startIndex, offsetBy: 0)
>>>>> 1 equals the Collection.Index of collection.index(startIndex, offsetBy: 1)
>>>>> ...
>>>>> We keep String.Index, but allow another kind of index, called
>>>>> "offsetIndex", to access the String.Index and the characters in the
>>>>> string.
>>>>> Any Collection could use the offset index to access its elements,
>>>>> regardless of its real index.
>>>>>
>>>>> I have made the Collection OffsetIndexable protocol available here, and
>>>>> made it more accessible for StringProtocol, considering all the APIs
>>>>> related to the index.
>>>>>
>>>>> https://github.com/frogcjn/OffsetIndexableCollection-String-Int-Indexable-
>>>>>
>>>>> If someone wants to make the offset index/range available for any
>>>>> collection, they just need to extend the collection:
>>>>>     extension String : OffsetIndexableCollection {
>>>>>     }
>>>>>
>>>>>     extension Substring : OffsetIndexableCollection {
>>>>>     }
>>>>>
>>>>> I hope the Swift core team could consider bringing the offset index to
>>>>> String, or making it available to other collections, thus letting
>>>>> developers decide whether their collection could use offset indices as
>>>>> an assistant for the real index of the collection.
>>>>>
>>>>> Thanks!
>>>>> Jiannan
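The translation from offsets to indices described in the quoted messages can be sketched roughly as follows. This is only an illustration under my own assumptions, not the code in the linked repository: the labelled subscripts offset: and offsets: and the method offset(of:) are hypothetical names, chosen so they do not clash with the existing Int subscript on Array. Each call walks from startIndex, so on String it is O(n) rather than O(1).

```swift
extension Collection {
    // Hypothetical: the element at a zero-based offset from startIndex.
    subscript(offset offset: Int) -> Element {
        return self[index(startIndex, offsetBy: offset)]
    }

    // Hypothetical: the slice covering a half-open range of offsets.
    subscript(offsets range: Range<Int>) -> SubSequence {
        let lower = index(startIndex, offsetBy: range.lowerBound)
        let upper = index(lower, offsetBy: range.upperBound - range.lowerBound)
        return self[lower..<upper]
    }

    // Hypothetical: translate a real index back into its offset.
    func offset(of i: Index) -> Int {
        return distance(from: startIndex, to: i)
    }
}

let a = "01234"
print(a[offset: 0])          // the Character at offset 0
print(a[offsets: 2..<4])     // the Substring covering offsets 2 and 3
if let i = a.firstIndex(of: "1") {
    print(a.offset(of: i))   // the offset of the first "1"
}
```

String.Index remains the real index here; the offsets are converted on every access, which is the cost Jordan's reply is concerned about.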
_______________________________________________
swift-dev mailing list
swift-dev@swift.org
https://lists.swift.org/mailman/listinfo/swift-dev