The ultimate model of strings is going to be complicated whether or not it’s on 
String itself, although I argue that regardless of that complexity, Swift 
inherently starts from a much better place than, say, Java, just by having 
Array vs. 30 different Array-like things. That dovetails with the point I was 
trying to make up-thread: complicating the overall type space to serve 
specific use cases means, in practice, that less-experienced users don’t know 
about or can’t find the specialized type, even when they need it. Furthermore, 
“use UTF8String when you need to be super-fast (and don’t we all want to be 
super fast???)” is the kind of cargo-culting that sticks, not “when caveats A, 
B, C, and D apply and you want to be fast and you’ve considered all the 
Unicode implications and the optimizer breaks down and you have observed a 
performance problem, you should consider etc etc etc”.

> On Jan 25, 2017, at 4:21 PM, Ben Cohen wrote:
> 
> 
>> On Jan 24, 2017, at 8:16 PM, Zach Waldowski via swift-evolution wrote:
>> 
>> I strongly want Swift to have world-class string processing, but I believe 
>> even more strongly in the language's spirit of progressive disclosure. 
>> Newcomers to Swift's current String API find it difficult (something I 
>> personally disagree with, but that's neither here nor there); I don't think 
>> that difficulty is solved by aggressively use-specific type modeling. I 
>> instead think it gives rise to the same severe cargo-culting that gets us 
>> the scarily prevalent String.Index.init(offset:) extensions in the current 
>> model.
> 
> This cuts both ways though. In the spirit of progressive disclosure, should 
> we complicate String’s model for users in order for it to accommodate both 
> UTF8 and UTF16 backing stores?
> 
> If String can be UTF8-backed, that would mean that we could not tag the UTF16 
> collection view as conforming to RandomAccessCollection. That would mean you 
> couldn’t use algorithms that relied on random access on it. It would exhibit 
> random access characteristics only sometimes: UTF16View.index(_:offsetBy:) 
> would run in constant time when the string was backed by UTF16, but when 
> backed by UTF8, it would run in linear time. Given that, as we’ve discussed 
> here, you sometimes need to do these kinds of index calculations to 
> interoperate with APIs that traffic in code unit offsets, what do we need to 
> tell users about performance when they need to do it? That "it’s probably OK 
> unless caveat caveat caveat"?
> 
> On the other hand, if we separate UTF8-backed strings into another type, we 
> can keep String simple. Then for those power users who really absolutely must 
> operate on a UTF8-backed string because of their performance needs, they have 
> another type, which they can progressively discover when they find they need 
> it.
> 
> I’m not saying this is enough to rule out UTF8-backed strings, but I don’t 
> think “it’ll be a simpler model for most users” is the argument in favor of 
> it.
> 
> 

_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
