The ultimate model of strings is going to be complicated whether or not it lives on String itself, although I'd argue that regardless of that complexity, Swift inherently starts from a much better place than, e.g., Java, simply by having Array rather than 30 different Array-like things. That dovetails into the point I was trying to make up-thread: complicating the overall type space to serve specific use cases results, in practice, in less-experienced users not knowing about the specialized type, or not finding it, even when they need it. Furthermore, “use UTF8String when you need to be super-fast (and don’t we all want to be super-fast???)” is the kind of cargo-culting that sticks, not “when caveats A, B, C, and D apply, and you want to be fast, and you’ve considered all the Unicode implications, and the optimizer breaks down, and you have observed a performance problem, then you should consider etc. etc. etc.”

> On Jan 25, 2017, at 4:21 PM, Ben Cohen <[email protected]> wrote:
>
>> On Jan 24, 2017, at 8:16 PM, Zach Waldowski via swift-evolution <[email protected]> wrote:
>>
>> I strongly want Swift to have world-class string processing, but I believe even more strongly in the language's spirit of progressive disclosure. Newcomers to Swift's current String API find it difficult (something I personally disagree with, but that's neither here nor there); I don't think that difficulty is solved by aggressively use-specific type modeling. I instead think it gives rise to the same severe cargo-culting that gets us the scarily prevalent String.Index.init(offset:) extensions in the current model.
>
> This cuts both ways though. In the spirit of progressive disclosure, should we complicate String’s model for users in order for it to accommodate both UTF8 and UTF16 backing stores?
>
> If String can be UTF8-backed, that would mean that we could not tag the UTF16 collection view as conforming to RandomAccessCollection. That would mean you couldn’t use algorithms that relied on random access on it. It would exhibit random access characteristics only sometimes: UTF16View.index(_:offsetBy:) would run in constant time when the string was backed by UTF16, but when backed by UTF8, it would run in linear time. Given that, as we’ve discussed here, you sometimes need to do these kinds of index calculations to interoperate with APIs that traffic in code unit offsets, what do we need to tell users about performance when they need to do it? That "it’s probably OK unless caveat caveat caveat"?
>
> On the other hand, if we separate UTF8-backed strings into another type, we can keep String simple. Then those power users who really absolutely must operate on a UTF8-backed string because of their performance needs have another type, which they can progressively discover when they find they need it.
>
> I’m not saying this is enough to rule out UTF8-backed strings, but I don’t think “it’ll be a simpler model for most users” is the argument in favor of it.
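
For anyone following along, here is a rough sketch of the kind of code unit offset calculation being discussed in the quoted message. The helper name `stringIndex(in:atUTF16Offset:)` is made up for illustration and is not an API from the standard library or from this thread; it only wraps the existing `index(_:offsetBy:limitedBy:)` method on the UTF16 view, written in current Swift syntax. How fast that call runs depends on the string's backing store, which is exactly the caveat in question.

```swift
// Hypothetical helper (not part of the standard library): converts a UTF-16
// code unit offset, e.g. one handed back by an API that traffics in such
// offsets, into a String.Index.
func stringIndex(in string: String, atUTF16Offset offset: Int) -> String.Index? {
    let utf16 = string.utf16
    // The limitedBy variant returns nil instead of trapping when the offset
    // runs past the end of the view. Its cost is not guaranteed to be
    // constant; it depends on how the string is stored.
    return utf16.index(utf16.startIndex, offsetBy: offset, limitedBy: utf16.endIndex)
}

// Escapes are used so the precomposed scalars are unambiguous in source.
let greeting = "h\u{E9}llo w\u{F6}rld"

// "h", "é", "l", "l", "o", " " are six UTF-16 code units, so offset 6
// lands on the "w".
let tail = stringIndex(in: greeting, atUTF16Offset: 6).map { String(greeting[$0...]) }

// An offset beyond the end yields nil rather than a crash.
let outOfRange = stringIndex(in: greeting, atUTF16Offset: 100)
```

Nothing at the call site tells the user whether this walk is constant or linear time, which is why the performance story has to be communicated some other way.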

_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
