> In order to be able to write extensions across both String and Substring, a
> new Unicode protocol to which the two types will conform will be introduced.
> For the purposes of this proposal, Unicode will be defined as a protocol to
> be used whenever you would previously extend String. It should be possible to
> substitute extension Unicode { ... } in Swift 4 wherever extension String {
> ... } was written in Swift 3, with one exception: any passing of self into an
> API that takes a concrete String will need to be rewritten as String(self).
> If Self is a String then this should effectively optimize to a no-op, whereas
> if Self is a Substring then this will force a copy, helping to avoid the
> “memory leak” problems described above.
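To make the quoted migration rule concrete, here is a minimal sketch. It assumes the standard library's StringProtocol as a stand-in for the proposed "Unicode" protocol; storeForLater and storedLength are hypothetical names invented for illustration and are not part of the proposal.

```swift
// Hypothetical API that only accepts a concrete String.
func storeForLater(_ text: String) -> Int {
    return text.count
}

// StringProtocol stands in for the proposed "Unicode" protocol (assumption).
extension StringProtocol {
    // Illustrative helper, not part of the proposal.
    func storedLength() -> Int {
        // Passing `self` directly would not compile here, because Self may be
        // a Substring. Converting with String(self) is the one rewrite the
        // quoted text calls for.
        return storeForLater(String(self))
    }
}

let whole = "hello, world"
let part = whole.prefix(5)      // a Substring of `whole`
print(whole.storedLength())     // 12
print(part.storedLength())      // 5
```

The same extension body serves both types; only the Substring call pays for a copy when it crosses into the String-only API.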
Did you consider an AnyUnicode<Encoding> wrapper? Then we could have a
typealias called “AnyString”.
Also, regarding naming: “Unicode” would be great if this were a namespace, and this
proposal is a great example of why protocol nesting is badly needed in Swift
code which defines (not even very complex) protocols. However, absent protocol
nesting, I think “UnicodeEncoded” is better. It doesn’t roll off the tongue as
nicely, perhaps, but it also doesn’t look as weird when written in code.
> The exact nature of the protocol – such as which methods should be protocol
> requirements vs which can be implemented as protocol extensions, are
> considered implementation details and so not covered in this proposal.
>
I’d hope they do get a proposal at some stage, though. There are cases where
I’d like to be able to write my own “Unicode” type and take advantage of
generic (and existential when we can) text processing.
For example, maybe the thing I want to present as a single block of text is
actually pieced together from multiple discontiguous regions of a buffer (i.e.
the “buffer-gap” approach for faster random insertions/deletions, if I expect
my code to be doing lots of that).
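As a rough sketch of the kind of storage I mean, here is a toy gap buffer. Everything in it (the GapBuffer name, its members, the Character? slots) is invented for illustration; a real one would store code units and conform to whatever the protocol ends up requiring.

```swift
// Toy gap buffer: text lives on either side of a movable run of empty slots,
// so repeated insertions near the same position avoid shifting the whole tail.
struct GapBuffer {
    private var storage: [Character?]
    private var gapStart: Int   // first empty slot
    private var gapEnd: Int     // one past the last empty slot

    init(_ text: String = "", gapSize: Int = 16) {
        let chars = Array(text)
        storage = chars.map { Optional($0) }
            + Array<Character?>(repeating: nil, count: gapSize)
        gapStart = chars.count
        gapEnd = storage.count
    }

    // Number of characters, excluding the gap.
    var count: Int { return storage.count - (gapEnd - gapStart) }

    // Slides the gap so that it begins at `position`. Requires a non-empty gap.
    private mutating func moveGap(to position: Int) {
        while gapStart > position {
            gapStart -= 1; gapEnd -= 1
            storage[gapEnd] = storage[gapStart]
            storage[gapStart] = nil
        }
        while gapStart < position {
            storage[gapStart] = storage[gapEnd]
            storage[gapEnd] = nil
            gapStart += 1; gapEnd += 1
        }
    }

    mutating func insert(_ character: Character, at position: Int) {
        precondition(position >= 0 && position <= count, "position out of range")
        if gapStart == gapEnd {
            // Gap exhausted: open a fresh one in place.
            let extra = 16
            storage.insert(contentsOf: Array<Character?>(repeating: nil, count: extra),
                           at: gapStart)
            gapEnd += extra
        }
        moveGap(to: position)
        storage[gapStart] = character
        gapStart += 1
    }

    // Presents the two discontiguous regions as one block of text.
    var text: String {
        let head = storage[..<gapStart].compactMap { $0 }
        let tail = storage[gapEnd...].compactMap { $0 }
        return String(head + tail)
    }
}

var buffer = GapBuffer("helo")
buffer.insert("l", at: 3)
print(buffer.text)   // hello
```

Note that the text property has to copy both regions into a fresh String. That copy is exactly what a user-conformable protocol would let generic text-processing code avoid.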
You could imagine that if something like CoreText (can’t speak for them, of
course) were being rewritten in Swift, it would be able to compute layouts and
render glyphs from any provider of unicode data and not just String or
Substring. I mean, that’s my dream, anyway. It would mean you could go directly
from a buffer-gap String to a rendered bitmap suitable for UI.
> Unicode will conform to BidirectionalCollection. RangeReplaceableCollection
> conformance will be added directly onto the String and Substring types, as it
> is possible future Unicode-conforming types might not be range-replaceable
> (e.g. an immutable type that wraps a const char *).
>
+1. Keep the protocol focussed.
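A small sketch of what that split buys, again assuming StringProtocol as a stand-in for the proposed protocol (reversedText and appendExclamation are made-up names): read-only algorithms constrain on the protocol alone, and only the ones that mutate ask for range-replaceability.

```swift
// Read-only algorithm: needs nothing beyond the protocol itself, so it would
// also accept a future immutable conforming type.
func reversedText<S: StringProtocol>(_ text: S) -> String {
    return String(text.reversed())
}

// Mutating algorithm: states the extra requirement explicitly.
func appendExclamation<S: StringProtocol & RangeReplaceableCollection>(_ text: inout S) {
    text.append(Character("!"))
}

var greeting = "abc"
appendExclamation(&greeting)          // works for String
var tail = "hello".dropFirst()        // Substring
appendExclamation(&tail)              // and for Substring
print(reversedText(greeting))         // !cba
```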
> The standard library currently lacks a Latin1 codec, so an enum Latin1:
> UnicodeEncoding type will be added.
>
I feel this is a call for better naming somewhere.
> init<Encoding: UnicodeEncoding>(
> cString nulTerminatedCodeUnits: UnsafePointer<Encoding.CodeUnit>,
> encoding: Encoding)
So will this replace the stuff which Foundation puts into String, which also
decodes a C string into a Swift string?
Foundation includes more encodings (and also nests an “Encoding” enum in String
itself, which makes things even more confusing), but totally ignores the
standard library decoders in favour of CF ones.
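To show the overlap I mean, here is a sketch that puts the two routes side by side, using array-based initializers instead of the pointer-based one quoted above. decodeLatin1 is a hand-rolled helper written for illustration only; it is not the proposed Latin1 codec.

```swift
import Foundation

// Standard-library route: decode UTF-8 code units with the stdlib codec.
let utf8Bytes: [UInt8] = [0x63, 0x61, 0x66, 0xC3, 0xA9]   // "café" in UTF-8
let viaStdlib = String(decoding: utf8Bytes, as: UTF8.self)

// Foundation route: the encoding is a value of the nested String.Encoding
// type, and the result is optional because decoding can fail.
let latin1Bytes: [UInt8] = [0x63, 0x61, 0x66, 0xE9]       // "café" in Latin-1
let viaFoundation = String(bytes: latin1Bytes, encoding: .isoLatin1)

// Hand-rolled Latin-1 decoding (illustrative): every Latin-1 byte maps to the
// Unicode scalar with the same numeric value.
func decodeLatin1(_ bytes: [UInt8]) -> String {
    var scalars = String.UnicodeScalarView()
    for byte in bytes {
        scalars.append(Unicode.Scalar(byte))
    }
    return String(scalars)
}

print(viaStdlib == decodeLatin1(latin1Bytes))   // true
```

There are two spellings for the same job: one takes a codec type, the other an enum case.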
- Karl