I'll use Karl's point here as a minor jumping-off point for a semi-related train of thought… I'm excited by the content of the original manifesto, including a powerful Unicode namespace and types. But as I've continued down the thread, I've grown increasingly concerned about modeling strings breadthwise in the type system, i.e., with UTF8String and so on.
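To make the worry concrete, here is a rough sketch of the sort of integer-offset convenience helper that gets pasted into projects today. The name `character(atOffset:)` is hypothetical and made up for illustration; it is not a standard library API, and real-world variants differ in detail.

```swift
// Hypothetical helper for illustration only; `character(atOffset:)` is not a
// standard library API.
extension String {
    /// Returns the character `offset` positions from the start, or nil when
    /// the offset is out of range. Each call walks the string from
    /// `startIndex`, so it costs O(offset) rather than O(1).
    func character(atOffset offset: Int) -> Character? {
        guard offset >= 0,
              let i = index(startIndex, offsetBy: offset, limitedBy: endIndex),
              i < endIndex
        else { return nil }
        return self[i]
    }
}

let word = "cafe\u{301}" // "e" followed by a combining acute accent

// Looks like array indexing, but every iteration re-walks the string,
// so this loop is quadratic in the length of `word`.
for offset in 0..<word.count {
    if let c = word.character(atOffset: offset) {
        print(offset, c)
    }
}
```

The helper is easy to copy and hides the linear walk behind something that reads like constant-time indexing, which is roughly why it spreads without being understood.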
I strongly want Swift to have world-class string processing, but I believe even more strongly in the language's spirit of progressive disclosure. Newcomers to Swift's current String API find it difficult (something I personally disagree with, but that's neither here nor there); I don't think that difficulty is solved by aggressively use-specific type modeling. I instead think it gives rise to the same severe cargo-culting that gets us the scarily prevalent String.Index.init(offset:) extensions in the current model.

Best
Zach Waldowski
[email protected]

On Tue, Jan 24, 2017, at 10:15 PM, Karl Wagner via swift-evolution wrote:
>
>>
>>> I hope I am correct about the no-copy thing, and I would also like to
>>> permit promoting C strings to Swift strings without validation. This
>>> is obviously unsafe in general, but I know my strings... and I care
>>> about performance. ;)
>>
>> We intend to support that use-case. That's part of the reason for the
>> ValidUTF8 and ValidUTF16 encodings you see here:
>> https://github.com/apple/swift/blob/unicode-rethink/stdlib/public/core/Unicode2.swift#L598
>> and here:
>> https://github.com/apple/swift/blob/unicode-rethink/stdlib/public/core/Unicode2.swift#L862
>
> It seems a little strange to me that a pre-validated UTF8 string from
> C would have different types to a UTF8String (i.e. using ValidUTF8 vs
> UTF8). It defeats the point of having the encoding represented in the
> type-system.
>
> For example, if I write a generic function:
>
>> func sendMessage<Source: Unicode where Source.Encoding == UTF8>(from: Source)
>
> I would only be able to accept UTF-8 text which hasn’t already been
> validated.
>
> What about if we allowed each encoding to provide multiple kinds of
> decoder? That would also allow us to substitute our own decoders in,
> if there are application-specific shortcuts we can take.
>
>> protocol UnicodeEncoding {
>>   associatedtype CodeUnit
>>
>>   associatedtype ValidatingDecoder: UnicodeDecoder
>>   associatedtype NonValidatingDecoder: UnicodeDecoder
>> }
>>
>> protocol UnicodeDecoder {
>>   associatedtype Encoding: UnicodeEncoding
>>   associatedtype DecodedScalar: RandomAccessCollection
>>     where Iterator.Element == Encoding.CodeUnit
>>
>>   static func parse1Forward<C>(…) -> ParseResult<DecodedScalar, C.Index>
>>   static func parse1Backward<C>(…) -> ParseResult<DecodedScalar, C.Index>
>> }
>> // Not shown: UnicodeEncoder protocol, with transcodeScalar<T> function.
>>
>> struct UTF8: UnicodeEncoding {
>>   typealias CodeUnit = UInt8
>>   typealias ValidatingDecoder = ValidatingUTF8Decoder
>>   typealias NonValidatingDecoder = NonValidatingUTF8Decoder
>> }
>>
>> struct NonValidatingUTF8Decoder: UnicodeDecoder {
>>   typealias Encoding = UTF8
>>   struct DecodedScalar: RandomAccessCollection { … }
>>   // Parsing functions
>> }
>>
>> struct ValidatingUTF8Decoder: UnicodeDecoder {
>>   typealias Encoding = UTF8
>>   typealias DecodedScalar = NonValidatingUTF8Decoder.DecodedScalar
>>   // newtype would be cool here
>>   // Parsing functions
>> }
>>
>> struct String {
>>   init<C, Encoding, Decoder>(from: C, encodedAs: Encoding,
>>                              using: Decoder = Encoding.ValidatingDecoder)
>>     where C: Collection, C.Iterator.Element == Encoding.CodeUnit,
>>           Decoder.Encoding == Encoding {
>>
>>     // transcode to native String encoding using ‘Decoder’ we were given
>>   }
>> }
>
> - Karl
> _________________________________________________
> swift-evolution mailing list
> [email protected]
> https://lists.swift.org/mailman/listinfo/swift-evolution
