> 
>> I hope I am correct about the no-copy thing, and I would also like to
>> permit promoting C strings to Swift strings without validation.  This
>> is obviously unsafe in general, but I know my strings... and I care
>> about performance. ;)
> 
> We intend to support that use-case.  That's part of the reason for the
> ValidUTF8 and ValidUTF16 encodings you see here:
> https://github.com/apple/swift/blob/unicode-rethink/stdlib/public/core/Unicode2.swift#L598
> and here:
> https://github.com/apple/swift/blob/unicode-rethink/stdlib/public/core/Unicode2.swift#L862

It seems a little strange to me that a pre-validated UTF8 string from C would
have a different type from a UTF8String (i.e. using ValidUTF8 vs UTF8). That
defeats the point of having the encoding represented in the type system.

For example, if I write a generic function:

func sendMessage<Source: Unicode>(from: Source) where Source.Encoding == UTF8

I would only be able to accept UTF-8 text that hasn’t already been validated,
because pre-validated text would have ValidUTF8 as its encoding instead.
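
A minimal sketch of that problem follows. All of the type names in it
(EncodingTag, UTF8Tag, ValidUTF8Tag, EncodedText, UncheckedText,
PrevalidatedText) are hypothetical stand-ins invented for illustration; they
are not the types in the unicode-rethink prototype. It only shows that a
same-type constraint on the encoding rejects text tagged with a different
encoding type, even when the bytes are identical.

```swift
// Hypothetical stand-ins; none of these names are real standard-library API.
protocol EncodingTag {}
enum UTF8Tag: EncodingTag {}        // plays the role of UTF8
enum ValidUTF8Tag: EncodingTag {}   // plays the role of ValidUTF8

protocol EncodedText {
    associatedtype Encoding: EncodingTag
    var codeUnits: [UInt8] { get }
}

// Text whose encoding tag says "UTF-8, not yet validated".
struct UncheckedText: EncodedText {
    typealias Encoding = UTF8Tag
    var codeUnits: [UInt8]
}

// Text whose encoding tag says "UTF-8, already validated".
struct PrevalidatedText: EncodedText {
    typealias Encoding = ValidUTF8Tag
    var codeUnits: [UInt8]
}

// Constrained the same way as sendMessage above; returns the number of
// code units it was given so the call has an observable result.
func sendMessage<Source: EncodedText>(from source: Source) -> Int
    where Source.Encoding == UTF8Tag {
    return source.codeUnits.count
}

let sent = sendMessage(from: UncheckedText(codeUnits: Array("hi".utf8)))

// The next call does not compile, because ValidUTF8Tag is a different type
// from UTF8Tag even though the bytes are the same:
// sendMessage(from: PrevalidatedText(codeUnits: Array("hi".utf8)))
```

Accepting both kinds of text would need either a second overload or a looser
constraint, which is the duplication the type-level encoding was meant to
avoid.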

What if we allowed each encoding to provide multiple kinds of decoder?
That would also let us substitute our own decoders, if there are
application-specific shortcuts we can take.

protocol UnicodeEncoding {
  associatedtype CodeUnit

  associatedtype ValidatingDecoder: UnicodeDecoder
  associatedtype NonValidatingDecoder: UnicodeDecoder
}

protocol UnicodeDecoder {
    associatedtype Encoding: UnicodeEncoding
    associatedtype DecodedScalar: RandomAccessCollection
        where DecodedScalar.Iterator.Element == Encoding.CodeUnit

    static func parse1Forward<C>(…) -> ParseResult<DecodedScalar, C.Index>
    static func parse1Backward<C>(…) -> ParseResult<DecodedScalar, C.Index>
}
// Not shown: UnicodeEncoder protocol, with transcodeScalar<T> function.

struct UTF8: UnicodeEncoding {
    typealias CodeUnit             = UInt8
    typealias ValidatingDecoder    = ValidatingUTF8Decoder
    typealias NonValidatingDecoder = NonValidatingUTF8Decoder
}

struct NonValidatingUTF8Decoder: UnicodeDecoder {
    typealias Encoding = UTF8
    struct DecodedScalar: RandomAccessCollection { … }
    // Parsing functions
}

struct ValidatingUTF8Decoder: UnicodeDecoder {
    typealias Encoding = UTF8
    // newtype would be cool here
    typealias DecodedScalar = NonValidatingUTF8Decoder.DecodedScalar
    // Parsing functions
}

struct String {
    init<C, Encoding, Decoder>(from: C, encodedAs: Encoding,
                               using: Decoder = Encoding.ValidatingDecoder)
        where C: Collection, C.Iterator.Element == Encoding.CodeUnit,
              Decoder.Encoding == Encoding {

         // transcode to native String encoding using ‘Decoder’ we were given
    }
}
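
To make the shape of this concrete, here is a rough, compilable sketch under
several assumptions. Every name in it (SketchEncoding, SketchDecoder,
SketchUTF8, CheckedUTF8, UncheckedUTF8, makeString) is hypothetical. The
decoders return a whole String? rather than parsing one scalar at a time, the
decoder is chosen by passing a metatype because a type cannot be a default
argument value, and String(decoding:as:) merely stands in for the fast path:
it repairs invalid input rather than skipping checks, so it is not a true
non-validating decoder.

```swift
// Hypothetical sketch; not the unicode-rethink prototype's API.
protocol SketchDecoder {
    associatedtype Encoding: SketchEncoding
    // Returns nil when the decoder rejects the input.
    static func decode<C: Collection>(_ units: C) -> String?
        where C.Element == Encoding.CodeUnit
}

protocol SketchEncoding {
    associatedtype CodeUnit
    associatedtype ValidatingDecoder: SketchDecoder
        where ValidatingDecoder.Encoding == Self
    associatedtype NonValidatingDecoder: SketchDecoder
        where NonValidatingDecoder.Encoding == Self
}

enum SketchUTF8: SketchEncoding {
    typealias CodeUnit = UInt8
    typealias ValidatingDecoder = CheckedUTF8
    typealias NonValidatingDecoder = UncheckedUTF8
}

enum CheckedUTF8: SketchDecoder {
    typealias Encoding = SketchUTF8
    static func decode<C: Collection>(_ units: C) -> String?
        where C.Element == UInt8 {
        var codec = Unicode.UTF8()
        var iterator = units.makeIterator()
        var result = ""
        while true {
            switch codec.decode(&iterator) {
            case .scalarValue(let scalar):
                result.unicodeScalars.append(scalar)
            case .emptyInput:
                return result          // consumed everything without error
            case .error:
                return nil             // ill-formed sequence: reject
            }
        }
    }
}

enum UncheckedUTF8: SketchDecoder {
    typealias Encoding = SketchUTF8
    static func decode<C: Collection>(_ units: C) -> String?
        where C.Element == UInt8 {
        // Stand-in for a trusted fast path; this call repairs bad input.
        return String(decoding: units, as: Unicode.UTF8.self)
    }
}

// Caller names the decoder explicitly.
func makeString<C: Collection, D: SketchDecoder>(
    from units: C, using decoder: D.Type
) -> String? where C.Element == D.Encoding.CodeUnit {
    return D.decode(units)
}

// Caller names only the encoding; the validating decoder is the default.
func makeString<C: Collection, E: SketchEncoding>(
    from units: C, encodedAs encoding: E.Type
) -> String? where C.Element == E.CodeUnit {
    return E.ValidatingDecoder.decode(units)
}

let wellFormed = Array("héllo".utf8)
let illFormed: [UInt8] = [0x68, 0xFF, 0x69]   // 0xFF never appears in UTF-8

let checked = makeString(from: wellFormed, encodedAs: SketchUTF8.self)
let rejected = makeString(from: illFormed, encodedAs: SketchUTF8.self)
let trusted = makeString(from: wellFormed, using: UncheckedUTF8.self)
```

With this shape, a generic function constrained on the encoding alone accepts
text regardless of which decoder produced it.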

- Karl
_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
