Hi Brent,

Sorry, I realized I failed to reply to these at the time. See below.

> On Mar 30, 2017, at 6:52 PM, Brent Royal-Gordon <[email protected]>
> wrote:
>
>> On Mar 30, 2017, at 2:36 PM, Ben Cohen <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>> The big win for Unicode is it is short. We want to encourage people to write
>> their extensions on this protocol. We want people who previously extended
>> String to feel very comfortable extending Unicode. It also helps emphasize
>> how important the Unicode-ness of Swift.String is. I like the idea of
>> Unicode.Collection, but it is a little intimidating and making it even a
>> tiny bit intimidating is worrying to me from an adoption perspective.
>
> Yeah, I understand why "Collection" might be intimidating. But I think
> "Unicode" would be too; it's opaque enough that people wouldn't be entirely
> sure whether they were extending the right thing.
>
> I did a quick run-through of different languages and the
> protocols/interfaces/whatever their string types conform to, but most don't
> seem to have anything that abstracts string types. The only similar things I
> could find were `CharSequence` in Java, `StringLike` in Scala...and `Stringy`
> in Perl 6. And I'm sure you thought you were joking!
>

Ha!

> Honestly, I'd recommend just going with `StringProtocol` unless you can come
> up with an adjective form you like (`Stringlike`? `Textual`?). It's a bit
> clumsy, but it's crystal clear. Stupid name, but you'll never forget it.
>

I think it’s kind of evenly balanced between Unicode and StringProtocol. Neither is perfect.

>>> I'm a little worried about this because it seems to imply that the protocol
>>> cannot include any mutation operations that aren't in
>>> `RangeReplaceableCollection`. For instance, it won't be possible to include
>>> an in-place `applyTransform` method in the protocol. Do you anticipate that
>>> being an issue? Might it be a good idea to define a parallel `Mutable` or
>>> `RangeReplaceable` protocol?
>>>
>>
>> You can always assign to self.
>> Then provide more efficient implementations
>> where RangeReplaceableCollection. We do this elsewhere in the std lib with
>> collections e.g.
>> https://github.com/apple/swift/blob/master/stdlib/public/core/Collection.swift#L1277.
>>
>> Proliferating protocol combinations is problematic (looking at you,
>> BidirectionalMutableRandomAccessSlice).
>
> Nobody likes proliferation, but in this case it'd be because there genuinely
> were additional semantics that were only available on mutable strings.
>
> (Once upon a time, I think I requested the ability to write `func index(of
> elem: Iterator.Element) -> Index? where Iterator.Element: Equatable`. Could
> such a feature be used for this? `func apply(_ transform: StringTransform,
> reverse: Bool) where Self: RangeReplaceableCollection`?)
>
>>>> The C string interop methods will be updated to those described here: a
>>>> single withCString operation and two init(cString:) constructors, one for
>>>> UTF8 and one for arbitrary encodings.
>>>
>>> Sorry if I'm repeating something that was already discussed, but is there a
>>> reason you don't include a `withCString` variant for arbitrary encodings?
>>> It seems like an odd asymmetry.
>>
>> Hmm. Is this a common use-case people have? Symmetry for the sake of it
>> doesn’t seem enough. If uncommon, you can do it via an Array that you
>> nul-terminate manually.
>
> Is `init(cString:encoding:)` a common use case? If it is, I'm not sure why
> the opposite wouldn't be.
>

This + another use case has convinced me that yes, we should have a matching withCString version.

>> Yeah, it’s tempting to make ParseResult general, and the only reason we held
>> off is because we don’t want making sure it’s generally useful to be a
>> distraction.
>
> Understandable.
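
On the "assign to self" point above, here is a minimal sketch of the pattern, not the proposed API. `TextLike`, `contents`, `FrozenText`, and `removeSpaces` are hypothetical names made up for illustration. The general version rebuilds the value and assigns it to self; the constrained overload mutates in place when the conforming type also happens to be a RangeReplaceableCollection.

```swift
// Hypothetical stand-in protocol; NOT the protocol under discussion.
protocol TextLike {
    init(contents: [Character])
    var contents: [Character] { get }
}

extension TextLike {
    // General version: build a new value and assign it to self.
    mutating func removeSpaces() {
        self = Self(contents: contents.filter { $0 != " " })
    }
}

extension TextLike where Self: RangeReplaceableCollection, Self.Element == Character {
    // More constrained overload: mutate in place, no rebuild.
    mutating func removeSpaces() {
        removeAll(where: { $0 == " " })
    }
}

// String is range-replaceable, so it gets the in-place overload.
extension String: TextLike {
    init(contents: [Character]) { self.init(contents) }
    var contents: [Character] { return Array(self) }
}

// A hypothetical immutable-storage type that only gets the general version.
struct FrozenText: TextLike {
    let contents: [Character]
    init(contents: [Character]) { self.contents = contents }
}

var s = "a b c"
s.removeSpaces()                       // resolves to the in-place overload

var f = FrozenText(contents: Array("a b c"))
f.removeSpaces()                       // resolves to the assign-to-self version
```

Both calls are spelled the same at the use site, which is the reason a separate `Mutable` protocol may not be needed for this kind of operation.
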
>
> I wonder if some part of the parsing algorithm could somehow be generalized
> so it was suitable for many purposes and then put on `Collection`, with the
> `UnicodeEncoding` then being passed as a parameter to it. If so, that would
> justify making `ParseResult` a top-level type.
>
>> Ah, yes. Here it is:
>>
>>     public protocol EncodedScalarProtocol : RandomAccessCollection {
>>         init?(_ scalarValue: UnicodeScalar)
>>         var utf8: UTF8.EncodedScalar { get }
>>         var utf16: UTF16.EncodedScalar { get }
>>         var utf32: UTF32.EncodedScalar { get }
>>     }
>
> What is the `Element` type expected to be here?
>
> I think what's missing is a holistic overview of the encoding system. So,
> please help me write this function:
>
>     func unicodeScalars<Encoding: UnicodeEncoding>(in data: Data, using encoding: Encoding.Type) -> [UnicodeScalar] {
>         var scalars: [UnicodeScalar] = []
>
>         data.withUnsafeBytes { (bytes: UnsafePointer<$ParseInputElement>) in
>             let buffer = UnsafeBufferPointer(start: bytes, count: data.count / MemoryLayout<$ParseInputElement>.size)
>             encoding.parseForward(buffer) { encodedScalar in
>                 let unicodeScalar: UnicodeScalar = $doSomething(encodedScalar)
>                 scalars.append(unicodeScalar)
>             }
>         }
>
>         return scalars
>     }
>
> What type would I put for $ParseInputElement? What function or initializer do
> I call for $doSomething?
>

Will come back on this.

>>>>     @discardableResult
>>>>     public static func parseForward<C: Collection>(
>>>>         _ input: C,
>>>>         repairingIllFormedSequences makeRepairs: Bool = true,
>>>>         into output: (EncodedScalar) throws->Void
>>>>     ) rethrows -> (remainder: C.SubSequence, errorCount: Int)
>>>
>>> Are there constraints missing on `parseForward`?
>>>
>>
>> Yep – see the note that appears a little later. They’re really
>> implementation details – so not something to capture in the proposal – which
>> may or may not be needed depending on whether this lands before or after the
>> generics features that make them redundant.
>
> No, I mean because this says nothing about `C`'s element type. Presumably you
> can't parse a bunch of `UIView`s into Unicode scalars, so there must be some
> kind of constraint on the collection's elements. What is it?
>
> ...oh, I notice that `parseScalarForward(_:knownCount:)` has the clause
> `where C.Iterator.Element == EncodedScalar.Iterator.Element` attached. Should
> that also be attached to `parseForward(_:repairingIllFormedSequences:into:)`?
>
>>> What do these do if `makeRepairs` is false? Would it be clearer if we made
>>> an enum that described the behaviors and changed the label to something
>>> like `ifIllFormed:`?
>>
>> The Unicode standard specifies values to substitute when making repairs.
>
> I'm asking what happens if you *don't* want to make repairs. Does it, say,
> stop immediately, returning an `errorCount` of `1` and a `remainder` that
> starts at the site of the error? If so, would we be better off having that
> parameter be something like `ifIllFormed: .stop` or `ifIllFormed: .repair`,
> rather than `repairingIllFormedSequences: false` or
> `repairingIllFormedSequences: true`?
>

The idea is, if you don’t want to make repairs, you use the transcoding primitives instead. The belief is that the old non-repairing versions (return nil if repairs needed) weren’t useful.

>>>> Due to the change in internal implementation, this means that these
>>>> operations will be O(n) rather than O(1). This is not expected to be a
>>>> major concern, based on experiences from a similar change made to Java,
>>>> but projects will be able to work around performance issues without
>>>> upgrading to Swift 4 by explicitly typing slices as Substring, which will
>>>> call the Swift 4 variant, and which will be available but not invoked by
>>>> default in Swift 3 mode.
>>>
>>> Will there be a way to make this also work with a real Swift 3 compiler?
>>> For instance, can you define `typealias Substring = String` in such a way
>>> that real Swift 3 will parse and use it, but Swift 4 in Swift 3 mode will
>>> ignore it?
>>
>> Are you talking about this as a way for people to change their code, while
>> still being able to compile their code with the old compiler? Yes, that
>> might be a good strategy, will think about that.
>
> Yes, that's what I'm talking about.
>
> I guess the actual question is, does `#if swift(>=4)` come out as `true` for
> Swift 4 in Swift 3 mode? If not, is there some way to detect that you're
> using Swift 4 in Swift 3 mode? (I suppose one answer is "yes, Swift 4 in
> Swift 3 mode is called Swift 3.2"; I just haven't heard anyone mention
> anything like that yet.) In either case, if there's some way to distinguish,
> you could say:
>
>     #if thisIsRealSwift3NotSwift4PretendingToBeSwift3()
>     typealias Substring = String
>     #endif
>
> And then you could write the rest of your code using `Substring` and it would
> compile using both Swift 3 and Swift 4 toolchains, never forcing an implicit
> copy.
>

Ah right. Unfortunately as things are currently envisioned, this won’t work – you won’t be able to distinguish “true” Swift 3 from Swift 3 compatibility mode.

> --
> Brent Royal-Gordon
> Architechies
>
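
Coming back to the `ifIllFormed:` question further up: for illustration only, here is a rough sketch of what the two behaviours could look like. It is written against the existing `UTF8` codec's `decode(_:)`, not the proposed `parseForward`, and `IllFormedPolicy` and `decodeUTF8` are hypothetical names that appear nowhere in the proposal. It also returns only the scalars and an error count, not a remainder.

```swift
// Hypothetical policy enum, modelled on the `.stop` / `.repair` spelling suggested above.
enum IllFormedPolicy {
    case stop    // give up at the first ill-formed sequence
    case repair  // substitute U+FFFD and keep going
}

// Hypothetical helper built on the existing UTF8 codec; not the proposed API.
func decodeUTF8(_ bytes: [UInt8], ifIllFormed policy: IllFormedPolicy)
    -> (scalars: [Unicode.Scalar], errorCount: Int) {
    var decoder = UTF8()
    var iterator = bytes.makeIterator()
    var scalars: [Unicode.Scalar] = []
    var errorCount = 0

    decoding: while true {
        switch decoder.decode(&iterator) {
        case .scalarValue(let scalar):
            scalars.append(scalar)
        case .emptyInput:
            break decoding
        case .error:
            errorCount += 1
            if policy == .stop { break decoding }
            scalars.append("\u{FFFD}")  // replacement character
        }
    }
    return (scalars, errorCount)
}

// 0xFF can never appear in well-formed UTF-8.
let repaired = decodeUTF8([0x61, 0xFF, 0x62], ifIllFormed: .repair)
let stopped = decodeUTF8([0x61, 0xFF, 0x62], ifIllFormed: .stop)
```

With `.stop`, the caller gets back whatever decoded cleanly before the error, which is roughly the behaviour being asked about; whether that is more useful than going through the transcoding primitives is the open question.
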
_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
