on Mon Jan 23 2017, Félix Cloutier <[email protected]> wrote:
>> On 23 Jan 2017, at 20:45, Dave Abrahams <[email protected]> wrote:
>>
>> On Jan 22, 2017, at 9:54 PM, Félix Cloutier <[email protected]> wrote:
>>
>>>>> doesn't necessarily mean that ignoring that case is the right
>>>>> thing to do. In fact, it means that Unicode won't do anything to
>>>>> protect programs against these, and if Swift doesn't, chances are
>>>>> that no one will. Isolated combining characters break a number of
>>>>> expectations that developers could reasonably have:
>>>>>
>>>>> (a + b).count == a.count + b.count
>>>>> (a + b).startsWith(a)
>>>>> (a + b).endsWith(b)
>>>>> (a + b).find(a) // or .find(b)
>>>>>
>>>>> Of course, this can be documented, but people want easy, and
>>>>> documentation is hard.
>>>>
>>>> Yes. Unfortunately they also want the ability to append a string
>>>> consisting of a combining character to another string and have it
>>>> append. And they don't want to be prevented from forming
>>>> valid-but-defective Unicode strings.
>>>>
>>>> […]
>>>>
>>>> Can you suggest an alternative that doesn't violate the Unicode
>>>> standard and supports the expected use-cases, somehow?
>>>
>>> I'm not sure I understand. Did we go from "this is a
>>> degenerate/defective
>>> <https://github.com/apple/swift/blob/master/docs/StringManifesto.md#string-should-be-a-collection-of-characters-again>
>>> case that we shouldn't bother with" to "this is a supported use case
>>> that needs to work as-is"?
>>
>> No. The Unicode standard says it's a valid string, so we shouldn't
>> prohibit it. The standard also says it's a corner case for which it
>> isn't worth making heroic efforts to create sensible semantics. It's
>> totally in keeping with the Unicode standards that we treat it as
>> proposed.
>>
>> In a domain as complex as String processing, we need a guiding star,
>> and that star is the Unicode standard. I'm very reluctant to do
>> anything that clashes with the spirit of the standard.
>>
>>> I've never seen anyone start a string with a combining character on
>>> purpose,
>>
>> It will occur as a byproduct of the process of attaching a diacritic
>> to a base character.
>
> Unless you're in the business of writing a text editor, I don't know
> if that's a common use case.

I don't either, to be honest. But the experts I consult with keep
reassuring me that it's an important one.

>>> though I'm familiar with just one natural language that needs
>>> combining characters. I can imagine that it could be a convenient
>>> feature in other natural languages.
>>>
>>> However, if Swift Strings are now designed for machine processing
>>> and less for human language convenience, for me, it's easy enough to
>>> justify a safe default in the context of machine processing: `a+b`
>>> will not combine the end of `a` with the start of `b`. You could do
>>> this by inserting a ◌ that `b` could combine with if necessary.
>>
>> You can do it, but it trades one semantic problem for a usability
>> problem, without solving all the semantic problems: you end up with
>> a.count + b.count == (a+b).count, sure, but you still don't satisfy
>> the usual law of collections that (a+b).contains(b.first!) if b is
>> non-empty, and now you've made it difficult to attach diacritics to
>> base characters.
>
> "Difficult".
>
> What kind of processing would you suggest on a variable "b" in the
> expression "\(a),\(b)" to ensure that the result can be split with a
> comma?

I'm sorry, I don't understand what you're driving at, here.

--
-Dave

_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution
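
For anyone who wants to reproduce the behavior discussed in this thread, here is a minimal sketch, assuming Swift 5 or later, where `String` is a collection of `Character`s. It checks the expectations quoted above (using the standard library's `starts(with:)` and `contains(_:)` rather than the `startsWith`/`find` spellings in the message), tries the ◌ (U+25CC) idea, and shows the comma case from the last question. The helper name `concatenateIsolating` and the use of `isGraphemeExtend` to detect a leading combining mark are assumptions made for illustration; neither comes from the thread or from the String proposal.

```swift
// U+0301 COMBINING ACUTE ACCENT on its own: a valid but "defective" string.
let a = "e"
let b = "\u{301}"
let joined = a + b   // the accent attaches to "e", giving a single Character

// Character counts are not additive here, although Unicode scalar counts are.
let countsAdd = joined.count == a.count + b.count
let scalarsAdd = joined.unicodeScalars.count
    == a.unicodeScalars.count + b.unicodeScalars.count

// Element-wise Character comparison no longer finds `a` or `b` in the result.
let startsWithA = joined.starts(with: a)
let containsB = joined.contains(b.first!)

// Hypothetical helper (an assumption, not from the thread): insert
// U+25CC DOTTED CIRCLE before `rhs` when its first scalar would otherwise
// extend the last grapheme cluster of `lhs`.
func concatenateIsolating(_ lhs: String, _ rhs: String) -> String {
    guard let first = rhs.unicodeScalars.first,
          first.properties.isGraphemeExtend else { return lhs + rhs }
    return lhs + "\u{25CC}" + rhs
}
let padded = concatenateIsolating(a, b)

// Interpolating `b` after a comma fuses the comma and the accent into one
// Character, so a Character-based split on "," finds no separator.
let fields = "\(a),\(b)".split(separator: ",")

print(countsAdd, scalarsAdd, startsWithA, containsB)
print(padded.count, padded.contains(b.first!), fields.count)
```

With the padded version the counts add up again, but `padded.contains(b.first!)` is still false, which is the objection raised in the reply above.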
