> On 23 Jan 2017, at 06:54, Félix Cloutier via swift-evolution 
> <[email protected]> wrote:
> 
> 
>>> doesn't necessarily mean that ignoring that case is the right thing to do. 
>>> In fact, it means that Unicode won't do anything to protect programs 
>>> against these, and if Swift doesn't, chances are that no one will. Isolated 
>>> combining characters break a number of expectations that developers could 
>>> reasonably have:
>>> 
>>> (a + b).count == a.count + b.count
>>> (a + b).startsWith(a)
>>> (a + b).endsWith(b)
>>> (a + b).find(a) // or .find(b)
>>> 
>>> Of course, this can be documented, but people want easy, and documentation 
>>> is hard.
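
For concreteness, here is a small sketch of how the first two expectations fail. It assumes current Swift String semantics (Character-based `count` and element comparison); the names `a`, `b`, and `joined` are only illustrative.

```swift
let a = "e"
let b = "\u{301}"      // an isolated COMBINING ACUTE ACCENT
let joined = a + b     // the accent attaches to the "e"

// Each operand is one Character, but so is the concatenation.
print(a.count, b.count, joined.count)

// Compared Character by Character, "é" is not "e", so the prefix test fails.
print(joined.starts(with: a))

// The scalar view stays additive, because nothing combines at that level.
print(joined.unicodeScalars.count == a.unicodeScalars.count + b.unicodeScalars.count)
```
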
>> 
>> Yes.  Unfortunately they also want the ability to append a string 
>> consisting of a combining character to another string and have it append.  
>> And they don't want to be prevented from forming valid-but-defective Unicode 
>> strings.
>> 
>> […]
>> 
>> Can you suggest an alternative that doesn't violate the Unicode standard and 
>> supports the expected use-cases, somehow? 
> 
> 
> I'm not sure I understand. Did we go from "this is a degenerate/defective 
> <https://github.com/apple/swift/blob/master/docs/StringManifesto.md#string-should-be-a-collection-of-characters-again>
>  case that we shouldn't bother with" to "this is a supported use case that 
> needs to work as-is"? I've never seen anyone start a string with a combining 
> character on purpose, though I'm familiar with just one natural language that 
> needs combining characters. I can imagine that it could be a convenient 
> feature in other natural languages.
> 
> However, if Swift Strings are now designed for machine processing and less 
> for human language convenience, for me, it's easy enough to justify a safe 
> default in the context of machine processing: `a+b` will not combine the end 
> of `a` with the start of `b`. You could do this by inserting a ◌ that `b` 
> could combine with if necessary. That solution would make half of the cases 
> that I've mentioned work as expected and make the operation always safe, as 
> far as I can tell.
> 
> In that world, it would be a good idea to have a `&+` fallback or something 
> like that that will let characters combine. I would think that this is a much 
> less common use case than serializing strings, though.
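
A rough sketch of what such a non-combining concatenation might look like. The method name `appendingIsolated` is hypothetical, and checking only the Grapheme_Extend property of the first scalar is a simplifying assumption; it does not cover other cluster-joining cases such as ZWJ sequences or Hangul jamo.

```swift
extension String {
    /// Hypothetical helper: concatenates `other` without letting its leading
    /// combining mark attach to the last Character of `self`, by inserting
    /// U+25CC DOTTED CIRCLE as a base for the mark to combine with.
    func appendingIsolated(_ other: String) -> String {
        guard !isEmpty,
              let first = other.unicodeScalars.first,
              first.properties.isGraphemeExtend   // simplification, see above
        else {
            return self + other
        }
        return self + "\u{25CC}" + other
    }
}

let a = "e"
let b = "\u{301}"                       // isolated COMBINING ACUTE ACCENT
let safe = a.appendingIsolated(b)

print(safe.count == a.count + b.count)  // count is additive again
print(safe.starts(with: a))             // a is still a prefix
```

The suffix expectation still fails, since the result ends with the dotted circle plus the accent rather than with `b` itself, which matches the "half of the cases" estimate above.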
> 
>>> My second concern is with how easy it is to convert an Int to a String 
>>> index. I've been vocal about this before: I'm concerned that Swift 
>>> developers will equate Ints with random-access String iterators, which they 
>>> are emphatically not. String.Index(100) is proposed as a constant-time 
>>> operation
>> 
>> No, that has not been proposed.  It would be 
>> 
>> String.Index(codeUnitOffset: 100)
>> 
>> It's hard to strike a balance between keeping programmers from making 
>> mistakes and making the important use-cases easy.  Do you have any 
>> suggestions for improving on what we've proposed?
> 
> That's still one extension away from String.Index(100), and one function away 
> from an even more convenient form. I don't have a great solution, but I don't 
> have a great understanding of the problem that this is solving either. I'm 
> leaving it here because, AFAIK, Swift 3 imposes constraints that are hard to 
> ignore and mostly beneficial to people outside of the English bubble, and it 
> seems that the proposed index regresses on this.
> 
> I'm perfectly happy with interchanging indices between the different views of 
> a String. It's getting the offset in or out of the index that I think lets 
> people make incorrect assumptions about strings.

We could have a pair of helper functions to search for the grapheme cluster 
boundary relative to a given CodeUnits.Index:

/// Returns the index at the start of the grapheme-cluster containing the given
/// code-unit.
func indexOfCharacterBoundary(at i: CodeUnits.Index) -> CodeUnits.Index

/// Returns the index at the start of the grapheme-cluster following the given
/// code-unit.
func indexOfCharacterBoundary(after i: CodeUnits.Index) -> CodeUnits.Index

Actually, if we do forgiving conversion when sharing indexes between String 
views, it might be nice to expose these explicit index-adjusting functions 
anyway.
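
As a rough sketch only, the two helpers could be approximated on top of today's String like this. It assumes the shared `String.Index` type stands in for `CodeUnits.Index`, and it uses a linear scan over Character boundaries for clarity rather than speed; a real implementation would presumably consult the grapheme breaker directly.

```swift
extension String {
    /// Returns the index at the start of the grapheme cluster containing `i`.
    /// Sketch: walks every Character boundary, so it is O(n).
    func indexOfCharacterBoundary(at i: String.Index) -> String.Index {
        precondition(i >= startIndex && i <= endIndex, "index out of bounds")
        if i == endIndex { return endIndex }
        var boundary = startIndex
        for candidate in indices {          // Character-aligned indices only
            if candidate > i { break }
            boundary = candidate
        }
        return boundary
    }

    /// Returns the index at the start of the grapheme cluster following `i`.
    func indexOfCharacterBoundary(after i: String.Index) -> String.Index {
        precondition(i < endIndex, "no cluster follows endIndex")
        return index(after: indexOfCharacterBoundary(at: i))
    }
}

let s = "e\u{301}x"                         // "é" (two code units) then "x"
// A UTF-16 index pointing at the combining accent, inside the first cluster.
let inside = s.utf16.index(s.utf16.startIndex, offsetBy: 1)
let start = s.indexOfCharacterBoundary(at: inside)
let next = s.indexOfCharacterBoundary(after: inside)
print(s[start], s[next])
```
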