> On Jan 23, 2017, at 4:08 AM, Karl Wagner <[email protected]> wrote:
> 
> 
>> On 23 Jan 2017, at 06:54, Félix Cloutier via swift-evolution 
>> <[email protected]> wrote:
>> 
>> 
>>>> doesn't necessarily mean that ignoring that case is the right thing to do. 
>>>> In fact, it means that Unicode won't do anything to protect programs 
>>>> against these, and if Swift doesn't, chances are that no one will. 
>>>> Isolated combining characters break a number of expectations that 
>>>> developers could reasonably have:
>>>> 
>>>> (a + b).count == a.count + b.count
>>>> (a + b).startsWith(a)
>>>> (a + b).endsWith(b)
>>>> (a + b).find(a) // or .find(b)
>>>> 
>>>> Of course, this can be documented, but people want easy, and documentation 
>>>> is hard.
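
To make the first expectation in the list above concrete, here is a minimal sketch. It assumes Swift 4 or later, where String is a Collection of Characters and `.count` counts grapheme clusters. The names `a`, `b`, and `joined` are only illustrative.

```swift
// U+0301 COMBINING ACUTE ACCENT on its own is a valid but "defective" string.
let a = "e"
let b = "\u{301}"
let joined = a + b

print(a.count)       // 1
print(b.count)       // 1 (an isolated combining mark is its own Character)
print(joined.count)  // 1, not 2: the accent merged into the preceding "e"
```

So `(a + b).count == a.count + b.count` does not hold for this pair.
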
>>> 
>>> Yes.  Unfortunately they also want the ability to append a string 
>>> consisting of a combining character to another string and have it append.  
>>> And they don't want to be prevented from forming valid-but-defective 
>>> Unicode strings.
>>> 
>>> […]
>>> 
>>> Can you suggest an alternative that doesn't violate the Unicode standard 
>>> and supports the expected use-cases, somehow? 
>> 
>> 
>> I'm not sure I understand. Did we go from "this is a degenerate/defective 
>> case that we shouldn't bother with" to "this is a supported use case that 
>> needs to work as-is"? I've never seen anyone start a string with a combining 
>> character on purpose, though I'm familiar with just one natural language 
>> that needs combining characters. I can imagine that it could be a convenient 
>> feature in other natural languages.
>> 
>> However, if Swift Strings are now designed for machine processing and less 
>> for human language convenience, for me, it's easy enough to justify a safe 
>> default in the context of machine processing: `a+b` will not combine the end 
>> of `a` with the start of `b`. You could do this by inserting a ◌ that `b` 
>> could combine with if necessary. That solution would make half of the cases 
>> that I've mentioned work as expected and make the operation always safe, as 
>> far as I can tell.
>> 
>> In that world, it would be a good idea to have a `&+` fallback or something 
>> like that that will let characters combine. I would think that this is a 
>> much less common use case than serializing strings, though.
>> 
>>>> My second concern is with how easy it is to convert an Int to a String 
>>>> index. I've been vocal about this before: I'm concerned that Swift 
>>>> developers will equate Ints with random-access String iterators, which 
>>>> they are emphatically not. String.Index(100) is proposed as a 
>>>> constant-time operation
>>> 
>>> No, that has not been proposed.  It would be 
>>> 
>>> String.Index(codeUnitOffset: 100)
>>> 
>>> It's hard to strike a balance between keeping programmers from making 
>>> mistakes and making the important use-cases easy.  Do you have any 
>>> suggestions for improving on what we've proposed?
>> 
>> That's still one extension away from String.Index(100), and one function 
>> away from an even more convenient form. I don't have a great solution, but I 
>> don't have a great understanding of the problem that this is solving either. 
>> I'm leaving it here because, AFAIK, Swift 3 imposes constraints that are 
>> hard to ignore and mostly beneficial to people outside of the English 
>> bubble, and it seems that the proposed index regresses on this.
>> 
>> I'm perfectly happy with interchanging indices between the different views 
>> of a String. It's getting the offset in or out of the index that I think 
>> lets people make incorrect assumptions about strings.
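
A small sketch of why an integer offset is ambiguous, assuming Swift 5 or later. It uses only the existing String views, not the `String.Index(codeUnitOffset:)` spelling discussed above. The same string has a different length in each view, so "offset 5" means something different depending on which view it came from.

```swift
// "café 🇨🇦", written with escapes so that é is one precomposed scalar.
let s = "caf\u{E9} \u{1F1E8}\u{1F1E6}"

print(s.count)                 // 6 Characters
print(s.unicodeScalars.count)  // 7
print(s.utf16.count)           // 9
print(s.utf8.count)            // 14

// Reaching the sixth Character means walking grapheme boundaries from the
// start, which is linear time rather than constant time.
let flagIndex = s.index(s.startIndex, offsetBy: 5)
print(s[flagIndex])            // 🇨🇦
```
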
> 
> We could have a pair of helper functions to search for the grapheme cluster 
> boundary relative to a given CodeUnits.Index:
> 
> /// Returns the index at the start of the grapheme-cluster containing the
> /// given code-unit.
> func indexOfCharacterBoundary(at i: CodeUnits.Index) -> CodeUnits.Index
> 
> /// Returns the index at the start of the grapheme-cluster following the
> /// given code-unit.
> func indexOfCharacterBoundary(after i: CodeUnits.Index) -> CodeUnits.Index

What problem does this proposed API solve?

> Actually, if we do forgiving conversion when sharing indexes between String 
> views, it might be nice to expose these explicit index-adjusting functions 
> anyway.