> On 23 Jan 2017, at 06:54, Félix Cloutier via swift-evolution
> <[email protected]> wrote:
>
>
>>> doesn't necessarily mean that ignoring that case is the right thing to do.
>>> In fact, it means that Unicode won't do anything to protect programs
>>> against these, and if Swift doesn't, chances are that no one will. Isolated
>>> combining characters break a number of expectations that developers could
>>> reasonably have:
>>>
>>> (a + b).count == a.count + b.count
>>> (a + b).startsWith(a)
>>> (a + b).endsWith(b)
>>> (a + b).find(a) // or .find(b)
>>>
>>> Of course, this can be documented, but people want easy, and documentation
>>> is hard.
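
For anyone who wants to see the breakage concretely, here is a minimal sketch
in current Swift. It is an illustration only: `hasPrefix`/`hasSuffix` stand in
for the `startsWith`/`endsWith` pseudocode above, and `firstIndex(of:)` stands
in for `find`.

```swift
let a = "e"
let b = "\u{301}"    // COMBINING ACUTE ACCENT on its own (a "defective" string)

let joined = a + b   // the accent attaches to the "e": one grapheme cluster

print(a.count, b.count, joined.count)       // 1 1 1
print(joined.count == a.count + b.count)    // false
print(joined.hasPrefix(a))                  // false
print(joined.hasSuffix(b))                  // false
print(joined.firstIndex(of: "e") == nil)    // true: "e" is no longer a Character of `joined`
```
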
>>
>> Yes. Unfortunately they also want the ability to append a string
>> consisting of a combining character to another string and have it append.
>> And they don't want to be prevented from forming valid-but-defective Unicode
>> strings.
>>
>> […]
>>
>> Can you suggest an alternative that doesn't violate the Unicode standard and
>> supports the expected use-cases, somehow?
>
>
> I'm not sure I understand. Did we go from "this is a degenerate/defective
> <https://github.com/apple/swift/blob/master/docs/StringManifesto.md#string-should-be-a-collection-of-characters-again>
> case that we shouldn't bother with" to "this is a supported use case that
> needs to work as-is"? I've never seen anyone start a string with a combining
> character on purpose, though I'm familiar with just one natural language that
> needs combining characters. I can imagine that it could be a convenient
> feature in other natural languages.
>
> However, if Swift Strings are now designed for machine processing and less
> for human language convenience, for me, it's easy enough to justify a safe
> default in the context of machine processing: `a+b` will not combine the end
> of `a` with the start of `b`. You could do this by inserting a ◌ that `b`
> could combine with if necessary. That solution would make half of the cases
> that I've mentioned work as expected and make the operation always safe, as
> far as I can tell.
>
> In that world, it would be a good idea to have a `&+` fallback or something
> similar that lets characters combine. I would think that this is a much
> less common use case than serializing strings, though.
>
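
A rough sketch of what that "safe" concatenation could look like, to make the
"half of the cases" claim checkable. Everything here is hypothetical:
`safeConcat` and `startsWithCombiningMark` are names made up for illustration,
the general-category test is only an approximation of "combining character",
and leaving the result alone when `a` is empty is my assumption.

```swift
// Hypothetical helper, not part of the proposal or the standard library.
// Approximates "starts with a combining character" by looking at the general
// category of the first scalar (Mn, Mc or Me).
func startsWithCombiningMark(_ s: String) -> Bool {
    guard let first = s.unicodeScalars.first else { return false }
    switch first.properties.generalCategory {
    case .nonspacingMark, .spacingMark, .enclosingMark:
        return true
    default:
        return false
    }
}

// Hypothetical "safe" concatenation: put U+25CC DOTTED CIRCLE in front of `b`
// so that its leading mark cannot attach to the end of `a`.
func safeConcat(_ a: String, _ b: String) -> String {
    if !a.isEmpty && startsWithCombiningMark(b) {
        return a + "\u{25CC}" + b
    }
    return a + b
}

let a = "e"
let b = "\u{301}"              // isolated COMBINING ACUTE ACCENT
let joined = safeConcat(a, b)  // "e" followed by a dotted circle carrying the accent

print(joined.count == a.count + b.count)   // true
print(joined.hasPrefix(a))                 // true
print(joined.hasSuffix(b))                 // false: the last Character is now the circle plus accent
```

So the count and prefix expectations hold again, while the suffix (and a search
for `b`) still fail.
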
>>> My second concern is with how easy it is to convert an Int to a String
>>> index. I've been vocal about this before: I'm concerned that Swift
>>> developers will equate Ints with random-access String iterators, which they
>>> are emphatically not. String.Index(100) is proposed as a constant-time
>>> operation
>>
>> No, that has not been proposed. It would be
>>
>> String.Index(codeUnitOffset: 100)
>>
>> It's hard to strike a balance between keeping programmers from making
>> mistakes and making the important use-cases easy. Do you have any
>> suggestions for improving on what we've proposed?
>
> That's still one extension away from String.Index(100), and one function away
> from an even more convenient form. I don't have a great solution, but I don't
> have a great understanding of the problem that this is solving either. I'm
> leaving it here because, AFAIK, Swift 3 imposes constraints that are hard to
> ignore and mostly beneficial to people outside of the English bubble, and it
> seems that the proposed index regresses on this.
>
> I'm perfectly happy with interchanging indices between the different views of
> a String. It's getting the offset in or out of the index that I think leads
> people to make incorrect assumptions about strings.

We could have a pair of helper functions to search for the grapheme cluster
boundary relative to a given CodeUnits.Index:

    /// Returns the index at the start of the grapheme-cluster containing the
    /// given code-unit.
    func indexOfCharacterBoundary(at i: CodeUnits.Index) -> CodeUnits.Index

    /// Returns the index at the start of the grapheme-cluster following the
    /// given code-unit.
    func indexOfCharacterBoundary(after i: CodeUnits.Index) -> CodeUnits.Index

Actually, if we do forgiving conversion when sharing indexes between String
views, it might be nice to expose these explicit index-adjusting functions
anyway.
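
As a rough sketch of the semantics I have in mind for the two helpers, here is
a version that compiles with today's String. The assumptions: there is no
`CodeUnits` view yet, so a plain `String.Index` taken from the UTF-16 view
stands in for `CodeUnits.Index`, and the linear scan from the start is only for
clarity, not how a real implementation would do it.

```swift
extension String {
    /// Returns the index at the start of the grapheme cluster containing the
    /// code unit at `i`. Sketch only: O(n) walk over Character boundaries.
    func indexOfCharacterBoundary(at i: String.Index) -> String.Index {
        precondition(i < endIndex, "index out of bounds")
        var start = startIndex
        while true {
            let next = index(after: start)   // start of the next Character
            if i < next { return start }     // `i` lies inside [start, next)
            start = next
        }
    }

    /// Returns the index at the start of the grapheme cluster following the
    /// one that contains the code unit at `i`.
    func indexOfCharacterBoundary(after i: String.Index) -> String.Index {
        return index(after: indexOfCharacterBoundary(at: i))
    }
}

let s = "e\u{301}x"   // "e" + combining acute accent, then "x"

// Index of the second UTF-16 code unit, i.e. the combining accent itself.
let mid = s.utf16.index(s.utf16.startIndex, offsetBy: 1)

let containing = s.indexOfCharacterBoundary(at: mid)
let following = s.indexOfCharacterBoundary(after: mid)

print(containing == s.startIndex)   // true
print(s[following])                 // x
```
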