On 2/12/17 8:15 PM, Florian Zeitz wrote:
> On 13.02.2017 at 00:29, Peter Saint-Andre wrote:
>> John Klensin has brought to my attention that it is currently impossible
>> to represent some people's names in PRECIS usernames because some of the
>> relevant Unicode code points are disallowed by the IdentifierClass
>> defined in RFC 7564 (and thus by the UsernameCaseMapped and
>> UsernameCasePreserved profiles defined in RFC 7613).
>>
>> First, RFC 7564 disallows "default ignorable" code points in the
>> IdentifierClass. However, as I understand it some of these code points
>> are needed to represent characters in names that might be desirable to
>> people living within communities that use Indic script and eastern
>> Arabic script (e.g., Persian and writing systems derived from Persian).
>> In particular, the Unicode Standard specifies that ZWJ and ZWNJ are
>> "default ignorable" and it seems that these code points are especially
>> important in this context.
>>
> I'd have to look at it in more detail, but that assessment seems wrong
> to me.
> Algorithmically we check for JoinControl before
> PrecisIgnorableProperties, making ZWJ and ZWNJ CONTEXTJ.
> That allows them to occur after virama and where they break a cursive
> connection.

Correct.
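
To make the ordering concrete, here is a minimal sketch in Python. It
models only the two checks discussed above, not the full derivation
algorithm of RFC 7564. The function names and the small sample set of
default-ignorable code points are illustrative assumptions; a real
implementation would derive these sets from the Unicode Character
Database. The contextual check covers only the "after virama" case,
since the cursive-joining case needs Joining_Type data that Python's
unicodedata module does not expose.

```python
import unicodedata

# Join_Control code points: ZWNJ and ZWJ.
JOIN_CONTROL = {0x200C, 0x200D}

# Illustrative sample of Default_Ignorable_Code_Point (assumption: a
# hand-picked subset, not the full property from the UCD).
DEFAULT_IGNORABLE_SAMPLE = {0x00AD, 0x200B, 0x200C, 0x200D, 0x2060, 0xFEFF}

VIRAMA_CCC = 9  # Canonical_Combining_Class value named "Virama"


def classify(cp: int) -> str:
    """Apply only the two checks under discussion, in the stated order."""
    if cp in JOIN_CONTROL:
        # Checked first, so ZWJ/ZWNJ never reach the ignorable test.
        return "CONTEXTJ"
    if cp in DEFAULT_IGNORABLE_SAMPLE:
        return "DISALLOWED"
    return "OTHER"  # remaining rules are not modeled here


def follows_virama(text: str, index: int) -> bool:
    """True if the code point before text[index] has combining class Virama."""
    if index == 0:
        return False
    return unicodedata.combining(text[index - 1]) == VIRAMA_CCC


# Devanagari KA + VIRAMA + ZWJ: the ZWJ is CONTEXTJ and follows a virama.
sample = "\u0915\u094D\u200D"
print(classify(ord(sample[2])), follows_virama(sample, 2))
```

If the two tests in classify() were swapped, ZWJ and ZWNJ would come out
DISALLOWED, which is the outcome the original concern assumed.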

> I'm not sure those are the only cases that John is
> concerned about, but they are not generally disallowed as I understand it.

Because I don't have a good understanding of the relevant scripts and
languages, I am dependent on people who do, such as Nalini Elkins. I
shall ping her again to see if she can share her insights.

> That said, I always found it a bit unsettling that it is virtually
> impossible to determine the algorithmic result from the textual
> description of what is and isn't allowed.

There is much that is unsettling about internationalization. We do the
best we can with what is given.

Peter

_______________________________________________
precis mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/precis