On 2/12/17 8:15 PM, Florian Zeitz wrote:
> On 13.02.2017 at 00:29, Peter Saint-Andre wrote:
>> John Klensin has brought to my attention that it is currently impossible
>> to represent some people's names in PRECIS usernames because some of the
>> relevant Unicode code points are disallowed by the IdentifierClass
>> defined in RFC 7564 (and thus by the UsernameCaseMapped and
>> UsernameCasePreserved profiles defined in RFC 7613).
>>
>> First, RFC 7564 disallows "default ignorable" code points in the
>> IdentifierClass. However, as I understand it some of these code points
>> are needed to represent characters in names that might be desirable to
>> people living within communities that use Indic script and eastern
>> Arabic script (e.g., Persian and writing systems derived from Persian).
>> In particular, the Unicode Standard specifies that ZWJ and ZWNJ are
>> "default ignorable" and it seems that these code points are especially
>> important in this context.
>>
> I'd have to look at it in more detail, but that assessment seems wrong
> to me.
> Algorithmically we check for JoinControl before
> PrecisIgnorableProperties, making ZWJ and ZWNJ CONTEXTJ.
> That allows them to occur after virama and where they break a cursive
> connection.

Correct.

> I'm not sure those are the only cases that John is
> concerned about, but they are not generally disallowed as I understand it.

Because I don't have a good understanding of the relevant scripts and
languages, I am dependent on people who do, such as Nalini Elkins. I
shall ping her again to see if she can share her insights.

> That said, I always found it a bit unsettling that it is virtually
> impossible to determine the algorithmic result from the textual
> description of what is and isn't allowed.

There is much that is unsettling about internationalization. We do the
best we can with what is given.

Peter
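
For anyone following the thread, the ordering Florian describes can be illustrated with a rough sketch. It covers only the two checks under discussion (JoinControl, then PrecisIgnorableProperties) and not the full RFC 7564 algorithm. The names `classify` and `after_virama` are hypothetical, the default-ignorable set is a small hand-picked sample and not the complete Unicode property, and the cursive-joining context rule is left out because Python's `unicodedata` module does not expose Joining_Type.

```python
import unicodedata

# Join_Control code points: ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER.
JOIN_CONTROL = {0x200C, 0x200D}

# Illustrative sample only (an assumption for this sketch): a few code points
# that carry Default_Ignorable_Code_Point. The real property is much larger.
DEFAULT_IGNORABLE_SAMPLE = {0x00AD, 0x200B, 0x200C, 0x200D, 0x2060, 0xFEFF}

VIRAMA_CCC = 9  # Canonical_Combining_Class value named "Virama"


def classify(cp: int) -> str:
    """Apply only the two checks discussed in this thread, in that order."""
    if cp in JOIN_CONTROL:
        # Reached first, so ZWJ/ZWNJ never fall through to the next test.
        return "CONTEXTJ"
    if cp in DEFAULT_IGNORABLE_SAMPLE:
        return "DISALLOWED"
    return "NOT COVERED BY THIS SKETCH"


def after_virama(text: str, index: int) -> bool:
    """Virama context check: is the character before text[index] a virama?"""
    if index == 0:
        return False
    return unicodedata.combining(text[index - 1]) == VIRAMA_CCC


if __name__ == "__main__":
    print(classify(0x200D))  # ZWJ
    print(classify(0x00AD))  # SOFT HYPHEN
    # DEVANAGARI KA + VIRAMA + ZWJ + SSA: the ZWJ sits at index 2.
    print(after_virama("\u0915\u094d\u200d\u0937", 2))
```

Swapping the two `if` tests in `classify` would send ZWJ and ZWNJ to DISALLOWED, which matches the reading in the original report; the order of the checks is what yields CONTEXTJ.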
