John Klensin has brought to my attention that it is currently impossible to represent some people's names in PRECIS usernames because some of the relevant Unicode code points are disallowed by the IdentifierClass defined in RFC 7564 (and thus by the UsernameCaseMapped and UsernameCasePreserved profiles defined in RFC 7613).

First, RFC 7564 disallows "default ignorable" code points in the IdentifierClass. However, as I understand it, some of these code points are needed to represent characters in names used by people in communities that write in Indic scripts and eastern Arabic scripts (e.g., Persian and writing systems derived from Persian). In particular, the Unicode Standard specifies that ZWJ (U+200D) and ZWNJ (U+200C) are "default ignorable", and it seems that these code points are especially important in this context.

Second, apparently some Chinese family names are typically written (especially outside the People's Republic of China) using characters that the Unicode Consortium either assigns to non-BMP code points or assigns to BMP code points as compatibility decomposable characters (which RFC 7564 therefore disallows in the IdentifierClass).
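
For anyone who wants to poke at these two cases, here is a rough Python sketch that reports the relevant properties of a code point. It is only an illustration, not the RFC 7564 algorithm: the helper names (has_compat, is_non_bmp, describe) are mine, the sample code points U+F900 and U+20BB7 are arbitrary illustrative choices rather than characters identified in this thread, and I am assuming the compatibility test amounts to comparing a code point with its NFKC form. As far as I know, Python's unicodedata module does not expose the Default_Ignorable_Code_Point property, so ZWNJ and ZWJ are simply hardcoded.

```python
import unicodedata

# Join controls discussed above (hardcoded; assumption: unicodedata has no
# direct lookup for the Default_Ignorable_Code_Point property).
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER


def has_compat(ch: str) -> bool:
    """Return True if NFKC normalization changes the single character ch."""
    return unicodedata.normalize("NFKC", ch) != ch


def is_non_bmp(ch: str) -> bool:
    """Return True if ch lies outside the Basic Multilingual Plane."""
    return ord(ch) > 0xFFFF


def describe(ch: str) -> dict:
    """Collect the properties relevant to the two cases in this message."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "category": unicodedata.category(ch),
        "has_compat": has_compat(ch),
        "non_bmp": is_non_bmp(ch),
    }


if __name__ == "__main__":
    # Illustrative samples only: the two join controls, a CJK compatibility
    # ideograph in the BMP, and a CJK ideograph outside the BMP.
    for sample in (ZWNJ, ZWJ, "\uf900", "\U00020bb7"):
        print(describe(sample))
```

Running describe() over the code points in a candidate username gives a quick view of which ones fall into either of the two cases, though the authoritative answer would still come from the property calculation in RFC 7564.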

I'm not sure whether we can solve these problems (internationalization is messy and we've never tried to guarantee that any particular name or preferred string could be represented in PRECIS usernames), but input from people with a deeper understanding of these issues would be appreciated. I have attempted to reach out to relevant experts, and will report back to this list with any findings.

In the meantime, I plan to submit revised I-Ds addressing other issues with the PRECIS specifications sometime this evening.

Peter