I'm still waiting to hear from experts with knowledge of Indic and eastern Arabic scripts. However, I have been corresponding offlist with someone who has knowledge of the second issue I raised...
>>> Second, apparently some Chinese family names

According to my correspondent, the challenge is representation of some given names, not family names (e.g., legislation in Taiwan stipulates that a given name can include any character that has ever appeared in a dictionary, even dictionaries published hundreds of years ago).

>>> are typically
>>> written (especially outside the People's Republic of China)
>>> using characters that the Unicode Consortium assigns to
>>> non-BMP code points

John, forgive my ignorance, but it seems to me that the plane is irrelevant here: in PRECIS we base decisions on code point properties. Thus, for instance, any code point whose Unicode general category is "Lo" (other letter) is allowed in the PRECIS IdentifierClass (per Section 9.1 of RFC 7564), regardless of the plane. As an example, a code point like U+2F804 (CJK COMPATIBILITY IDEOGRAPH-2F804) would be allowed, even though it is in the Supplementary Ideographic Plane.

>>> or assigns in the BMP but as
>>> compatibility decomposable characters (and thus disallowed by
>>> RFC 7564 in the IdentifierClass).

My correspondent said it should be fine to disallow compatibility decomposable characters such as U+328A (CIRCLED IDEOGRAPH MOON) because according to him they would not be used in given or family names.

All of this is second-hand, so take it with a grain of salt.

Peter

_______________________________________________
precis mailing list
https://www.ietf.org/mailman/listinfo/precis
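
P.S. For anyone who wants to check the properties of specific code points discussed above, here is a minimal sketch using Python's standard unicodedata module. It reports only the general category, the plane, and whether the code point changes under NFKC (the test that RFC 7564 Section 9.17 calls HasCompat). It is not an implementation of the full PRECIS derivation algorithm, the helper name describe() is a hypothetical one chosen for illustration, and the results depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata


def describe(cp: int) -> dict:
    """Return the properties of one code point that matter in this thread."""
    ch = chr(cp)
    return {
        "code_point": f"U+{cp:04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        # Unicode general category, e.g. "Lo" (other letter) or "So" (other symbol)
        "category": unicodedata.category(ch),
        # HasCompat as defined in RFC 7564 Section 9.17: toNFKC(cp) != cp
        "has_compat": unicodedata.normalize("NFKC", ch) != ch,
        # Plane 0 is the BMP, plane 2 is the Supplementary Ideographic Plane
        "plane": cp >> 16,
    }


if __name__ == "__main__":
    # U+2F804 and U+328A are the examples from the thread; U+20000 is an
    # extra non-BMP unified ideograph added here for comparison.
    for cp in (0x2F804, 0x328A, 0x20000):
        print(describe(cp))
```

Note that the two checks are independent: a code point can have general category "Lo" and still change under NFKC, so it seems worth running both on any code point of interest (including U+2F804) rather than relying on the category or the plane alone.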
