On Sun, Feb 12, 2017 at 12:27 PM, Peter Saint-Andre <stpe...@stpeter.im> wrote:
> Did you mean U+212A (KELVIN SIGN)? That decomposes to U+004B (LATIN CAPITAL
> LETTER K).
>
>> The full example is:
>> "\U0001f11aevin" => "(K)evin" => "(k)evin"

I'm talking about U+1F11A (PARENTHESIZED LATIN CAPITAL LETTER K).
Sorry, it wasn't clear that the trailing "a" is part of the Unicode escape.

With casefold or tolower, the result is the same for these Nicknames:

Not idempotent: "\U0001f11a" => "(K)" => "(k)"
Not idempotent: "\U0001f13a" => "K" => "k"
Not idempotent: "\u210c" => "H" => "h"
Not idempotent: "\u210d" => "H" => "h"
Not idempotent: "\u20a8" => "Rs" => "rs"

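When you apply the comparison steps from RFC 7700, Section 2.4, once,
you still get something that is upper case: case mapping runs before
NFKC normalization, and these characters have no lower-case mapping of
their own. If you apply the comparison steps a second time, you get
lower case.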
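
Here is a minimal sketch of that two-pass behaviour in Python. It
assumes the comparison steps reduce to "case map, then NFKC" and leaves
out the white space rules; the helper name nickname_compare_form is
made up for illustration, and the output depends on the Unicode data
bundled with your Python.

```python
import unicodedata

def nickname_compare_form(s, fold=str.lower):
    """Hypothetical helper: case-map first, then apply NFKC.

    Pass fold=str.casefold to try case folding instead of lower-casing.
    White space handling is deliberately omitted.
    """
    return unicodedata.normalize("NFKC", fold(s))

# The five examples listed above.
SAMPLES = ["\U0001f11a", "\U0001f13a", "\u210c", "\u210d", "\u20a8"]

for ch in SAMPLES:
    once = nickname_compare_form(ch)
    twice = nickname_compare_form(once)
    status = "idempotent" if once == twice else "Not idempotent"
    print(f"{status}: U+{ord(ch):04X} => {once!r} => {twice!r}")
```

By contrast, U+212A (KELVIN SIGN) has its own lower-case mapping, so the
first pass already yields "k" and a second pass changes nothing.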

>> I wrote a program to categorize characters that are not idempotent
>> under Nickname "ToLower" (ignoring white space). The numbers are the
>> same for Unicode 6.3, 8.0 and 9.0.
>>
>> {
>>   '<font>': 467,
>>   '<square>': 90,
>>   '<compat>': 35,
>>   '<super>': 27,
>>   '<circle>': 4
>> }
>
>
> Would you mind sending me your list of characters?

I will send it to you in a separate email.
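
In the meantime, a tally like the one quoted above can be approximated
with a rough sketch such as the following. This is not the program I
used; it assumes Python's unicodedata module, groups characters by the
tag on their own decomposition mapping, and skips the white space
rules. The function names are made up, and the totals depend on the
Unicode version bundled with Python, so they may not match the numbers
above.

```python
import unicodedata
from collections import Counter

def nickname_lower(s):
    # Assumed reduction of the comparison steps: lower-case, then NFKC.
    return unicodedata.normalize("NFKC", s.lower())

def tally_non_idempotent():
    """Count code points whose result changes on a second pass,
    grouped by decomposition tag such as '<font>' or '<square>'."""
    counts = Counter()
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # skip surrogate code points
        ch = chr(cp)
        once = nickname_lower(ch)
        if once == ch:
            continue  # unchanged input is trivially idempotent
        if nickname_lower(once) != once:
            decomp = unicodedata.decomposition(ch)
            # Compatibility mappings start with a tag like '<font>';
            # anything else is lumped under a placeholder label.
            tag = decomp.split()[0] if decomp.startswith("<") else "<none>"
            counts[tag] += 1
    return counts

if __name__ == "__main__":
    for tag, n in tally_non_idempotent().most_common():
        print(tag, n)
```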

Thanks,
Bill

_______________________________________________
precis mailing list
precis@ietf.org
https://www.ietf.org/mailman/listinfo/precis
