Hi,

I am considering switching to UTC as the source of our derived IDNA2008 tables, for simple support of Unicode > 12. For Unicode <= 12 this makes no difference except for U+19DA, which UTC has as PVALID and IANA as DISALLOWED. This means idn2 behaviour changes from:

jas@latte:~$ echo ᧚|idn2
idn2: toAscii: string contains a disallowed character

into

jas@latte:~/src/libidn2/src$ echo ᧚|./idn2
xn--pkf

This actually goes back to libidn2 0.11 behaviour, which also resulted in xn--pkf since it used Unicode < 6.0.0:

jas@latte:~/src/libidn2-0.11/src$ ./idn2 --version|head -1
idn2 (idn2) 0.11
jas@latte:~/src/libidn2-0.11/src$ echo ᧚|./idn2
xn--pkf
jas@latte:~/src/libidn2-0.11/src$

The xn--pkf output is consistent with some other IDNA2008 implementations:

https://icu4c-demos.unicode.org/icu-bin/idnbrowser?t=xn--th5h
https://idnaconv.net/try-it.html?encoded=xn--th5h&decode=%3C%3C+Decode

There may be other differences between UTC derived values and IANA derived values for Unicode > 12 and <= 15 once IANA gets around to publishing tables, but we can't tell until that happens, and I'm not holding my breath since they haven't published anything for 12.1.0 (2019-03), 13.0.0 (2019-11), 14.0.0 (2021) nor 15.0.0 (2022-05).

I don't have a strong opinion on this, but some of the factors involved are:

1) consistency with other implementations

2) importance of U+19DA (which is rare) and practical problems resulting from this change (apparently few)

3) support for Unicode > 12 now (most important of these factors IMO)

4) domain name stability: once derived for a code point, the property shouldn't change in the future. Thus, the change in 0.12 could be considered the bug here.

I believe I agreed with the approach used by RFC 6452 at the time it was published, but revisiting this issue today I find myself in the opposite camp. It is a subjective judgement call, and there are good arguments for both sides.

If you want to provide feedback on this, please respond here or to this issue:

https://gitlab.com/libidn/libidn2/-/issues/112

/Simon
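
PS. For anyone who wants to check the xn--pkf label without idn2 at hand, here is a minimal sketch in Python. It assumes only the standard library's "punycode" codec; the helper name raw_a_label is made up for illustration. This is not how libidn2 works internally: it does the raw Punycode step only and consults no IDNA2008 derived property table, so it cannot tell PVALID from DISALLOWED.

```python
import unicodedata


def raw_a_label(u_label: str) -> str:
    """Return the ACE form of one label, with no IDNA2008 validity checks.

    Hypothetical helper for illustration: it only Punycode-encodes the
    label and prepends the ACE prefix.
    """
    return "xn--" + u_label.encode("punycode").decode("ascii")


cp = "\u19da"

# Character name and general category as known to this Python's Unicode data.
print(unicodedata.name(cp), unicodedata.category(cp))

# Raw Punycode with the ACE prefix.
print(raw_a_label(cp))  # prints "xn--pkf"
```

Whether an implementation returns that label or rejects the input is decided entirely by the derived property table it consults, which is the UTC versus IANA choice discussed above.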