At 12:13 01/10/19 +0900, Soobok Lee wrote:
> > And with the existing draft, it does not explain how it is going to deal
> > with new codepoints in ISO10646 in the future, nor does it explain the
> > process for implementing them. The criterion here is stability: if new code
> > is added and the tables for re-ordering expand, then the algorithm should
> > not invalidate existing names.

I think this is an extremely important concern.

>REORDERING's mapping occurs within each script block, NOT across script
>blocks.
>Therefore, new additions of script blocks won't invalidate or collide
>with existing names. Even additions of new, rarely used characters in
>an existing script block won't affect the performance of REORDERING.
>In this respect, REORDERING maintains stability over time.

For additions of rare characters, in most cases yes. But let's assume
that these are characters that are not used in one language (A), but
are quite frequent in another (B). People using language (B) might then
come forward and ask for a different reordering to fit their language
better.

In general, it's easily possible that the statistics you currently
have favor e.g. the major language that uses that script, but
disfavor another language. It could e.g. be that the reordering for
Arabic makes Arabic names shorter, but makes Farsi or Urdu names
longer. Even if there may be many more Arabic than Farsi or Urdu
names, this would be quite unfair. And we just don't have enough
data at the moment to be able to say that this is not the case.

>Current IDNA/nameprep does not prohibit, but discourages, including
>unassigned code points in legal IDN labels, because new normalization/case
>mappings
>would be defined on them in the future. Some ACE labels including an unassigned
>code block (Tagalog?) might be proven invalid in the future. Nameprep/NFKC
>versioning tag schemes using a new ACE prefix will be needed in the future, I
>guess.

Yes. But for the majority of really useful characters, in old and
new scripts, it's rather obvious that they will be allowed.
On the other hand, it's totally unclear how to reorder them.

Also, in case of some implementation mistake in Nameprep/NFKC,
in most cases it will just make a few names unusable, but not
affect the rest. For reordering, a bug will completely confuse
a whole script.

Also, now we have a testbed, and you just think that the testbed
is representative. But once IDN is running, running a testbed for
a new script will be difficult, because we need the testbed
data for the reordering statistics, but we need the reordering
for the testbed.

>Therefore, REORDERING follows the same IDNA/nameprep recommendations on
>the issues
>of new SCRIPT blocks/unassigned code points.
>
>These statements will be included in the final version of the REORDERING I-D.

Well, in theory, this works. But I think it's just too unstable
to work in practice.

Regards,    Martin.
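
P.S. To make the fairness concern above concrete, here is a toy sketch in
Python. It is NOT the REORDERING algorithm from the draft. The 8-letter
"script block", the sample names, and the cost model (the first few ranks of
a table encode in one digit, all others in two) are all made up for
illustration. It only shows how a table fitted to the statistics of one
language (A) can lengthen names in another language (B) that shares the
same block.

```python
from collections import Counter

# Hypothetical script block, listed in code point order.
BLOCK = "abcdefgh"
# Assumption: ranks 0..3 encode in one digit, all other ranks in two.
CHEAP_SLOTS = 4


def build_table(samples, block=BLOCK):
    """Rank the block's characters by descending frequency in the samples.

    Ties are broken by code point order, so the result is deterministic.
    """
    counts = Counter(ch for name in samples for ch in name)
    ordered = sorted(block, key=lambda ch: (-counts[ch], block.index(ch)))
    return {ch: rank for rank, ch in enumerate(ordered)}


def cost(name, table):
    """Encoded length of a name under the toy cost model."""
    return sum(1 if table[ch] < CHEAP_SLOTS else 2 for ch in name)


def total(names, table):
    """Sum of encoded lengths over a list of names."""
    return sum(cost(name, table) for name in names)


# No reordering: rank equals position in code point order.
identity = {ch: rank for rank, ch in enumerate(BLOCK)}

# Made-up data: language A mostly uses e..h, language B mostly uses a..d.
lang_a_samples = ["hegf", "gfeh", "feha", "hgge"]
lang_b_names = ["abcd", "dcba", "bade"]

# The table is fitted to language A only, as if A dominated the testbed.
table_a = build_table(lang_a_samples)

if __name__ == "__main__":
    print("A names, no reordering:", total(lang_a_samples, identity))
    print("A names, A-fitted table:", total(lang_a_samples, table_a))
    print("B names, no reordering:", total(lang_b_names, identity))
    print("B names, A-fitted table:", total(lang_b_names, table_a))
```

With this made-up data, the A-fitted table shortens the language A names and
lengthens the language B names. Whether anything like this happens for real
scripts depends on statistics we do not have at the moment.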
