Mark Davis/Cupertino/IBM wrote:
> One evening at the recent W3C i18n meeting in Seattle, I wrote a program to
> generate data files that contain the differences between the XML
> identifiers, the Unicode recommended identifiers, and nameprep. I put the
> results on: http://www.macchiato.com/unicode/IdentifierDiff.txt
>
> This is only informational, to get an idea of how the three of them differ.
> I tried to segment the differences in a meaningful way within the file. In
> so doing, I also generated a data file that shows when characters came into
> Unicode. It is at http://www.macchiato.com/unicode/CharacterAge.txt.
>
> For Nameprep I actually used the canonical closure, where the canonical
> closure of X is the set of all characters that are canonically equivalent
> to a sequence of one or more characters from X.
>
> For both of these, if you view as UTF-8 you can see the characters as well
> as the names and code points.
>
> Mark
> ___
> Mark Davis, IBM Center for Java Technology, Cupertino
> (408) 777-5850 [fax: 5891], [EMAIL PROTECTED], [EMAIL PROTECTED]
> http://maps.yahoo.com/py/maps.py?Pyt=Tmap&addr=10275+N.+De+Anza&csz=95014
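
For anyone who wants to experiment with the "canonical closure" definition quoted above, here is a minimal Python sketch. It is not the program Mark describes (that source is not shown in the message); the function name canonical_closure and its two parameters are made up for illustration. It treats a candidate character as being in the closure of X when its NFD form can be split into a concatenation of the NFD forms of characters from X. That test should never report a character that is not canonically equivalent, but it may miss cases where the equivalence only shows up after combining marks are canonically reordered, so treat the result as a lower bound.

```python
import unicodedata


def canonical_closure(chars, candidates):
    """Return the candidates canonically equivalent to a sequence over `chars`.

    Hypothetical helper: `chars` is the base set X, `candidates` is the pool
    of characters to test (for example, every assigned code point).
    """
    # NFD form of every character in X; these are the building blocks.
    pieces = {unicodedata.normalize("NFD", ch) for ch in chars}
    closure = set()
    for cand in candidates:
        target = unicodedata.normalize("NFD", cand)
        # reachable[i] is True when target[:i] is a concatenation of pieces.
        reachable = [True] + [False] * len(target)
        for i in range(len(target)):
            if not reachable[i]:
                continue
            for piece in pieces:
                if target.startswith(piece, i):
                    reachable[i + len(piece)] = True
        if reachable[len(target)]:
            closure.add(cand)
    return closure


if __name__ == "__main__":
    # X contains only U+00C5 (A with ring above); test the BMP below U+3000.
    pool = [chr(cp) for cp in range(0x3000)]
    for ch in sorted(canonical_closure({"\u00C5"}, pool)):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}")
```

With X = {U+00C5}, the result should include U+212B ANGSTROM SIGN, since both characters decompose to the same sequence, while plain "A" stays out because it is not equivalent to any sequence drawn from X.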
