> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of McDonald, Ira
...
> This becomes even murkier.  W3C _was_ using NFC, as you say, but:

W3C is developing a "W3C normalisation", which is NFC augmented with
considerations of character references (e.g., in XML: &#xhhhh;).
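To illustrate why character references matter here, a minimal sketch follows. It is not the W3C algorithm: the function name expand_and_normalize is made up for this example, and html.unescape is used only as a convenient stand-in for reference expansion (it also expands HTML named references, which goes beyond plain &#xhhhh; forms). The point is that markup can pass an NFC check as raw text and still stop being NFC once its references are expanded.

```python
import html
import unicodedata


def expand_and_normalize(text: str) -> str:
    """Expand character references, then apply NFC (illustrative only)."""
    expanded = html.unescape(text)  # "&#x301;" becomes U+0301
    return unicodedata.normalize("NFC", expanded)


# 'e' followed by a reference to U+0301 COMBINING ACUTE ACCENT.
raw = "e&#x301;"

# The raw markup is plain ASCII, so NFC leaves it unchanged.
raw_is_nfc = unicodedata.normalize("NFC", raw) == raw

# After expansion, 'e' + U+0301 composes to U+00E9 under NFC.
result = expand_and_normalize(raw)
```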


> a)  When the SLP Project (successor to IETF Service Location WG)
>     recently asked for advice about which normalization to use
>     for SLP string compares, Harald Alvestrand -- author of 
>     RFC 2277 "IETF Policy on Character Sets and Languages" and 
>     RFC 3066 "Tags for the Identification of Languages" -- told
>     us to use NFKC (which folds compatibility equivalents into
>     their base characters).  Note that SLP service attributes
>     frequently contain URLs, so this amounts to advice to use
>     NFKC for comparing URLs.

Which seems reasonable (I haven't checked what SLP is), if properly
augmented (Hangul again...).
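
As a rough sketch of what an NFKC-based string compare looks like, assuming Python's standard unicodedata module (the helper name nfkc_equal is hypothetical, and nothing here addresses the Hangul augmentation mentioned above):

```python
import unicodedata


def nfkc_equal(a: str, b: str) -> bool:
    """Compare two strings after NFKC normalization (illustrative only)."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


def nfc_equal(a: str, b: str) -> bool:
    """Same comparison under NFC, for contrast."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


ligature = "\ufb01le"  # U+FB01 LATIN SMALL LIGATURE FI, then "le"
plain = "file"

# NFKC folds the compatibility ligature into "f" + "i"; NFC leaves it alone.
same_under_nfkc = nfkc_equal(ligature, plain)
same_under_nfc = nfc_equal(ligature, plain)
```

With NFC the two strings stay distinct, which is the practical difference between the two forms for comparison purposes.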

> b)  The latest "Stringprep Profile for Internationalized Host Names"
>     <draft-ietf-idn-nameprep-07.txt> (9 January 2002)
>     by Paul Hoffman (a Unicode and IETF guru) also uses NFKC.
>     Paul is co-author of RFC 2781 "UTF-16, an encoding of ISO 10646".

Still, Hoffman did not invent UTF-16.  That was done by people at the
Unicode Consortium.  Mark Davis was the editor for the amendment (as it
originally was) to 10646 that describes it on the 10646 side.

>     Note that IDN WG core specs are now in working group 'last call'.

They WERE in "last call".  The comments received will need to be
collected and acted upon.  The documents were not accepted 'as-is'.
It will take a while before new, revised, documents are available for
a new "last call".  Note that IDN also involves "case folding", so
that domain names remain case insensitive, as well as other mappings
in addition to those of NFKC.
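
A simplified sketch of combining case folding with NFKC follows. It is not nameprep: the real profile uses its own mapping and prohibition tables, while this example assumes Python's built-in str.casefold() as the folding step, and the name fold_label is hypothetical.

```python
import unicodedata


def fold_label(label: str) -> str:
    """Case-fold and NFKC-normalize a label (illustrative, not nameprep)."""
    step = unicodedata.normalize("NFKC", label)
    step = step.casefold()
    # Folding can produce text that is no longer normalized, so normalize again.
    return unicodedata.normalize("NFKC", step)


upper = fold_label("EXAMPLE")
lower = fold_label("example")
# U+FF25 is FULLWIDTH LATIN CAPITAL LETTER E, a compatibility character.
fullwidth = fold_label("\uff25XAMPLE")
```

All three inputs end up as the same folded string, which is what case-insensitive comparison of names needs.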


                Kind regards
                /kent k

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
