Lars Marius Garshol scripsit:

> - will string comparison methods based on NFC and NFD always give the
>   same results?

By intention, yes.

> - is it correct that methods based on NFKC and NFKD will give
>   different results from ones based on NFC/NFD?

Yes.

> - if NFC and NFD give the same results, why are both specified? Why
>   would an implementation choose one over the other?

Originally, only NFD was given, as it is sufficient. However, text
converted from non-Unicode encodings is generally already in NFC, so
specifying NFC (which is conceptually NFD with a post-processing pass
to re-create certain precomposed characters) has certain practical
advantages. In particular, if you are doing "early normalization",
near the point of creation, then NFC allows easy step-down to
non-Unicode encodings.

> - NFKC/NFKD seem to lose significant information; in what contexts
>   are they intended to be used?

Compatibility distinctions may or may not be important in particular
cases: often they represent distinctions that are merely historical.
One context where compatibility distinctions are typically unimportant
is in identifiers.

-- 
John Cowan <[EMAIL PROTECTED]>     http://www.reutershealth.com
I amar prestar aen, han mathon ne nen,    http://www.ccil.org/~cowan
han mathon ne chae, a han noston ne 'wilith.  --Galadriel, _LOTR:FOTR_
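The answers above can be illustrated with a minimal sketch, assuming Python's standard unicodedata module. The helper name equal_under and the sample strings are illustrative only and do not come from the original message. The sketch compares a precomposed and a decomposed "é" under NFC and NFD, compares the "fi" ligature with plain "fi" under a canonical and a compatibility form, and shows the step-down to a non-Unicode encoding (Latin-1 is chosen here as an example).

```python
import unicodedata


def equal_under(form: str, a: str, b: str) -> bool:
    """Compare two strings after normalizing both to the given form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT
ligature = "\ufb01"      # LATIN SMALL LIGATURE FI, a compatibility character

# NFC-based and NFD-based comparison agree on canonically equivalent text.
nfc_result = equal_under("NFC", precomposed, decomposed)
nfd_result = equal_under("NFD", precomposed, decomposed)

# Canonical forms keep the ligature distinct; compatibility forms fold it.
canonical_fi = equal_under("NFC", ligature, "fi")
compat_fi = equal_under("NFKC", ligature, "fi")

# After NFC, the decomposed input fits in a single Latin-1 byte.
latin1_bytes = unicodedata.normalize("NFC", decomposed).encode("latin-1")

print(nfc_result, nfd_result, canonical_fi, compat_fi, latin1_bytes)
```

Encoding the NFD form of the same string to Latin-1 would fail, because the combining accent has no Latin-1 code point. That is the practical advantage of NFC for early normalization described in the reply.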

