I'm working on a specification for a data model and would like to check that my definition of the string type makes sense.
The definition currently says:

  <dt>String</dt>
  <dd><p>Strings are sequences of Unicode code points conforming to
  Unicode Normalization Form C <xref to="unicode"/>.</p>

  <p>Strings are equal if they consist of the exact same sequence of
  abstract Unicode characters. This implies that all comparisons are
  case-sensitive.</p>

Does this make sense? Is "code point" the right term, or should I say "scalar value"? And what about "abstract character"? Are two equal sequences of code points in NFC necessarily composed of the same sequence of abstract characters?

Thanks for any help!

-- 
Lars Marius Garshol, Ontopian <URL: http://www.ontopia.net >
GSM: +47 98 21 55 50 <URL: http://www.garshol.priv.no >
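P.S. To make the question concrete, here is a minimal Python sketch of one possible reading of the definition. It is only an illustration under stated assumptions, not part of the specification: the function names (is_nfc, has_only_scalar_values, strings_equal) are hypothetical, and it assumes that "equal" means code-point-by-code-point comparison of strings that are already in NFC. It relies on Python's str being a sequence of code points, which may include lone surrogates (code points that are not scalar values).

```python
import unicodedata


def is_nfc(s: str) -> bool:
    """Return True if s is already in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", s) == s


def has_only_scalar_values(s: str) -> bool:
    """Return True if s contains no surrogate code points (U+D800..U+DFFF).

    Every scalar value is a code point, but surrogate code points are
    not scalar values.
    """
    return all(not (0xD800 <= ord(ch) <= 0xDFFF) for ch in s)


def strings_equal(a: str, b: str) -> bool:
    """Compare two strings under the assumed reading of the definition.

    Assumption: both inputs must already be in NFC; equality is then a
    plain code-point-by-code-point comparison, so it is case-sensitive.
    """
    if not (is_nfc(a) and is_nfc(b)):
        raise ValueError("both strings must already be in NFC")
    return a == b


composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

print(is_nfc(composed))                                      # composed form
print(is_nfc(decomposed))                                    # decomposed form
print(unicodedata.normalize("NFC", decomposed) == composed)  # after normalizing
print(strings_equal("A", "a"))                               # case matters
print(has_only_scalar_values("\ud800"))                      # lone surrogate
```

Under this reading, the decomposed form is rejected rather than silently normalized, so whether the specification should normalize input or merely require NFC remains an open design choice.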

