Hello, Doug!

I)

AT> http://www.unicode.org/unicode/uni2book/ch03.pdf
AT> 1.
AT> - A single abstract character may correspond to more than one code
AT>   value - for example, U+00C5 ... LATIN CAPITAL LETTER A WITH RING and U+212B ... ANGSTROM SIGN
AT> 2.
AT> - Multiple code values may be required to represent a single abstract
AT>   character.

DE> I don't see a discrepancy between these two statements, at least not one
DE> that the phrase "more than one code value sequence" would clarify.

Yes, _this_ is the fragment that looks confusing to me.

2. says that a single abstract character may need more than one code value to be encoded. Okay, this is about surrogate pairs.

1. speaks about a single abstract character mapping to two _scalar values_. But then it should have said "A single abstract character may correspond to more than one SEQUENCE of 1 to 2 code values"!!

Imagine an abstract character corresponds to two scalar values over 0xFFFF. Then it corresponds to two PAIRS OF CODE VALUES, not to two CODE VALUES.

Doug?

---

II)

AT> For example, a byte is the code unit in SJIS:...
AT> ideographs require two code values

DE> I do think the text here is unclear about "code values" and "code
DE> units."

Doug, I did not mean to go that far :-)

DE> <http://www.unicode.org/unicode/reports/tr17/> between "code point" and
DE> "code unit."

Thanks for the link!

DE> A code point ... U+0410
DE> Code units are the two bytes 0xD0 0x90 required to express
DE> that code point in UTF-8, or the single 32-bit word 0x00000410 required
DE> to express it in UTF-32.

DE> Incorporating the concepts from UTR #17 into the main text is one place
DE> where the "language tightening" project for Unicode 4.0 should really
DE> pay off.

It looks to me that both concepts are already in ch03.pdf:

  A code value is also referred to as a code unit in the information industry

  A Unicode scalar value is also referred to as a code position or a code point in the information industry

Sure, "language tightening" will be good, but this was not the part of ch03.pdf that got me confused.

I personally am quite content with the
- code value, code unit
- code point, scalar value, code position
definitions :-)

- Anton
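
P.S. In case it helps to see the distinction concretely, here is a rough Python sketch of the cases discussed above. It is only an illustration under my own assumptions: the helper name `code_units` is made up for this message, and U+10330 is simply one arbitrary character above 0xFFFF that I picked as an example; neither comes from ch03.pdf or UTR #17.

```python
import unicodedata


def code_units(text: str, encoding: str, width: int) -> list[int]:
    """Split the encoded form of `text` into code units of `width` bytes.

    Hypothetical helper for illustration; `encoding` must be a big-endian
    codec without a BOM (e.g. "utf-16-be") so the units read naturally.
    """
    data = text.encode(encoding)
    return [int.from_bytes(data[i:i + width], "big")
            for i in range(0, len(data), width)]


# One code point (U+0410), different code units per encoding form.
cyrillic_a = "\u0410"
utf8_units = code_units(cyrillic_a, "utf-8", 1)       # two 8-bit units
utf32_units = code_units(cyrillic_a, "utf-32-be", 4)  # one 32-bit unit

# One code point above 0xFFFF: UTF-16 needs a surrogate PAIR of code units.
supplementary = "\U00010330"  # arbitrary example above 0xFFFF (assumption)
utf16_units = code_units(supplementary, "utf-16-be", 2)

# One abstract character, two scalar values: U+00C5 and U+212B.
# Canonical normalization (NFC) maps ANGSTROM SIGN onto U+00C5.
a_ring = "\u00C5"
angstrom = "\u212B"
same_after_nfc = unicodedata.normalize("NFC", angstrom) == a_ring

print([hex(u) for u in utf8_units])
print([hex(u) for u in utf32_units])
print([hex(u) for u in utf16_units])
print(same_after_nfc)
```

The last case is the one from point 1: two different scalar values for what is arguably the same abstract character. The UTF-16 case is point 2: one scalar value, two code values.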

