----- Original Message -----
From: "Marcin 'Qrczak' Kowalczyk" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, December 10, 2004 8:35 PM
Subject: Re: Nicest UTF



"Philippe Verdy" <[EMAIL PROTECTED]> writes:

The XML/HTML core syntax is defined with fixed behavior of some
individual characters like '&', '<', quotation marks, and with special
behavior for spaces.

The point is: what does "characters" mean in this sentence? Code points? Combining character sequences? Something else?

See the XML character model document... XML ignores combining sequences. For both Unicode and XML, a character is an abstract character with a single code point allocated in a *finite* repertoire. By contrast, the repertoire of all possible combining character sequences is already infinite in Unicode, as is the number of "default grapheme clusters" they can represent.
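
To make the distinction concrete, here is a minimal Python sketch (not taken from any of the documents mentioned; the helper name combining_sequence is purely illustrative). It assumes a Python 3 interpreter, where strings are sequences of code points, and uses only the standard unicodedata and sys modules.

```python
import sys
import unicodedata

# The code point repertoire is finite: U+0000 through U+10FFFF.
CODE_SPACE_SIZE = sys.maxunicode + 1  # 0x110000 code points

def combining_sequence(base: str, marks: int) -> str:
    """Hypothetical helper: `base` followed by `marks` copies of
    U+0301 COMBINING ACUTE ACCENT."""
    return base + "\u0301" * marks

# One base letter plus one combining mark is two code points,
# although a reader perceives a single accented letter.
seq = combining_sequence("e", 1)
seq_len = len(seq)

# NFC normalization maps this particular sequence to the single
# precomposed code point U+00E9.
composed = unicodedata.normalize("NFC", seq)

# Nothing bounds the number of marks that may follow a base character,
# so the set of possible combining sequences has no upper limit.
long_seq = combining_sequence("e", 5)
marks_are_combining = all(unicodedata.combining(c) != 0 for c in long_seq[1:])
```

A parser that works on code points, as XML does, sees seq as two characters and long_seq as six, whatever a renderer later displays.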




