----- Original Message -----
From: "Marcin 'Qrczak' Kowalczyk" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, December 10, 2004 8:35 PM
Subject: Re: Nicest UTF
"Philippe Verdy" <[EMAIL PROTECTED]> writes:
The XML/HTML core syntax is defined with fixed behavior of some individual characters like '&', '<', quotation marks, and with special behavior for spaces.
The point is: what does "characters" mean in this sentence? Code points? Combining character sequences? Something else?
See the XML character model document... XML ignores combining sequences. For both Unicode and XML, a character is an abstract character with a single code point allocated in a *finite* repertoire. The repertoire of all possible combining character sequences, by contrast, is already infinite in Unicode, as is the number of "default grapheme clusters" they can represent.
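
To make the distinction concrete, here is a rough Python sketch (my own illustration, not taken from any spec). It assumes the standard-library ElementTree parser is representative of how XML parsers behave here; the helper names code_points and parsed_text are made up for the example. It shows that a combining sequence stays a run of separate code points through parsing, and that '>' keeps its fixed meaning even when a combining mark follows it.

```python
import unicodedata
from xml.etree import ElementTree

# "e" followed by U+0301 COMBINING ACUTE ACCENT:
# one combining sequence, but two code points.
decomposed = "e\u0301"
# U+00E9 is the precomposed form: a single code point.
precomposed = "\u00e9"


def code_points(text):
    """List the code points of a string in U+XXXX notation (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


def parsed_text(text):
    """Parse <t>text</t> and return the element content (hypothetical helper)."""
    return ElementTree.fromstring(f"<t>{text}</t>").text


# The parser works on code points and does not merge or normalize the sequence.
print(code_points(parsed_text(decomposed)))
print(code_points(parsed_text(precomposed)))

# Normalization is a separate, explicit step outside the XML syntax.
print(unicodedata.normalize("NFC", decomposed) == precomposed)

# A combining mark right after '>' does not change the meaning of '>':
# the tag still closes, and the mark becomes the first code point of the content.
print(code_points(parsed_text("\u0301x")))
```

The two forms render alike, yet the parser hands back two code points for one and one code point for the other; whether they should compare equal is left to the application (for example via NFC).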

