To add yet another dimension to what Michael & Asmus & Ken have said:

In a character encoding, the character is *not* the same thing as a text string 
of length 1.

Character identity is defined in theory by a minimal set of entities needed to 
get certain text processes to do the right things ... and in practice by a lot 
of blundering around.

Text/sequence equivalence is defined in specific contexts by specific criteria, 
under various names from "normalization" to "folding" to "spelling".

In that sense, the claim that

    >The aim of Unicode standardisation is surely to define a single and 
    >unambiguous representation of text.

is well and truly false.  For example, we can all agree on the letters of the 
Latin alphabet for English, abc...xyz -- but we cannot all agree on a single and 
unambiguous representation of the word "standardization" (spelled 
"standardisation" in the very sentence quoted above).

Joe

- In the future, they will invent a chicken that runs on gasoline  -- George 
Carlin



