"Nick Sabalausky" <[email protected]> wrote in message
news:[email protected]...
> "Andrei Alexandrescu" <[email protected]> wrote in message
> news:[email protected]...
>> On 1/13/11 10:26 PM, Nick Sabalausky wrote:
>> [snip]
>>> [ 'f', {u with the umlaut}, 'n', 'f' ]
>>>
>>> Or:
>>>
>>> [ 'f', 'u', {umlaut combining character}, 'n', 'f' ]
>>>
>>> Those *both* get rendered exactly the same, and both represent the same
>>> four-letter sequence. In the second example, the 'u' and the {umlaut
>>> combining character} combine to form one grapheme. The f's and n's just
>>> happen to be single-code-point graphemes.
>>>
>>> Note that while some characters exist in pre-combined form (such as the
>>> {u with the umlaut} above), legend has it there are others that can only
>>> be represented using a combining character.
>>>
>>> It's also my understanding, though I'm not certain, that sometimes
>>> multiple combining characters can be used together on the same "root"
>>> character.
>>
>> Thanks. One further question is: in the above example with u-with-umlaut,
>> there is one code point that corresponds to the entire combination. Are
>> there combinations that do not have a unique code point?
>>
>
> My understanding is "yes". At least that's what I've heard, and I've never
> heard any claims of "no". I don't know of any specific ones offhand,
> though. Actually, it might be possible to use any combining character with
> any old letter or number (like maybe a 7 with an umlaut), though I'm not
> certain.
>
> FWIW, the Wikipedia article might help, or at least link to other things
> that might help: http://en.wikipedia.org/wiki/Combining_character
>
> Michel or spir might have better links though.
>
Heh, as if that wasn't bad enough, there are also digraphs which, from what I can tell, seem to be single code points that each represent more than one glyph/character/grapheme (U+01C6, a single code point for "dž", is one example):

http://en.wikipedia.org/wiki/Digraph_(orthography)#Digraphs_in_Unicode

This page may be helpful too:

http://en.wikipedia.org/wiki/Precomposed_character
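For what it's worth, here is a rough sketch of the cases discussed above. It's in Python rather than D, only because Python's standard library ships a `unicodedata` module with the Unicode normalization forms built in. The variable names are made up for illustration, and this only shows how the code point sequences behave under normalization; it isn't a proposal for how any library should handle graphemes.

```python
import unicodedata

# The same four-letter word, spelled two ways at the code point level.
precomposed = "f\u00FCnf"   # U+00FC: u with diaeresis as one code point
combining = "fu\u0308nf"    # 'u' followed by U+0308 COMBINING DIAERESIS

# The raw sequences differ in length and do not compare equal...
raw_equal = precomposed == combining

# ...but normalizing to a common form makes them comparable.
# NFC composes where a precomposed code point exists; NFD decomposes.
nfc_equal = unicodedata.normalize("NFC", combining) == precomposed
nfd_equal = unicodedata.normalize("NFD", precomposed) == combining

# A combination with no precomposed code point: '7' plus a diaeresis.
# NFC has nothing to compose it into, so it stays as two code points.
seven = "7\u0308"
seven_nfc = unicodedata.normalize("NFC", seven)

# A digraph: U+01C6 is one code point standing for the letter pair "dž".
# Compatibility normalization (NFKC) splits it into 'd' + U+017E.
digraph = "\u01C6"
digraph_nfkc = unicodedata.normalize("NFKC", digraph)

print(len(precomposed), len(combining), raw_equal)
print(nfc_equal, nfd_equal)
print(len(seven_nfc), len(digraph), len(digraph_nfkc))
```

So comparing by code point only works after both sides have been normalized to the same form, and even then one grapheme can still span several code points, as with the 7 plus diaeresis.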
