On Mon 06 Jun 2011, Arthur Reutenauer wrote:

> > Well, there *is* more than one way to represent ä in UTF-8
> If you mean "non-shortest" forms such as 0xE0 0x83 0xA4 or 0xF0 0x80
> 0x83 0xA4, then no, they have been forbidden since Unicode 3 in 2000
> (formally Corrigendum #1, see
> http://www.unicode.org/versions/corrigendum1.html).

I was actually thinking of precomposed vs. combining diacritics. I was
blissfully unaware of the non-shortest-form problem up until now...
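
For the record, both points can be checked with a few lines of Python.
This is only a rough sketch: the variable names and the is_rejected()
helper are made up for illustration, and it assumes Python 3.8 or later
with its strict built-in UTF-8 decoder.

```python
import unicodedata

# Two different code point sequences that render as the same character.
precomposed = "\u00e4"   # U+00E4 LATIN SMALL LETTER A WITH DIAERESIS
combining = "a\u0308"    # U+0061 followed by U+0308 COMBINING DIAERESIS


def is_rejected(raw: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder refuses the bytes."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


# Each sequence has exactly one valid UTF-8 encoding, but the two differ.
print(precomposed.encode("utf-8").hex(" "))   # c3 a4
print(combining.encode("utf-8").hex(" "))     # 61 cc 88

# The strings compare unequal until they are normalized to the same form.
print(precomposed == combining)                                 # False
print(unicodedata.normalize("NFC", combining) == precomposed)   # True

# The non-shortest forms quoted above are not valid UTF-8.
print(is_rejected(bytes([0xE0, 0x83, 0xA4])))         # True
print(is_rejected(bytes([0xF0, 0x80, 0x83, 0xA4])))   # True
```

So the ambiguity sits at the code point level (NFC vs. NFD), not in the
byte encoding itself.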
