2009/3/25 Robin Bannister <[email protected]>:
> Francisco Vila. wrote:
>>
>> the right googleable word is Unicode, do you agree?
>
> Well, not fully. When I google for
>   unicode arabic percent
> I certainly end up at a relevant place
>   http://www.fileformat.info/info/unicode/char/066a/index.htm
> But I am not done. I need to collect whatever it is \char needs, so I go
> looking for hexadecimals. There are lots of them in a nice table, and they
> are not all saying the same thing. This is where "UTF-32" could keep me
> straight.
I am now confused, because Trevor has said that the hex value is a
variable-length encoding of the Unicode entity. If so, the hex number has
to follow the UTF-8 rules, not UTF-32, which is always a fixed-length
32-bit value.

> Back to NR 3.3.3
>>
>> The following example shows UTF-8 coded characters being used
>
> My main point was: UTF-8 is wrong.
> When you criticize UTF-32 as a replacement, are you implying that the next
> word "coded" is wrong too?

I thought yes, but after Trevor's comment I now think the hex value _is_
UTF-8 coded. I might be completely wrong.

> If so, I agree. The proper term is Unicode code point (mentioned at the top
> of 3.3.3) and it is just an integer - no need to constrain how it is
> represented. (But base 16 and the codespace slicing went hand in hand.)
> So let's say
>>
>> The following example shows Unicode code points being used

That's what I naïvely called an 'entity' before.

> And further up, let's use this same term instead of "Unicode escape
> sequence" and "Unicode hexadecimal code"

I agree, but I remain confused until Trevor throws additional light on this.
Does \char accept full hex Unicode code points, or rather variable-length
UTF-8 multibyte characters?

-- 
Francisco Vila. Badajoz (Spain)
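
For reference, the distinction the thread is circling can be illustrated with a short Python sketch. Python is used purely for illustration and is not part of the original exchange; the names `hex_bytes` and `representations` are made up for this example. It takes U+066A (ARABIC PERCENT SIGN, the character behind the fileformat.info URL quoted above) and prints the different hexadecimal forms that such a table typically lists. It does not by itself settle which form \char expects; it only shows why the numbers "are not all saying the same thing".

```python
# U+066A ARABIC PERCENT SIGN, the character from the quoted URL.
code_point = 0x066A          # the Unicode code point: just an integer
ch = chr(code_point)         # the abstract character for that code point


def hex_bytes(data: bytes) -> str:
    """Render a byte string as space-separated two-digit hex values."""
    return " ".join(f"{b:02X}" for b in data)


# The same character, written down in several ways.
representations = {
    "code point": f"U+{code_point:04X}",
    "UTF-8": hex_bytes(ch.encode("utf-8")),        # variable length (1-4 bytes)
    "UTF-16BE": hex_bytes(ch.encode("utf-16-be")),  # 2 or 4 bytes
    "UTF-32BE": hex_bytes(ch.encode("utf-32-be")),  # always 4 bytes
}

for name, value in representations.items():
    print(f"{name:>10}: {value}")
```

The UTF-8 form (D9 AA) shares no digits with the code point 066A, whereas the UTF-32 form (00 00 06 6A) is simply the code point padded to 32 bits. That is presumably why "UTF-32" was suggested as a way to stay on the right row of the table.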
