On Mar 16, 2005, at 12:19 PM, Stefan Monnier wrote:
I apologize for the "retro" question, but I was wondering if there was an
easy way to convert a character in the Emacs-20 internal 19-bit encoding
(from FAST_GLYPH_CHAR(glyph)) to UTF-8 (preferable) or straight Unicode.
I'd like to do it fully within C if possible, and it needs to be efficient.
I found a way to do this using parts of the C program available at:
http://tclab.kaist.ac.kr/~otfried/Mule/
Basically, it uses a large table to convert from charset/byte1/byte2 to Unicode, and then to UTF-8. I call SPLIT_NON_ASCII_CHAR() to extract that information from the 19-bit internal representation stored in the glyph. CCL was not needed, though it might have given a more compact solution than a 250K table.
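For reference, the last step (Unicode code point to UTF-8) can be sketched as below. This is a minimal illustration, not the code from the program linked above: the function name utf8_encode is hypothetical, and the charset/byte1/byte2 table lookup that produces the code point is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
   `out` must have room for at least 4 bytes.
   Returns the number of bytes written, or 0 if `cp` cannot be
   encoded (a UTF-16 surrogate, or a value above U+10FFFF). */
size_t utf8_encode(unsigned long cp, unsigned char *out)
{
    assert(out != NULL);

    if (cp < 0x80) {                      /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not valid scalar values */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond the Unicode range */
}
```

The encoder uses only shifts and masks, so its cost is small next to the table lookup. Most Big5 and JIS characters fall in the 3-byte range.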
However, I still have an issue: for 2-byte characters, such as Big5 or JIS Chinese characters, Emacs 20 gives me two glyphs for each character, with identical values. Is this because it thinks the font needs a double-wide horizontal space to render the character?
thanks, Adrian