Joe Corneli <[EMAIL PROTECTED]> writes:

> Adapted from w3m-filter.el:
>
> (while (re-search-forward "&#\\([0-9]+\\);" nil t)
>   (setq ucs (string-to-number (match-string 1)))
>   (delete-region (match-beginning 0) (match-end 0))
>   (insert-char ucs 1))
>
> This would appear to work if the characters themselves were recognized...
>
> But when I run this expression on a buffer containing the string
> "玄奘" what I get is an error, like this:

Is that really what w3m does?

Hm... well, I did doctor it up a bit. In particular, I took out some code that wrapped `ucs' in the last line with the function defined by:

  (defun w3m-ucs-to-char (codepoint)
    (or (decode-char 'ucs codepoint) ?~))

But keeping the function around wasn't helping either. Except that when I tried it again, it worked, so I must have gotten something wrong. This code seems a little more readable than the code you supplied... but they seem to have the same effect. Anyway, your advice got me past whatever I was stumbling over.

Can you suggest something that will work on this content from the gnu.org homepage? Neither the w3m code nor your code seems to produce human-readable output on this stuff (maybe I'm missing some fonts or something?). I get a bunch of control-at characters... (oh yeah, after modifying the "[0-9]" to be ".....").

[ [EMAIL PROTECTED] | Bahasa Indonesia | Bosanski | Catal` | 简体中文 | 繁體中文 | Cesky | Dansk | Deutsch | English | Ellynika' | Espaqol | Frangais | Hrvatski | Italiano | E+B+R+J+T+ | 日本語 | 한국어 | Magyar | Nederlands | Norsk | Polski | Portugujs | Rombna | Russkij | Srpski | Shqip | Suomi | Svenska | Tagalog | ภาษาไทย | T|rkge | Tie>'ng Vie>-.t | Ukrayins'ka ]
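
For reference, here is a minimal sketch of the decimal-reference loop with the `w3m-ucs-to-char' style fallback folded back in. The function name `my-decode-numeric-entities' is hypothetical (it is not part of w3m or Emacs), and the sketch assumes Emacs 23 or later, where `decode-char' with the `ucs' charset returns nil for a code point that has no character. It handles only decimal references of the form "&#NNNN;", so hexadecimal ones would need a different regexp.

```elisp
;; Sketch only: `my-decode-numeric-entities' is a hypothetical name,
;; not a function provided by w3m or Emacs.
(defun my-decode-numeric-entities ()
  "Replace decimal character references in the current buffer.
A reference that does not map to a character is replaced by `~'."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    (while (re-search-forward "&#\\([0-9]+\\);" nil t)
      (let* ((codepoint (string-to-number (match-string 1)))
             ;; `decode-char' returns nil when there is no such character,
             ;; in which case we fall back to `~'.
             (char (or (decode-char 'ucs codepoint) ?~)))
        ;; Swap the whole reference for the single decoded character.
        ;; The two t arguments keep case and insert the text literally.
        (replace-match (char-to-string char) t t)))))
```

A quick way to try it without touching a real buffer:

```elisp
(with-temp-buffer
  (insert "&#29572;&#22872;")
  (my-decode-numeric-entities)
  (buffer-string))
;; => "玄奘"
```

Using `replace-match' instead of separate `delete-region' and `insert-char' calls keeps the deletion and insertion in one step, and binding the code point with `let*' avoids setting a global `ucs' variable.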