Joe Corneli <[EMAIL PROTECTED]> writes:
> Adapted from w3m-filter.el:
>
>   (while (re-search-forward "&#\\([0-9]+\\);" nil t)
>     (setq ucs (string-to-number (match-string 1)))
>     (delete-region (match-beginning 0) (match-end 0))
>     (insert-char ucs 1))
>
> This would appear to work if the characters themselves were recognized...
>
> But when I run this expression on a buffer containing the string
> "&#29572;&#22872;" what I get is an error, like this:

Is that really what w3m does? I'm not sure how the above could possibly
work in any normal version of Emacs -- the argument to `insert-char' is
an Emacs character, not a Unicode code point.

So you need to translate from the Unicode code point to the Emacs
character encoding. One method might be to turn the code point into a
UTF-16 string (should be trivial, I guess), and then use
`decode-coding-string' to translate that into Emacs' internal encoding;
e.g.:

  (while (re-search-forward "&#\\([0-9]+\\);" nil t)
    (let* ((ucs (string-to-number (match-string 1)))
           ;; Two bytes, low byte first (little-endian); this only
           ;; covers code points below #x10000.
           (ucs-string
            (string (logand ucs #xFF) (logand (ash ucs -8) #xFF)))
           (decoded-string
            (decode-coding-string ucs-string 'mule-utf-16le)))
      ;; Replace the whole "&#NNNN;" reference with the decoded text.
      (delete-region (match-beginning 0) (match-end 0))
      (insert decoded-string)))

For me, this does the right thing on your example, and on the text of
that wikipedia page:

   The fictional character Xuanzang (玄奘, WG: Hsüan-tsang), a central
   character of the classic Chinese novel Journey to the West ...

It probably will only work well in recent CVS versions of Emacs that
have `utf-translate-cjk-mode' turned on by default though. [*]

-Miles

[*] In the current CVS Emacs, there seems to be a function that does
this translation directly too, `utf-lookup-subst-table-for-decode', but
given the odd name, it's probably not intended for general use...

-- 
Love is a snowmobile racing across the tundra. Suddenly it flips over,
pinning you underneath. At night the ice weasels come. --Nietzsche