Joe Corneli <jcorneli <at> math.utexas.edu> writes:
>   (defun w3m-ucs-to-char (codepoint)
>     (or (decode-char 'ucs codepoint) ?~))
>
> But keeping the function around wasn't helping either.  Except, when I
> tried it again, it worked, so I must have gotten something wrong.
>
> This code seems a little more readable than the code you
> supplied... but they seem to have the same effect.

Hmmm, I missed that; yeah, `decode-char' does look much nicer ... :-)

> Can you suggest something that will work on this content from the
> gnu.org homepage?  Neither the w3m code nor your code seems to produce
> human readable output on this stuff (maybe I'm missing some fonts or
> something?).  I get a bunch of control-at characters... (oh yeah,
> after modifying the "[0-9]" to be ".....".
>
> [ Az <at> rbaycanca | Bahasa Indonesia | Bosanski | Catal`
> | 简体中文 |
> 繁體中文 | Cesky | Dansk |

Presumably the "x" following &# means "hex", so you should use the BASE
argument to string-to-number if you see it.

The following tweak to your original code seems to generate reasonable
output:

   (while (re-search-forward "&#\\(x\\)?\\([0-9a-f]+\\);" nil t)
     (let ((ucs (string-to-number (match-string 2)
                                  (if (match-beginning 1) 16 10))))
       (delete-region (match-beginning 0) (match-end 0))
       (insert-char (decode-char 'ucs ucs) 1)))

[The trick to select decimal or hex works because `match-beginning'
returns nil for optional parenthesized expressions which didn't match.]

-Miles
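
For trying this outside a live buffer, the same loop could be wrapped in a function that takes a string. This is only a sketch: the name `my-decode-numeric-entities' is made up for illustration and is not part of w3m or Emacs. It binds `case-fold-search' explicitly and skips any reference for which `decode-char' returns nil, two choices the snippet above does not make.

```elisp
;; Hypothetical helper; the name is an assumption, not from the thread.
(defun my-decode-numeric-entities (string)
  "Return STRING with &#NNN; and &#xHHH; references replaced by characters."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    ;; Bind explicitly so "&#X41;" and upper-case hex digits also match.
    (let ((case-fold-search t))
      (while (re-search-forward "&#\\(x\\)?\\([0-9a-f]+\\);" nil t)
        ;; Group 1 is the optional "x"; `match-beginning' is nil if absent.
        (let* ((base (if (match-beginning 1) 16 10))
               (code (string-to-number (match-string 2) base))
               (char (decode-char 'ucs code)))
          ;; Leave the reference as-is when `decode-char' returns nil.
          (when char
            (replace-match (string char) t t)))))
    (buffer-string)))
```

With this definition, (my-decode-numeric-entities "&#65;&#x42;") returns "AB", and "&#31616;&#20307;" comes back as "简体".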