Is there no better way to translate manual.xml into manual.txt besides
using a web browser to dump manual.html? Perhaps there is none.

The current state for me is odd-looking output. Example:

...
7. Forwarding and Bouncing Mail
...
function bound to ?b? and ?f? respectively.
...

xxd says:
00017db0: 3c62 6f75 6e63 653e 2066 756e 6374 696f  <bounce> functio
00017dc0: 6e20 616e 6420 666f 7277 6172 6469 6e67  n and forwarding
00017dd0: 2075 7369 6e67 2074 6865 203c 666f 7277   using the <forw
00017de0: 6172 643e 0a66 756e 6374 696f 6e20 626f  ard>.function bo
00017df0: 756e 6420 746f 20e2 809c 62e2 809d 2061  und to ...b... a
00017e00: 6e64 20e2 809c 66e2 809d 2072 6573 7065  nd ...f... respe
00017e10: 6374 6976 656c 792e 0a0a 466f 7277 6172  ctively...Forwar
00017e20: 6469 6e67 2063 616e 2062 6520 646f 6e65  ding can be done

Is 0xe2 0x80 0x9c valid UTF-8 for '“'? Apparently it is, because that's
what Firefox gives with copy&paste, and it looks fine in this vim
session. So that means that less(1) and even vim(1) are unable to cope
with manual.txt. Is there perhaps a mix of encodings in manual.txt that
confuses the pager?! Does it fail just for me?
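
Both questions can be checked from the shell. This is only a rough
sketch, assuming iconv(1) and od(1) are installed; the function names
quote_codepoint and is_valid_utf8 are made up for illustration. The
first decodes the three bytes from the xxd dump, the second tests
whether a whole file decodes as UTF-8.

```shell
# Decode the bytes e2 80 9c (written as octal escapes for portability)
# as UTF-8 and print the resulting code point as UTF-16BE hex digits.
quote_codepoint() {
    printf '\342\200\234' | iconv -f UTF-8 -t UTF-16BE | od -An -tx1 | tr -d ' \n'
}

# Hypothetical helper: succeed only if the whole file decodes as UTF-8.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

quote_codepoint; echo    # prints 201c, i.e. U+201C LEFT DOUBLE QUOTATION MARK
```

Running "is_valid_utf8 manual.txt && echo ok" would then show whether
the file mixes encodings or is clean UTF-8 throughout.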

...

After some debugging it turned out that mutt has a bug. The result of
the dump depends on the locale it runs in:
LC_ALL=C w3m -dump doc/manual.html > bad.txt
LC_ALL=C.UTF-8 w3m -dump doc/manual.html > good.txt

I suggest forcing UTF-8 instead of plain ASCII.
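
To see what the two locales differ in, one can print the character
encoding each of them announces. This is a small sketch, assuming the
POSIX locale(1) utility is present and that C.UTF-8 is installed on
the system; charmap_of is a made-up helper name. Presumably w3m picks
its output charset from this setting, which would explain the two
different dumps.

```shell
# Hypothetical helper: print the charmap programs see under a locale.
charmap_of() {
    LC_ALL="$1" locale charmap 2>/dev/null
}

# Compare the two locales used for the dumps.
for loc in C C.UTF-8; do
    printf '%s -> %s\n' "$loc" "$(charmap_of "$loc")"
done
```

If C.UTF-8 is missing, the second line will not report UTF-8, and some
other UTF-8 locale would have to be used instead.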


Olaf
