On Sat, Sep 05, Vincent Lefevre wrote:
> On 2015-09-02 16:50:21 +0200, Olaf Hering wrote:
> > After some debugging it turned out that mutt has a bug:
> > LC_ALL=C w3m -dump doc/manual.html > bad.txt
> > LC_ALL=C.UTF-8 w3m -dump doc/manual.html > good.txt
> >
> > I suggest to force UTF-8 instead of plain ASCII.
>
> As one shouldn't change the locales except by setting LC_ALL=C
> (C.UTF-8 is unfortunately not standard and broken when used with
> glibc[*]), this would mean using a tool that can transform XML to
> text in a way that does not depend on the locales (e.g. something
> based on XSLT). Or stick with ASCII (but do not use w3m, which
> cannot transcode non-ASCII characters).

Why should one not change the locale for a program that merely
converts file A to file B? The current way of creating the docs at
build time is broken because LC_ALL=C is enforced. This has to be
fixed in the mutt sources, perhaps by enforcing "LC_ALL=en_US.UTF-8".

Olaf
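
A minimal sketch of how a build step could avoid hard-coding one locale name, since en_US.UTF-8 is not guaranteed to be installed on every build host. The helper name pick_utf8_locale is hypothetical and not part of the mutt sources; it assumes the list of installed locales is fed in on stdin, one name per line, as printed by `locale -a`. It skips C.UTF-8 because of the concern quoted above.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the mutt build system.
# Reads locale names on stdin (one per line, e.g. from `locale -a`)
# and prints the first UTF-8 locale other than C.UTF-8.
# Prints nothing if no such locale is listed.
pick_utf8_locale() {
    # Keep names ending in .utf8 or .UTF-8 (case-insensitive),
    # drop the C.* variants, and take the first remaining name.
    grep -i -E '\.utf-?8$' | grep -v -i '^C\.' | head -n 1
}
```

It could then be used along the lines of `utf8_locale=$(locale -a | pick_utf8_locale)`, followed by `LC_ALL=$utf8_locale w3m -dump doc/manual.html > manual.txt` when the result is non-empty, with a fallback or an error otherwise.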
