Attached is a diff with a change to dhelp_parse.rb which sets Encoding.default_external explicitly, so that even if LANG=C, it uses UTF-8 instead of US-ASCII as the default for opening files. By my (limited) understanding of Encoding.default_external, this should have the same effect on opening files as replacing LANG=C with LANG=xx_XX.UTF-8 would.
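
To illustrate that understanding, here is a minimal sketch (not part of the patch) that reads the same bytes under two values of Encoding.default_external. The helper names with_default_external, read_with_default and demo_encodings are hypothetical, made up for this example, and the sketch assumes Encoding.default_internal is unset, which is Ruby's default.

```ruby
require 'tempfile'

# Hypothetical helper: switch Encoding.default_external for the duration
# of the block, silencing warnings the same way the patch does.
def with_default_external(name)
  old_verbose = $VERBOSE
  old_encoding = Encoding.default_external
  $VERBOSE = false
  Encoding.default_external = name
  yield
ensure
  Encoding.default_external = old_encoding
  $VERBOSE = old_verbose
end

# Open a file with no explicit encoding argument, as a plain File.open
# call would, and report how Ruby tagged the data that was read.
def read_with_default(path)
  data = File.open(path, 'r') { |f| f.read }
  [data.encoding, data.valid_encoding?]
end

# Write a few UTF-8 bytes to a temporary file, then read them back once
# as if LANG=C (US-ASCII) and once with UTF-8 forced.
def demo_encodings
  results = nil
  Tempfile.create('encoding-demo') do |tmp|
    File.binwrite(tmp.path, "caf\xC3\xA9\n".b)
    ascii = with_default_external('US-ASCII') { read_with_default(tmp.path) }
    utf8  = with_default_external('UTF-8')    { read_with_default(tmp.path) }
    results = [ascii, utf8]
  end
  results
end
```

Calling demo_encodings returns one [encoding, valid?] pair per setting. Under US-ASCII the non-ASCII bytes make the string invalid, which, as far as I can tell, is the kind of condition that causes encoding errors once the data is matched or converted.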
On my machine, without the patch, I see the same errors with LANG=C as the others here. With the patch, I do not.

Hope this helps,
- Dan Getz

diff -Nru dhelp-0.6.21+nmu5/debian/changelog dhelp-0.6.21+nmu6/debian/changelog
--- dhelp-0.6.21+nmu5/debian/changelog 2014-10-15 06:35:28.000000000 -0100
+++ dhelp-0.6.21+nmu6/debian/changelog 2014-12-06 01:05:28.000000000 -0100
@@ -1,3 +1,10 @@
+dhelp (0.6.21+nmu6) UNRELEASED; urgency=medium
+
+  * Non-maintainer upload.
+  * Load files as UTF-8, regardless of $LANG
+
+ -- Dan Getz <tank...@gmail.com>  Sat, 06 Dec 2014 00:41:01 -0100
+
 dhelp (0.6.21+nmu5) unstable; urgency=medium
 
   * Non-maintainer upload.
diff -Nru dhelp-0.6.21+nmu5/src/dhelp_parse.rb dhelp-0.6.21+nmu6/src/dhelp_parse.rb
--- dhelp-0.6.21+nmu5/src/dhelp_parse.rb 2014-10-15 06:12:27.000000000 -0100
+++ dhelp-0.6.21+nmu6/src/dhelp_parse.rb 2014-12-06 01:05:04.000000000 -0100
@@ -24,6 +24,11 @@
 PREFIX = '/usr'
 DEFAULT_INDEX_ROOT = "#{PREFIX}/share/doc/HTML"
 
+# Set default file format as UTF-8, without printing a warning
+old_verbose, $VERBOSE = $VERBOSE, false
+Encoding.default_external = "UTF-8"
+$VERBOSE = old_verbose
+
 require 'dhelp'
 require 'dhelp/exporter/html'
 include Dhelp