UTF-8 should be the right format for doc-base files, according to
https://lintian.debian.org/tags/doc-base-file-uses-obsolete-national-encoding.html

I also don't know ruby, but from my research, setting
Encoding.default_external from within ruby code is considered the "wrong"
thing to do. The "right" way is to pass "-E UTF-8" to ruby, either on the
command line or through the RUBYOPT environment variable. I had to
explicitly silence a warning because of this. See
http://docs.ruby-lang.org/en/2.1.0/Encoding.html#method-c-default_external-3D

However, neither of those "right" ways to set the encoding works well when
a ruby file is run directly as a script. (Is ruby not intended to be used in
scripts?!) The ruby docs say the problem is that strings created before the
encoding is changed may have a different encoding from strings created
afterwards. That's avoidable, and I believe I avoided it in my patch by
placing the encoding change before any require statements.
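
For illustration, a minimal sketch of that ordering (not the actual patch).
It assumes the warning is only emitted in verbose mode, so it is silenced by
temporarily clearing $VERBOSE; the temporary file and its contents are made
up for the example.

```ruby
#!/usr/bin/ruby
# Sketch only: hypothetical top of a script such as dhelp_parse.rb.

# Set the default external encoding first, before any library is loaded,
# so that no strings are created from files under the old default.
# $VERBOSE is cleared around the assignment to suppress the warning ruby
# may print for it (assumption: the warning depends on verbose mode).
saved_verbose = $VERBOSE
begin
  $VERBOSE = nil
  Encoding.default_external = Encoding::UTF_8
ensure
  $VERBOSE = saved_verbose
end

# Only now load libraries and start reading files.
require 'tempfile'

Tempfile.create('doc-base-sample') do |tmp|
  tmp.write("Title: caf\u00e9\n")
  tmp.flush
  # File data read from disk is tagged with the default external encoding.
  puts File.read(tmp.path).encoding
end
```

With this in place, the encoding used for reading files no longer depends on
LANG or LC_ALL.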

An alternative is to explicitly set the encoding to UTF-8 each time a file
is opened. If someone feels that's a better way, I'm willing to do that and
create a new patch. But like I said, I don't know ruby, so I can't
guarantee correctness beyond trying it and seeing that it works.
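
A rough sketch of what that alternative could look like. The helper name
read_doc_base_file is hypothetical and not taken from dhelp; the point is
only that the "r:UTF-8" mode string sets the external encoding for that one
file, independent of the locale.

```ruby
require 'tempfile'

# Hypothetical helper: read a file as UTF-8 regardless of LANG/LC_ALL,
# without touching Encoding.default_external.
def read_doc_base_file(path)
  File.open(path, 'r:UTF-8') { |f| f.read }
end

# Made-up sample data standing in for a doc-base file.
Tempfile.create('doc-base-sample') do |tmp|
  File.open(tmp.path, 'w:UTF-8') { |f| f.write("Title: caf\u00e9\n") }
  text = read_doc_base_file(tmp.path)
  puts text.encoding
  puts text.valid_encoding?
end
```

The drawback is that every place that opens a file has to be changed, and
any that is missed still falls back to the locale's default.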

- Dan

On Sun, Dec 7, 2014 at 2:06 PM, gregor herrmann <gre...@debian.org> wrote:

> Control: tag -1 - moreinfo
> Control: tag -1 + confirmed
>
> On Sat, 06 Dec 2014 01:33:58 -0100, Daniel Getz wrote:
>
> I can reproduce the problem with
> LC_ALL=C LANG=C /etc/cron.weekly/dhelp
>
> > Attached is a diff with a change to dhelp_parse.rb which sets
> > Encoding.default_external explicitly, so that even if LANG=C, it uses
> > UTF-8 instead of US-ASCII as the default for opening files. By my
> > (limited) understanding of Encoding.default_external, this should have
> > the same effect on opening files as replacing LANG=C with
> > LANG=xx_XX.UTF-8 would.
> >
> > On my machine, without the patch, I see the same errors with LANG=C as
> > the others here. With the patch, I do not.
>
> Works for me as well.
>
>
> Since I don't speak any ruby I'm a bit hesitant to upload; maybe some
> ruby speaker knowing Encoding.default_external can confirm that's the
> correct way forwards?
>
> (And: Are we sure all doc-base files are us-ascii or utf-8 encoded?
> At least on my machine they are, so maybe that's a non-concern.)
>
>
> Cheers,
> gregor
>
> --
>  .''`.  Homepage: http://info.comodo.priv.at/ - OpenPGP key 0xBB3A68018649AA06
>  : :' : Debian GNU/Linux user, admin, and developer  -  http://www.debian.org/
>  `. `'  Member of VIBE!AT & SPI, fellow of the Free Software Foundation Europe
>    `-   NP: Treibhaus: Yellowman Jamaica
>
