Paul Eggert wrote:
> > I'm not even sure what you
> > mean by "Do not worry about multibyte C locales."
> 
> I meant to not worry about platforms where the "C" (not "C.utf8") locale 
> is multibyte. I don't know of how diffutils would misbehave in such 
> locales (other than not be strictly POSIX-conforming in unusual cases 
> where native tools aren't either), so I wanted Gnulib to not worry about 
> the possibility.

This possibility actually occurs on Android ≥ 5.0. Comments in gnulib/tests/
say:
     On Android ≥ 5.0, the default locale is the "C.UTF-8" locale, not the
     "C" locale.  Furthermore, when you attempt to set the "C" or "POSIX"
     locale via setlocale(), what you get is a "C" locale with UTF-8 encoding,
     that is, effectively the "C.UTF-8" locale.

> > The two functions hard_locale_LC_MESSAGES and hard_locale_LC_TIME
> > look like heuristics to me; I wouldn't bet that they are correct
> > in all situations.
> 
> For what it's worth, GNU Emacs has used a similar heuristic for a decade 
> (see emacs/src/emacs.c's using_utf8) without reported trouble.

For the LC_CTYPE locale category, the code you refer to in Emacs, and the
similar code you recently added to Gnulib (modules quotearg,
propername-lite), looks safe, because there are only finitely many locale
encodings (and none will be added in the future, hopefully). But for
LC_MESSAGES and LC_TIME, the heuristics rest on some assumptions:
  * hard_locale_LC_MESSAGES assumes that
      - diffutils.pot contains the strings from lib/version-etc.c
        (which are now actually in gnulib.pot),
      - the translator will not translate "(C)" as "(C)", i.e. leave it
        unchanged,
      - the user does not use LANGUAGE with a precedence list.
  * hard_locale_LC_TIME assumes that no locale, not even the en_US locale,
    uses the same internal format string for "%c" as the C locale.
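
To make the LC_TIME assumption concrete: a check of this kind could compare
the locale's "%c" format string (D_T_FMT) with the one the C locale reports.
The sketch below is my own illustration under that assumption, not the
actual hard_locale_LC_TIME code; the names time_format_differs_from_c and
copy_string are hypothetical.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'ed copy of S, or NULL if allocation fails.  */
static char *
copy_string (const char *s)
{
  char *p = malloc (strlen (s) + 1);
  if (p != NULL)
    strcpy (p, s);
  return p;
}

/* Hypothetical check: return 1 if the "%c" format string of the locale
   LOCALE_NAME differs from that of the "C" locale, 0 if it is the same,
   -1 if a locale could not be set or memory ran out.
   The LC_TIME setting in effect at entry is restored before returning.  */
int
time_format_differs_from_c (const char *locale_name)
{
  int result = -1;
  char *saved = NULL;
  char *c_fmt = NULL;

  const char *cur = setlocale (LC_TIME, NULL);
  if (cur == NULL || (saved = copy_string (cur)) == NULL)
    return -1;

  /* Ask the C locale for its format instead of hard-coding it, since
     the exact string varies between libc implementations.  */
  if (setlocale (LC_TIME, "C") != NULL
      && (c_fmt = copy_string (nl_langinfo (D_T_FMT))) != NULL
      && setlocale (LC_TIME, locale_name) != NULL)
    result = strcmp (nl_langinfo (D_T_FMT), c_fmt) != 0;

  setlocale (LC_TIME, saved);
  free (c_fmt);
  free (saved);
  return result;
}
```

A locale whose D_T_FMT happens to coincide with the C locale's string
yields 0 here, although it is not the C locale. That is exactly the case
the assumption rules out.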

Bruno





