Bug#1013946: lintian: wrongly report unknown-locale-code ber

Axel Beckert Mon, 27 Jun 2022 17:09:17 -0700

Control: tag -1 + help

Hi Russ,

Russ Allbery wrote:
> > But upon deeper inspection I found that this is likely not an issue in
> > iso-codes as "ber" is correctly not in
> > /usr/share/iso-codes/json/iso_639-3.json but in …/iso_639-2.json and
> > …/iso_639-5.json as it is a code for a language group. (Which kinda
> > makes it suspicious for me to be used in locales. But then again I'm
> > not a linguist.)
> 
> Sorry, I followed up on the bug and forgot to explicitly cc Lintian

Not needed. I got the message via the lintian ML / maintainer address.
(Somehow, though, I didn't get my own messages to that bug report back
via the list.)

> I worked out the same thing, and I'm fairly sure that means that this is
> not a valid locale.  It's the code for the Berber language *group*, and
> the individual members of that group have their own 639-3 codes, so that
> seems to imply to me that those translations were tagged with the wrong
> code.

Yep, I also noticed that. I'm just not sure where exactly the border
lies between a mere group of languages, which as a group has no basis
for being spoken anywhere, and a group of very similar languages, which
speakers of another language from the same group can likely understand
and which may even share a common written language.

Toddy may indeed have some more input for us here.

> Fabio also followed up and noted that there are a few translations for ber
> in Launchpad, but they're all partial and probably not usable.

Ok, I didn't get that mail. So maybe I really didn't get your initial
mail, just another mail from you to the bug report. :-)

> Tobias probably knows more, as iso-codes maintainer, but my guess is that
> this is a mistake on the Launchpad side and those translations should be
> for one of the specific languages of the group rather than being coded to
> the 639-5 language group code.  I think Lintian should still continue to
> use 639-3.
> 
> That said, I'll leave it to you to decide if you want to hang on to the
> bug or not.  :)

Thanks for your input here. Actually, that variant (the stricter one)
was my second choice so far. See the very end of that one long mail
from me. :-)

Anyway, JFTR: I just looked at how lintian in Debian Stable (i.e.
2.104.0 in Bullseye) does the locale code lookup. It had its own data
file for that (and hence now using iso-codes is good as it no longer
duplicates these 33kB of data) and that file
(/usr/share/lintian/data/files/locale-codes) states:

  # List of locale codes.  This is derived from the ISO 639-1, ISO
  # 639-2, and ISO 639-3 standards.

And indeed, "ber" was in that file.

So previously lintian did use ISO 639-1, 639-2 and 639-3.

So using just ISO 639-3 was either an accident, on purpose, or a
regression, and was introduced when lintian switched to iso-codes'
files as data source in commit
https://salsa.debian.org/lintian/lintian/-/commit/fcaded19

Unfortunately this commit was tagged "Gbp-Dch: ignore" in git
(why?!?), so it didn't appear in debian/changelog. *grrrr* (I may
retroactively add it to the debian/changelog entry of 2.115.0 like I
already added the item about switching to Text::Glob which also caused
bugs.)
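
JFTR, the difference between the old and the new lookup can be sketched
roughly as follows. This is only an illustration in Python, not
lintian's actual (Perl) code: the helper names, the assumed layout of
the iso-codes JSON files (one top-level key named after the standard,
entries with "alpha_3" and sometimes "alpha_2") and the tiny
hand-written sample data are all my assumptions.

```python
import json
from pathlib import Path

# Assumed install location of the iso-codes JSON files.
ISO_CODES_DIR = Path("/usr/share/iso-codes/json")


def codes_from_entries(entries):
    """Collect all two- and three-letter codes from a list of entries."""
    codes = set()
    for entry in entries:
        for key in ("alpha_2", "alpha_3"):
            if key in entry:
                codes.add(entry[key])
    return codes


def load_codes(standard, directory=ISO_CODES_DIR):
    """Load iso_<standard>.json, e.g. standard="639-3" (layout assumed)."""
    path = Path(directory) / f"iso_{standard}.json"
    with open(path, encoding="utf-8") as fh:
        return codes_from_entries(json.load(fh)[standard])


def is_known(code, tables, standards):
    """True if the code is listed in any of the given standards."""
    return any(code in tables[s] for s in standards)


# Hand-written excerpt for illustration only, not the real data files.
SAMPLE = {
    "639-2": codes_from_entries([{"alpha_3": "ber"}, {"alpha_3": "kab"}]),
    "639-3": codes_from_entries([{"alpha_3": "kab"}]),
    "639-5": codes_from_entries([{"alpha_3": "ber"}]),
}

STRICT = ("639-3",)        # lookup against ISO 639-3 only
LAX = ("639-2", "639-3")   # roughly what the old data file covered

print(is_known("ber", SAMPLE, STRICT))  # False
print(is_known("ber", SAMPLE, LAX))     # True
```

With the real files, something like
{s: load_codes(s) for s in ("639-2", "639-3", "639-5")} would replace
SAMPLE, provided the layout assumption above holds.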

Anyway, with you proposing stricter checking here and me at least
initially proposing to go back to the laxer parsing used previously,
it would be really good to have some additional input from someone
with a bit more experience on that topic. I hope that Toddy can
provide that. :-)

Tagging as help for that reason.

                Regards, Axel
-- 
 ,''`.  |  Axel Beckert <a...@debian.org>, https://people.debian.org/~abe/
: :' :  |  Debian Developer, ftp.ch.debian.org Admin
`. `'   |  4096R: 2517 B724 C5F6 CA99 5329  6E61 2FF9 CD59 6126 16B5
  `-    |  1024D: F067 EA27 26B9 C3FC 1486  202E C09E 1D89 9593 0EDE
