Bug#1003213: locales-all: introduce locales-utf8 package?

2022-01-10 Thread Aurelien Jarno
On 2022-01-09 18:12, Simon McVittie wrote:
> On Sun, 09 Jan 2022 at 13:48:06 +0100, Aurelien Jarno wrote:
> > On 2022-01-06 11:21, Simon McVittie wrote:
> > > * install locales-all (this costs > 200M but ensures that all locales are
> > >   available)
> > > 
> > > For "reasonably large" desktop and server systems, I wonder whether it
> > > might be better to generate a subset of locales-all with just the UTF-8
> > > locales that we recommend for general use, and install that by default?
> > 
> > Defining general use is something quite difficult. All languages and
> > countries should be considered equally, so we could differentiate
> > UTF-8 from non UTF-8 locales, but we should not make further selection.
> 
> Right, what I meant was: AIUI we recommend that all speakers of xx_YY
> use the xx_YY.utf8 locale, as opposed to a legacy national encoding, so
> we could (make it straightforward to) install all the UTF-8 locales
> like en_GB.utf8 and none of the legacy national encodings like
> en_GB.ISO-8859-15.
> 
> > That way of doing it would be fine from the desktop point of view (100M
> > is not that much compared to a desktop environment). However we can't
> > force the installation of locales-all-utf8 in d-i
> 
> I thought task-*-desktop could maybe pull it in?

Agreed for the desktop case. But that's not possible for other types of
installation, so it doesn't provide a way to remove the "legacy"
locales package.

As I said in my previous mail, if we start reorganizing the locales-all
package with the eventual goal of dropping the locales package
(something I have wanted to do for many years), we should ensure that it
works for all cases.

> > From the various discussions on IRC, we more or less concluded that the
> > way to go is to have one locale package per language, like it's done in
> > most other distributions. From there we could have task-$language
> > depend on locales-$language, also simplifying the d-i side.
> > 
> > Would that work for your use case?
> 
> That would mean that UIs like gnome-control-center would still not be able
> to offer to add (for example) a French locale on a system that had been
> installed in German, unless either the user knows that they need to install
> the French language pack first, or the UI grows distro-specific code to:
> 
> - know which languages would be candidates for being enabled if the
>   appropriate language pack was installed
> - ask PackageKit to install the necessary language pack when one of those
>   locales was chosen

Your usage of the term "language pack" is interesting. locales-all only
provides locales, not translations. So a user wanting to switch the
desktop to German will still have to install the corresponding l10n
packages for firefox, thunderbird or libreoffice.

Would it work to improve d-i to allow users to select multiple language
tasks, so that they can install support for the languages they want?

> However, it's consistent with how e.g. Flatpak handles locales (there's one
> locale extension per language code, so for example fr_FR and fr_CH go
> together).
> 
> This would also allow avoiding a long-standing issue with Steam: some
> Steam games assume that en_US.UTF-8 is always available (they're wrong,
> and should be using C.UTF-8, but that's not portable), so the steam package
> could gain a Recommends: locales-en to work around that.
> 
> > > locales-utf8 would probably also be enough for many locale-sensitive
> > > packages' test suites.
> > 
> > Not sure about that. Test suites are the main reason why we had to
> > revert the removal of non UTF-8 locales.
> 
> I suspect this might be a bit circular: the reason that upstreams want
> to test support for legacy encodings, and the reason that we want to run
> those tests instead of skipping them, is because distros like us still
> (claim to) support those encodings, even though we no longer recommend
> them.

I agree, however the way we want to break the loop (not offering legacy
encodings to users) suggests that the test suites are going to be the
last users of the non-utf8 locales.

Regards,
Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Bug#1003213: locales-all: introduce locales-utf8 package?

2022-01-09 Thread Simon McVittie
On Sun, 09 Jan 2022 at 13:48:06 +0100, Aurelien Jarno wrote:
> On 2022-01-06 11:21, Simon McVittie wrote:
> > * install locales-all (this costs > 200M but ensures that all locales are
> >   available)
> > 
> > For "reasonably large" desktop and server systems, I wonder whether it
> > might be better to generate a subset of locales-all with just the UTF-8
> > locales that we recommend for general use, and install that by default?
> 
> Defining general use is something quite difficult. All languages and
> countries should be considered equally, so we could differentiate
> UTF-8 from non UTF-8 locales, but we should not make further selection.

Right, what I meant was: AIUI we recommend that all speakers of xx_YY
use the xx_YY.utf8 locale, as opposed to a legacy national encoding, so
we could (make it straightforward to) install all the UTF-8 locales
like en_GB.utf8 and none of the legacy national encodings like
en_GB.ISO-8859-15.

> That way of doing it would be fine from the desktop point of view (100M
> is not that much compared to a desktop environment). However we can't
> force the installation of locales-all-utf8 in d-i

I thought task-*-desktop could maybe pull it in?

> From the various discussions on IRC, we more or less concluded that the
> way to go is to have one locale package per language, like it's done in
> most other distributions. From there we could have task-$language
> depend on locales-$language, also simplifying the d-i side.
> 
> Would that work for your use case?

That would mean that UIs like gnome-control-center would still not be able
to offer to add (for example) a French locale on a system that had been
installed in German, unless either the user knows that they need to install
the French language pack first, or the UI grows distro-specific code to:

- know which languages would be candidates for being enabled if the
  appropriate language pack was installed
- ask PackageKit to install the necessary language pack when one of those
  locales was chosen
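
As a rough illustration of the distro-specific logic described in the two
points above, here is a minimal Python sketch. The "locales-<language>"
package naming and both function names are assumptions made for
illustration only: no such packages exist yet, and the actual PackageKit
request is deliberately left out.

```python
def language_of(locale_name):
    """Return the language code of a locale name, e.g. 'fr' for 'fr_CH.UTF-8'."""
    # Drop the codeset (".UTF-8") and any modifier ("@euro"), then the territory.
    base = locale_name.split(".", 1)[0].split("@", 1)[0]
    return base.split("_", 1)[0]


def package_to_install(chosen_locale, installed_locales):
    """Return the hypothetical package needed for chosen_locale, or None.

    None means a locale of the same language is already installed, so under
    the one-package-per-language assumption nothing is missing.
    """
    installed_languages = {language_of(name) for name in installed_locales}
    language = language_of(chosen_locale)
    if language in installed_languages:
        return None
    return "locales-" + language  # hypothetical package name


# A system installed in German, where the user picks a French locale:
print(package_to_install("fr_FR.UTF-8", ["de_DE.UTF-8"]))
```

A settings UI would still need some way to learn which languages exist
but are not installed, which is the first of the two points above.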

However, it's consistent with how e.g. Flatpak handles locales (there's one
locale extension per language code, so for example fr_FR and fr_CH go
together).

This would also allow avoiding a long-standing issue with Steam: some
Steam games assume that en_US.UTF-8 is always available (they're wrong,
and should be using C.UTF-8, but that's not portable), so the steam package
could gain a Recommends: locales-en to work around that.
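
For what it's worth, the more defensive behaviour on the application side
could look like the following Python sketch: try the preferred locale,
then C.UTF-8, and finally fall back to plain "C". The function names are
made up for this example, and this is not a description of what Steam or
any particular game actually does.

```python
import locale


def is_available(name):
    """Check whether setlocale() accepts name, restoring the previous setting."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query only, changes nothing
    try:
        locale.setlocale(locale.LC_CTYPE, name)
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)
    return True


def first_available(candidates, check=is_available):
    """Return the first candidate accepted by check, or "C" if none is."""
    for name in candidates:
        if check(name):
            return name
    return "C"


chosen = first_available(["en_US.UTF-8", "C.UTF-8"])
print(chosen)
```

The check is passed in as a parameter so that the selection logic can be
exercised without depending on which locales the host has generated.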

> > locales-utf8 would probably also be enough for many locale-sensitive
> > packages' test suites.
> 
> Not sure about that. Test suites are the main reason why we had to
> revert the removal of non UTF-8 locales.

I suspect this might be a bit circular: the reason that upstreams want
to test support for legacy encodings, and the reason that we want to run
those tests instead of skipping them, is because distros like us still
(claim to) support those encodings, even though we no longer recommend
them.

smcv



Bug#1003213: locales-all: introduce locales-utf8 package?

2022-01-09 Thread Aurelien Jarno
Hi,

On 2022-01-06 11:21, Simon McVittie wrote:
> Package: locales-all
> Version: 2.33-1
> Severity: wishlist
> 
> As discussed recently on -devel and previously in #701585, at the moment
> Debian users have a choice between two non-ideal locale setups:
> 
> * install locales and generate a subset of locale files with locale-gen
>   (this is optimal for small systems, but it's difficult for high-level
>   UIs like GNOME Settings to present this to users, particularly in a
>   non-distro-specific way)

Yes, this is the old way of doing that, and it's something that we want
to get rid of at some point. I think it's an important thing to take
into account when discussing the future of locales-all.

> * install locales-all (this costs > 200M but ensures that all locales are
>   available)
> 
> For "reasonably large" desktop and server systems, I wonder whether it
> might be better to generate a subset of locales-all with just the UTF-8
> locales that we recommend for general use, and install that by default?

Defining general use is something quite difficult. All languages and
countries should be considered equally, so we could differentiate
UTF-8 from non UTF-8 locales, but we should not make further selection.

> If I'm counting correctly, that would be about 100M, which is perhaps an
> acceptable price to pay for language settings being straightforward -
> a reasonably complete set of Noto fonts (without CJK) is already more
> than half of that.
> 
> locales-all could have a Depends on locales-utf8 and contain the remaining
> (legacy national character set) locales, if anyone still needs that.

That way of doing it would be fine from the desktop point of view (100M
is not that much compared to a desktop environment). However we can't
force the installation of locales-all-utf8 in d-i, so that wouldn't
solve the problem of getting rid of the locales package.

From the various discussions on IRC, we more or less concluded that the
way to go is to have one locale package per language, like it's done in
most other distributions. From there we could have task-$language
depend on locales-$language, also simplifying the d-i side.

Would that work for your use case?
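
To make the per-language split a bit more concrete, here is a minimal
Python sketch that groups locale names by language code. The
"locales-<language>" names follow the locales-$language pattern mentioned
above, but the exact naming and the function name are assumptions; nothing
has been decided about how the packages would really be built.

```python
from collections import defaultdict


def group_by_language(locale_names):
    """Map a hypothetical per-language package name to the locales it would hold."""
    groups = defaultdict(list)
    for name in locale_names:
        # "fr_CH.UTF-8" -> "fr_CH" -> "fr"
        base = name.split(".", 1)[0].split("@", 1)[0]
        language = base.split("_", 1)[0]
        groups["locales-" + language].append(name)
    return {package: sorted(names) for package, names in groups.items()}


for package, names in group_by_language(
    ["fr_FR.UTF-8", "de_DE.UTF-8", "fr_CH.UTF-8"]
).items():
    print(package, names)
```

With this grouping fr_FR and fr_CH end up in the same package, while
de_DE gets its own.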

> locales-utf8 would probably also be enough for many locale-sensitive
> packages' test suites.

Not sure about that. Test suites are the main reason why we had to
revert the removal of non UTF-8 locales.

Regards,
Aurelien

-- 
Aurelien Jarno  GPG: 4096R/1DDD8C9B
aurel...@aurel32.net http://www.aurel32.net



Bug#1003213: locales-all: introduce locales-utf8 package?

2022-01-06 Thread Simon McVittie
Package: locales-all
Version: 2.33-1
Severity: wishlist

As discussed recently on -devel and previously in #701585, at the moment
Debian users have a choice between two non-ideal locale setups:

* install locales and generate a subset of locale files with locale-gen
  (this is optimal for small systems, but it's difficult for high-level
  UIs like GNOME Settings to present this to users, particularly in a
  non-distro-specific way)

* install locales-all (this costs > 200M but ensures that all locales are
  available)
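
For readers unfamiliar with the first setup: locale-gen builds the locales
listed in /etc/locale.gen, where each enabled line names a locale and its
character set and disabled entries are commented out. A minimal Python
sketch that lists the enabled entries follows; the function name and the
sample text are only illustrative.

```python
def enabled_locales(locale_gen_text):
    """Return (locale, charset) pairs for the uncommented lines of locale.gen text."""
    enabled = []
    for line in locale_gen_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) == 2:
            enabled.append((fields[0], fields[1]))
    return enabled


sample = """\
# en_GB.UTF-8 UTF-8
en_US.UTF-8 UTF-8

de_DE ISO-8859-1
"""
print(enabled_locales(sample))
```

A high-level UI would have to edit this file and re-run locale-gen, which
is the distro-specific part mentioned above.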

For "reasonably large" desktop and server systems, I wonder whether it
might be better to generate a subset of locales-all with just the UTF-8
locales that we recommend for general use, and install that by default?

If I'm counting correctly, that would be about 100M, which is perhaps an
acceptable price to pay for language settings being straightforward -
a reasonably complete set of Noto fonts (without CJK) is already more
than half of that.

locales-all could have a Depends on locales-utf8 and contain the remaining
(legacy national character set) locales, if anyone still needs that.
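
The proposed split can be sketched in a few lines of Python, assuming
input in the "name charset" form used by /usr/share/i18n/SUPPORTED. The
function name is made up for this example, and this says nothing about
how the locales-all build would actually implement it.

```python
def split_by_charset(supported_lines):
    """Split "name charset" lines into (utf8_names, legacy_names)."""
    utf8, legacy = [], []
    for line in supported_lines:
        fields = line.split()
        # Skip comments and anything that is not exactly "name charset".
        if len(fields) != 2 or line.lstrip().startswith("#"):
            continue
        name, charset = fields
        if charset.upper() == "UTF-8":
            utf8.append(name)      # would go into locales-utf8
        else:
            legacy.append(name)    # would stay in locales-all
    return utf8, legacy


utf8, legacy = split_by_charset([
    "en_GB.UTF-8 UTF-8",
    "en_GB ISO-8859-1",
    "en_GB.ISO-8859-15 ISO-8859-15",
])
print(utf8, legacy)
```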

locales-utf8 would probably also be enough for many locale-sensitive
packages' test suites.

smcv