Re: [gentoo-dev] UTF-8 locale by default

2012-12-31 Thread Maxim Kammerer
Hi,

stage3 now includes non-ASCII paths, via app-misc/ca-certificates -- e.g.:
/usr/share/ca-certificates/mozilla/TÜBİTAK_UEKAE_Kök_Sertifika_Hizmet_Sağlayıcısı_-_Sürüm_3.crt

Working with those (e.g., backup) probably requires a UTF-8 locale. Is
this considered acceptable? Did anyone notice?

-- 
Maxim Kammerer
Liberté Linux: http://dee.su/liberte



Re: [gentoo-dev] UTF-8 locale by default

2012-12-31 Thread Zac Medico
On 12/31/2012 09:14 AM, Maxim Kammerer wrote:
 Hi,
 
 stage3 now includes non-ASCII paths, via app-misc/ca-certificates -- e.g.:
 /usr/share/ca-certificates/mozilla/TÜBİTAK_UEKAE_Kök_Sertifika_Hizmet_Sağlayıcısı_-_Sürüm_3.crt
 
 Working with those (e.g., backup) probably requires a UTF-8 locale. Is
 this considered acceptable? Did anyone notice?

It's been that way for a very long time (over a year). Since bug #382199
[1], portage uses a constant UTF-8 encoding for all installed files
regardless of the locale, so at least you can count on portage handling
those UTF-8 names even if you don't have a UTF-8 locale configured.

[1] https://bugs.gentoo.org/show_bug.cgi?id=382199
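
To illustrate why a constant encoding helps, here is a minimal Python sketch. It is not portage's actual code; the helper name `encode_path` and the constant `FS_ENCODING` are made up for this example. It only shows that a fixed UTF-8 codec can round-trip the file name quoted above, while the ASCII codec traditionally implied by a C/POSIX locale cannot.

```python
# Sketch only: this is NOT portage's real implementation.
FS_ENCODING = "utf-8"  # hypothetical constant, independent of LANG/LC_ALL


def encode_path(path: str) -> bytes:
    """Encode a file name with a fixed UTF-8 codec instead of the locale's."""
    return path.encode(FS_ENCODING)


name = "TÜBİTAK_UEKAE_Kök_Sertifika_Hizmet_Sağlayıcısı_-_Sürüm_3.crt"

# A constant UTF-8 codec round-trips the name regardless of the locale.
raw = encode_path(name)
roundtrip_ok = raw.decode(FS_ENCODING) == name

# The ASCII codec (what a C/POSIX locale traditionally implies) cannot encode it.
try:
    name.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

Tools that derive the file name encoding from the active locale instead would hit the second case when run under LANG=C.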
-- 
Thanks,
Zac



Re: [gentoo-dev] UTF-8 locale by default

2012-08-07 Thread Dan Douglas
On Friday, August 03, 2012 07:16:45 AM Luca Barbato wrote:
 On 07/27/2012 07:24 PM, Mike Frysinger wrote:
  yes, and i'm waiting on the POSIX group to formalize C.UTF-8.  that's the
  only real option in my mind for making unicode the default.  any other
  amalgamations of various locales is ugly as sin.
 
 When do they meet? I'd be fine with a pre-release =P
 
 lu
 

2008 TC1 is just finishing up balloting as we speak. If this isn't already in
there, you may be in for a long wait. Feel free to subscribe to the
austin-group lists; they're open to anyone. A calendar with the teleconference
schedule is available.
--
Dan Douglas



Re: [gentoo-dev] UTF-8 locale by default

2012-08-03 Thread Matthew Summers
On Thu, Aug 2, 2012 at 1:32 PM, Mike Gilbert flop...@gentoo.org wrote:
 On Thu, Aug 2, 2012 at 2:21 PM, Diego Elio Pettenò
 flamee...@flameeyes.eu wrote:
 On 01/08/2012 23:42, Fabian Groffen wrote:
 Honestly, if some asian person has whatever charset that I often find in
 spam messages, but is not UTF-8, are you then going to tell that person
 to switch to UTF-8 to get those python packages emerged?  I hope not.

 Tell that to the Python team I guess. My tinderbox _has_ utf8 locales
 available, but doesn't set it by default - Python stuff fails to build
 or test - not going to be fixed with "change your locale" reasoning.

 Is it mental? Yes.
 Would I like that to change? Yes.
 Do I care whether that's through the use of cluebyfour on the Python
 team or by setting a utf-8 locale by default? Not in the least.


 Please apply the cluebyfour to the upstream developers of python and
 python modules. :-)

 I do try to fix unicode problems if I run into them. However,
 sometimes it just isn't worth the effort.


Python upstream is doing what they think is best in using unicode.

That said, what if we just temporarily set a locale in the ebuild for
running tests and elsewhere? Is this unreasonable or impossible? It
might not be a great solution, this method, since users' stuff will
still break.

Further, I support the use of C.UTF-8 when it is ready. It seems like
the lowest common denominator to me.


-- 
Matthew W. Summers
Gentoo Foundation Inc.



Re: [gentoo-dev] UTF-8 locale by default

2012-08-03 Thread Michał Górny
On Fri, 3 Aug 2012 09:59:42 -0500
Matthew Summers quantumsumm...@gentoo.org wrote:

 On Thu, Aug 2, 2012 at 1:32 PM, Mike Gilbert flop...@gentoo.org
 wrote:
  On Thu, Aug 2, 2012 at 2:21 PM, Diego Elio Pettenò
  flamee...@flameeyes.eu wrote:
  On 01/08/2012 23:42, Fabian Groffen wrote:
  Honestly, if some asian person has whatever charset that I often
  find in spam messages, but is not UTF-8, are you then going to
  tell that person to switch to UTF-8 to get those python packages
  emerged?  I hope not.
 
  Tell that to the Python team I guess. My tinderbox _has_ utf8
  locales available, but doesn't set it by default - Python stuff
  fails to build or test - not going to be fixed with "change your
  locale" reasoning.
 
  Is it mental? Yes.
  Would I like that to change? Yes.
  Do I care whether that's through the use of cluebyfour on the
  Python team or by setting a utf-8 locale by default? Not in the
  least.
 
 
  Please apply the cluebyfour to the upstream developers of python and
  python modules. :-)
 
  I do try to fix unicode problems if I run into them. However,
  sometimes it just isn't worth the effort.
 
 
 Python upstream is doing what they think is best in using unicode.
 
 That said, what if we just temporarily set a locale in the ebuild for
 running tests and elsewhere? Is this unreasonable or impossible? It
 might not be a great solution, this method, since users' stuff will
 still break.

It is impossible because you can't know which locale a particular
system has available. AFAIK there's no 'it-will-always-work' choice;
unless we're going to enforce generating some common locale, or do very
ugly things.

 
 Further, I support the use of C.UTF-8 when it is ready. It seems like
 the lowest common denominator to me.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] UTF-8 locale by default

2012-08-03 Thread Alexis Ballier
On Fri, 3 Aug 2012 17:47:24 +0200
Michał Górny mgo...@gentoo.org wrote:
  Python upstream is doing what they think is best in using unicode.
  
  That said, what if we just temporarily set a locale in the ebuild
  for running tests and elsewhere? Is this unreasonable or
  impossible? It might not be a great solution, this method, since
  users' stuff will still break.
 
 It is impossible because you can't know which locale a particular
 system has available. AFAIK there's no 'it-will-always-work' choice;
 unless we're going to enforce generating some common locale, or do
 very ugly things.

I don't think anyone will object to enforcing a given locale to be
present, even en_US.UTF-8; people will object if they have to use that
locale.

Maybe locale-gen can even generate it on-the-fly in $T, I don't know.

A.



Re: [gentoo-dev] UTF-8 locale by default

2012-08-03 Thread Diego Elio Pettenò
On 03/08/2012 09:54, Alexis Ballier wrote:
 I don't think anyone will object to enforcing a given locale to be
 present, even en_US.UTF-8; people will object if they have to use that
 locale.
 
 Maybe locale-gen can even generate it on-the-fly in $T, I don't know.

Agreed. And there _is_ a way to tell which locales are available:
`locale -a`.
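
As a rough sketch of how a build tool could use that, the Python below picks a UTF-8 locale out of a `locale -a` listing. The function names and the preference order (`C.UTF-8`, then `en_US.UTF-8`, then any UTF-8 locale) are assumptions made for illustration, not anything portage or an eclass does today.

```python
import subprocess


def pick_utf8_locale(available, preferred=("C.UTF-8", "en_US.UTF-8")):
    """Return a UTF-8 locale name from `available`, or None if there is none."""
    def norm(name):
        # `locale -a` may print "en_US.utf8" where users write "en_US.UTF-8".
        return name.lower().replace("-", "")

    by_norm = {norm(n): n for n in available}
    # First try the preferred names, in order.
    for want in preferred:
        if norm(want) in by_norm:
            return by_norm[norm(want)]
    # Otherwise fall back to the first UTF-8 locale listed.
    for n in available:
        if norm(n).endswith(".utf8"):
            return n
    return None


def installed_locales():
    """Ask the system which locales exist (requires the `locale` utility)."""
    out = subprocess.run(["locale", "-a"], capture_output=True, text=True, check=True)
    return out.stdout.split()
```

A caller would use `pick_utf8_locale(installed_locales())` and still has to decide what to do when the result is `None`, which is exactly the case Michał is worried about.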

-- 
Diego Elio Pettenò — Flameeyes
flamee...@flameeyes.eu — http://blog.flameeyes.eu/



Re: [gentoo-dev] UTF-8 locale by default

2012-08-02 Thread Fabian Groffen
On 01-08-2012 21:00:23 -0400, Mike Gilbert wrote:
 Diego mentioned the python issue.

Honestly, if some asian person has whatever charset that I often find in
spam messages, but is not UTF-8, are you then going to tell that person
to switch to UTF-8 to get those python packages emerged?  I hope not.

There is a difference between "there is a UTF-8 locale available on the
system" and "the en_US.UTF-8 locale is in effect".

Fabian

-- 
Fabian Groffen
Gentoo on a different level




Re: [gentoo-dev] UTF-8 locale by default

2012-08-02 Thread Stelian Ionescu
On Thu, 2012-08-02 at 08:42 +0200, Fabian Groffen wrote:
 On 01-08-2012 21:00:23 -0400, Mike Gilbert wrote:
  Diego mentioned the python issue.
 
 Honestly, if some asian person has whatever charset that I often find in
 spam messages, but is not UTF-8, are you then going to tell that person
 to switch to UTF-8 to get those python packages emerged?  I hope not.

Yes.

-- 
Stelian Ionescu a.k.a. fe[nl]ix
Quidquid latine dictum sit, altum videtur.
http://common-lisp.net/project/iolib





Re: [gentoo-dev] UTF-8 locale by default

2012-08-02 Thread Diego Elio Pettenò
On 01/08/2012 23:42, Fabian Groffen wrote:
 Honestly, if some asian person has whatever charset that I often find in
 spam messages, but is not UTF-8, are you then going to tell that person
 to switch to UTF-8 to get those python packages emerged?  I hope not.

Tell that to the Python team I guess. My tinderbox _has_ utf8 locales
available, but doesn't set it by default - Python stuff fails to build
or test - not going to be fixed with "change your locale" reasoning.

Is it mental? Yes.
Would I like that to change? Yes.
Do I care whether that's through the use of cluebyfour on the Python
team or by setting a utf-8 locale by default? Not in the least.

-- 
Diego Elio Pettenò — Flameeyes
flamee...@flameeyes.eu — http://blog.flameeyes.eu/





Re: [gentoo-dev] UTF-8 locale by default

2012-08-02 Thread Mike Gilbert
On Thu, Aug 2, 2012 at 2:21 PM, Diego Elio Pettenò
flamee...@flameeyes.eu wrote:
 On 01/08/2012 23:42, Fabian Groffen wrote:
 Honestly, if some asian person has whatever charset that I often find in
 spam messages, but is not UTF-8, are you then going to tell that person
 to switch to UTF-8 to get those python packages emerged?  I hope not.

 Tell that to the Python team I guess. My tinderbox _has_ utf8 locales
 available, but doesn't set it by default - Python stuff fails to build
 or test - not going to be fixed with "change your locale" reasoning.

 Is it mental? Yes.
 Would I like that to change? Yes.
 Do I care whether that's through the use of cluebyfour on the Python
 team or by setting a utf-8 locale by default? Not in the least.


Please apply the cluebyfour to the upstream developers of python and
python modules. :-)

I do try to fix unicode problems if I run into them. However,
sometimes it just isn't worth the effort.



Re: [gentoo-dev] UTF-8 locale by default

2012-08-02 Thread Alexis Ballier
On Thu, 02 Aug 2012 11:21:40 -0700
Diego Elio Pettenò flamee...@flameeyes.eu wrote:

 On 01/08/2012 23:42, Fabian Groffen wrote:
  Honestly, if some asian person has whatever charset that I often
  find in spam messages, but is not UTF-8, are you then going to tell
  that person to switch to UTF-8 to get those python packages
  emerged?  I hope not.
 
 Tell that to the Python team I guess. My tinderbox _has_ utf8 locales
 available, but doesn't set it by default - Python stuff fails to
 build or test - not going to be fixed with "change your locale"
 reasoning.

Not that it is hard to set LC_ALL=something before running the failing
command, or to make the PM do it... We already fix regexp bugs with other
locales (or work around them by setting LC_ALL=C); this falls under the
same category.
You just need to teach people, and maybe mandate a UTF-8 locale to be
present; the same way they do not consider Estonian alphabet ordering
'broken', they would not consider not having a UTF-8 locale 'broken',
esp. when said UTF-8 is far from being optimal in terms of size for Asian
languages.
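
For illustration, overriding LC_ALL for a single command without touching the user's own settings could look like the Python sketch below. The helper name `run_with_locale` is invented for the example, and the child command is just a stand-in for "the failing command".

```python
import os
import subprocess
import sys


def run_with_locale(cmd, lc_all):
    """Run `cmd` with LC_ALL overridden for that one process only."""
    # Copy the current environment and override LC_ALL in the copy.
    env = dict(os.environ, LC_ALL=lc_all)
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Demo: the child sees the override; the parent's environment is not modified.
child = [sys.executable, "-c", "import os; print(os.environ['LC_ALL'])"]
result = run_with_locale(child, "C")
```

The same pattern works with a UTF-8 locale name instead of "C", provided that locale is actually generated on the system.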

A.



Re: [gentoo-dev] UTF-8 locale by default

2012-08-02 Thread Kent Fredric
On 31 July 2012 05:33, Michael Orlitzky mich...@orlitzky.com wrote:
 On 07/30/12 12:28, Michał Górny wrote:

 My point here is that you want the thing to change. So you first try to
 convince people here to change. We practically did a small survey here
 and in the result we didn't agree on doing the change.

 So you're saying we should do another survey on another group, hoping
 that this time the result will be on your side.

 We didn't do a survey, we asked,

   Is there a reason for not using at least en_US.UTF-8 as a sane
   default value?

 Unsurprisingly, the responses contained reasons for not using
 en_US.UTF-8 as the default.


I think it's a shame that:

1. the current handbook way to change the timezone is manually editing a file.
2. the handbook doesn't mention `eselect locale`.
3. `eselect locale list` is useless if you have *all* locales available to you.
4. `eselect locale` can only set the LANG variable.
5. eselect doesn't have an interactive mode yet.

Why? Because this problem could be made simpler by providing a way to
use a recommended locale for your timezone, which is likely to yield a
more sane default for that timezone.

It would also make it easier to validate the value the user chooses
for their Timezone value.

Consider:

eselect timezone list
# all level 1 timezones + groups, i.e. like ls /usr/share/zoneinfo
eselect timezone list America/
# contents of /usr/share/zoneinfo/America
eselect timezone set America/Chicago
# /etc/timezone is updated to 'America/Chicago'
# /etc/localtime is replaced with /usr/share/zoneinfo/America/Chicago
eselect locale set --all auto
# LANG and LC_* are set using the values defined as default for America/Chicago
eselect locale set --ctype auto
# only LC_CTYPE is autopopulated
eselect locale list
# 600 items because you have a vanilla locale.defs
eselect locale list --timezone
# shows a list of LOCALE values for the current TZ, with the one that
# would be used as default first/marked up differently
eselect locale list en
# shows English locale options
eselect locale set --ctype en_US.utf8


The benefits of setting these locales this way are obvious, to me at
least: you can set locales to a sensible value automatically.
You can also validate a user's choice of locale and provide feedback;
for example, you can list non-installed locales, and if the user
tries to use a locale that isn't installed yet, tell them they need to
update locales.def
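
A very rough Python sketch of the lookup and validation step described above follows. The `RECOMMENDED` table is entirely hypothetical (a real tool would need a maintained data source), as is the function name; nothing like this exists in eselect today.

```python
# Hypothetical timezone -> locale table, for illustration only.
RECOMMENDED = {
    "America/Chicago": ["en_US.utf8", "es_US.utf8"],
    "America/New_York": ["en_US.utf8", "es_US.utf8"],
    "Europe/Warsaw": ["pl_PL.utf8"],
}


def recommend_locale(timezone, installed):
    """Return (locale, needs_generation) for `timezone`, or (None, False)."""
    candidates = RECOMMENDED.get(timezone, [])
    # Prefer a recommended locale that is already installed.
    for candidate in candidates:
        if candidate in installed:
            return candidate, False
    # Otherwise suggest the first one and flag that it must be generated first.
    if candidates:
        return candidates[0], True
    return None, False
```

For example, `recommend_locale("Europe/Warsaw", {"C"})` yields `("pl_PL.utf8", True)`, which is the point where the tool would tell the user to add the locale and regenerate.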

The only way I can suggest something better would be an interactive
locale setter, something like 'tzselect', except that it sets timezone
*and* locale information, with the ability to automatically update
locales.def, add new locale definitions, and regenerate the locale
database.

This way, you could have a selection process more like this:

https://gist.github.com/3240866

#? 1

The following information has been given:

United States
Eastern Time

Therefore TZ='America/New_York' will be used.
Local time is now: Thu Aug 2 17:33:17 EDT 2012.
Universal Time is now: Thu Aug 2 21:33:17 UTC 2012.
Is the above information OK?
1) Yes
2) No
#? 1
Your Current locale settings are:

LANG=POSIX

The recommended settings for your locale are :
LANG=en_US.utf8
LC_CTYPE=en_US.utf8

Do you wish to change your locale settings at this time?
1) No
2) Yes - Use recommended settings
3) Yes - Configure locale interactively.

At least this way, the effort required to configure your system into a
very good logical UTF8 default is trivial.

-- 
Kent

perl -e "print substr( \"edrgmaM  SPA NOcomil.ic\\@tfrken\", \$_ * 3,
3 ) for ( 9,8,0,7,1,6,5,4,3,2 );"

http://kent-fredric.fox.geek.nz



Re: [gentoo-dev] UTF-8 locale by default

2012-08-02 Thread Luca Barbato
On 07/27/2012 07:24 PM, Mike Frysinger wrote:
 yes, and i'm waiting on the POSIX group to formalize C.UTF-8.  that's the
 only real option in my mind for making unicode the default.  any other
 amalgamations of various locales is ugly as sin.

When do they meet? I'd be fine with a pre-release =P

lu




Re: [gentoo-dev] UTF-8 locale by default

2012-08-01 Thread Andreas K. Huettel

 
 If it turns out that C or POSIX is the most common response, we should
 then default the locale to en_US.UTF-8 if we really want to default to
 a UTF-8 setting. The reason being it makes sense to have the default
 locale set to the country of origin, which in our case is the United
 States.
 

Given the number of Gentoo devs (especially on the desktop side, where this
matters most) from other parts of the world, that's not really a valid
argument, particularly in cases such as the paper size setting, where
US stubbornness basically stands against the rest of the planet.

-- 

Andreas K. Huettel
Gentoo Linux developer 
dilfri...@gentoo.org
http://www.akhuettel.de/





Re: [gentoo-dev] UTF-8 locale by default

2012-08-01 Thread Michael Orlitzky
On 08/01/12 16:18, Andreas K. Huettel wrote:
 

 If it turns out that C or POSIX is the most common response, we should
 then default the locale to en_US.UTF-8 if we really want to default to
 a UTF-8 setting. The reason being it makes sense to have the default
 locale set to the country of origin, which in our case is the United
 States.

 
 Given the number of Gentoo devs (especially on the desktop side, where this
 matters most) from other parts of the world, that's not really a valid
 argument, particularly in cases such as the paper size setting, where
 US stubbornness basically stands against the rest of the planet.
 

Every locale is wrong for somebody; the idea was that by taking a
survey, you could make it wrong for the least amount of people (by default).

If the majority of users use a stupid paper size, the best default is
still whatever they use regardless of any personal preferences.



Re: [gentoo-dev] UTF-8 locale by default

2012-08-01 Thread Walter Dnes
On Wed, Aug 01, 2012 at 04:29:42PM -0400, Michael Orlitzky wrote

 Every locale is wrong for somebody; the idea was that by taking
 a survey, you could make it wrong for the least amount of people
 (by default).

  Question... has anybody ever considered that maybe a POSIX locale
is wrong for the least amount of people???  There's also a very damning
statement in the post that started this thread...

On Thu, Jul 19, 2012 at 11:39:59PM +0200, Sascha Cunz wrote
 I recently discovered that I for some reason haven't noticed the
 warning about setting the locale to utf-8 in the gentoo handbook for
 obviously several years; thus i was still running all my systems in
 a POSIX locale since i never cared much about it.
 
 However, since I noticed, I talked to several people about it; all
 of them stating as first response: "Not shipping with a utf-8 locale
 turned on by default nowadays probably is a bug in your distro"

  That's right... the poster was running a POSIX locale for several
years ***AND DID NOT HAVE ANY PROBLEMS RELATED TO IT***.  Then several
people said "Not shipping with a utf-8 locale turned on by default
nowadays probably is a bug in your distro".  And suddenly it's a
problem.  What's next?  Despite running with no problems for many years
with a separate /usr and no initramfs, will we have several people
come along and tell us that it's a bug in our distro?  Oh... wait...

  The fact that other distros do it does not constitute justification
for us to do it.  If I wanted to run Redhat or Ubuntu, I'd run Redhat or
Ubuntu.  We're ignoring a very basic question here... what problems does
shipping with a POSIX locale cause that would be fixed by setting a UTF8
default locale???  I want a real answer.  Not something along the lines
of "But daddy, all the other kids are doing it".

-- 
Walter Dnes waltd...@waltdnes.org



Re: [gentoo-dev] UTF-8 locale by default

2012-08-01 Thread Mike Gilbert
On Wed, Aug 1, 2012 at 8:20 PM, Walter Dnes waltd...@waltdnes.org wrote:
 We're ignoring a very basic question here... what problems does
 shipping with a POSIX locale cause that would be fixed by setting a UTF8
 default locale???  I want a real answer.  Not something along the lines
 of "But daddy, all the other kids are doing it".


Try reading the rest of the thread before posting a rant.

Diego mentioned the python issue. As well, there are many test suites
that malfunction without a UTF-8 or en_US.UTF-8 locale. If you hunt
through Bugzilla, you can probably dig up other issues.



Re: [gentoo-dev] UTF-8 locale by default

2012-08-01 Thread Peter Stuge
Walter Dnes wrote:
 The fact that other distros do it does not constitute
 justification for us to do it.

Unfortunately that exact reason, along with "Fedora is doing it", was
cited by a very active developer as a reason to reject technical points
which I tried to make a few times.

But that is off-topic. Let's leave it for later. All I'm saying is
don't underestimate pack mentality.


//Peter



Re: [gentoo-dev] UTF-8 locale by default

2012-08-01 Thread Sergey Popov
02.08.2012 04:20, Walter Dnes wrote:
   That's right... the poster was running a POSIX locale for several
 years ***AND DID NOT HAVE ANY PROBLEMS RELATED TO IT***.  
This discussion is very similar to one that I saw in the Russian
Linux community some years ago about migrating from ru_RU.KOI8-R to
ru_RU.UTF-8. The arguments from the KOI8-R guys were the same: "Why should
we change something if it works?" They also did not notice
fundamental problems with some vitally important packages, which cannot
be replaced or need to be heavily patched to work properly. The arguments
from the UTF-8 guys were not ideal, but the locale change broke only old or
unsupported packages, so they won.

P.S. I do not think that the comparison with the 'initramfs and separate /usr
problem' is correct in this case. A default locale change is evolution,
not revolution...





Re: [gentoo-dev] UTF-8 locale by default

2012-07-31 Thread Michael Orlitzky
On 07/30/12 15:02, Walter Dnes wrote:
 Would forcing UTF-8 cause problems for packages that expect
 specific ISO encodings in X fonts?

Not that I know of (and setting a default wouldn't force anything).

xfreecell's readme states "Make sure there is a font named 7x14", and
another thread mentions that this is provided by
media-fonts/font-misc-misc, so that sounds like a bug in the ebuild to me.



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michael Orlitzky
On 07/27/12 16:16, Aaron W. Swenson wrote:
 
 No user will be happy with whatever we decide to use as a default.

The defaults should be what's best for the most people, with a bias
towards safety. Why don't we just take a survey and choose the most
common utf8 response?



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michael Mol
On Mon, Jul 30, 2012 at 10:35 AM, Michael Orlitzky mich...@orlitzky.com wrote:
 On 07/27/12 16:16, Aaron W. Swenson wrote:

 No user will be happy with whatever we decide to use as a default.

 The defaults should be what's best for the most people, with a bias
 towards safety. Why don't we just take a survey and choose the most
 common utf8 response?

You'd really want a "which do you prefer, which can you use"
survey, then; you don't really want to choose the result preferred by
the most people, rather you want the result which is usable by the
most people.

-- 
:wq



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michał Górny
On Mon, 30 Jul 2012 10:35:36 -0400
Michael Orlitzky mich...@orlitzky.com wrote:

 On 07/27/12 16:16, Aaron W. Swenson wrote:
  
  No user will be happy with whatever we decide to use as a default.
 
 The defaults should be what's best for the most people, with a bias
 towards safety. Why don't we just take a survey and choose the most
 common utf8 response?

How can you take a survey like that? How will you ensure it actually
hits the majority? How will you define the majority?

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michael Orlitzky
On 07/30/12 10:41, Michał Górny wrote:
 On Mon, 30 Jul 2012 10:35:36 -0400
 Michael Orlitzky mich...@orlitzky.com wrote:
 
 On 07/27/12 16:16, Aaron W. Swenson wrote:

 No user will be happy with whatever we decide to use as a default.

 The defaults should be what's best for the most people, with a bias
 towards safety. Why don't we just take a survey and choose the most
 common utf8 response?
 
 How can you take a survey like that? How will you ensure it actually
 hits the majority? How will you define the majority?
 

Considering that the alternative is to force everyone to change it
manually, you can do it however you want and it'll be an improvement.

  1) Create a webpage with a bunch of options, count the results

  2) Ask the g.o mailing lists, count responses manually

  3) Use google docs like the website survey that went out a few days
 ago

It won't hit everyone, but no survey ever does. As long as you get a
large enough unbiased sample, it doesn't matter. And anything would be
an improvement, so it doesn't matter anyway.



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michael Mol
On Mon, Jul 30, 2012 at 10:41 AM, Michał Górny mgo...@gentoo.org wrote:
 On Mon, 30 Jul 2012 10:35:36 -0400
 Michael Orlitzky mich...@orlitzky.com wrote:

 On 07/27/12 16:16, Aaron W. Swenson wrote:
 
  No user will be happy with whatever we decide to use as a default.

 The defaults should be what's best for the most people, with a bias
 towards safety. Why don't we just take a survey and choose the most
 common utf8 response?

 How can you take a survey like that? How will you ensure it actually
 hits the majority? How will you define the majority?

Serverside script on gentoo.org. Push out a news item with the URL and
a last-call date. Tabulate the results, using browser fingerprints to
weed out the bulk of duplicates.

-- 
:wq



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Rich Freeman
On Mon, Jul 30, 2012 at 10:42 AM, Michael Mol mike...@gmail.com wrote:

 You'd really want a "which do you prefer, which can you use"
 survey, then; you don't really want to choose the result preferred by
 the most people, rather you want the result which is usable by the
 most people.

I tend to agree.  Donnie said something in his manifesto which I think
applies here: any of the proposed solutions is probably better than
doing nothing.

If I forget to tweak my locale and I end up with a comma as a decimal
mark it isn't the end of the world, and neither is some output in
metric units.  I've ended up working on many a global system where
times get reported in GMT and people put up with the inconvenience
because they realize that any standard is better than no standard.

What is the real end-user impact of any of this stuff anyway?  During
the install the thing that matters is being able to partition disks
and compile kernels and such.  I doubt that too many users will be
dependent on installer locale settings for displaying weather reports
or such.  If they don't set locale, then it is like not setting
localtime - you just get to live with some default.  I would imagine
that at least by having a UTF-8 locale users would be able to do
things like set full names of users using unicode, etc.

Rich



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Aaron W. Swenson

On 07/30/2012 11:04 AM, Michael Mol wrote:
 On Mon, Jul 30, 2012 at 10:41 AM, Michał Górny mgo...@gentoo.org
 wrote:
 On Mon, 30 Jul 2012 10:35:36 -0400 Michael Orlitzky
 mich...@orlitzky.com wrote:
 
 On 07/27/12 16:16, Aaron W. Swenson wrote:
 
 No user will be happy with whatever we decide to use as a
 default.
 
 The defaults should be what's best for the most people, with a
 bias towards safety. Why don't we just take a survey and choose
 the most common utf8 response?
 
 How can you take a survey like that? How will you ensure it
 actually hits the majority? How will you define the majority?
 
 Serverside script on gentoo.org. Push out a news item with the URL
 and a last-call date. Tabulate the results, using browser
 fingerprints to weed out the bulk of duplicates.
 

I still advocate continuing how we have been.

However, the survey should be one question: "What is the output of
`locale` on your workstation/desktop/laptop?"

The less painful we make the survey, the more respondents we'll get,
and the less biased the results will be. Additionally, it makes the
responses easy to parse with a script.
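
As a sketch of how little scripting that would take, the Python below parses pasted `locale` output and tallies the effective character-type locale per response. The function names are invented for the example, and the sample responses in the usage note are made up, not survey data.

```python
from collections import Counter


def parse_locale_output(text):
    """Parse `locale` output (NAME=value lines) into a dict, stripping quotes."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            settings[name.strip()] = value.strip().strip('"')
    return settings


def effective_ctype(settings):
    """LC_ALL overrides LC_CTYPE, which overrides LANG; empty means POSIX."""
    for key in ("LC_ALL", "LC_CTYPE", "LANG"):
        if settings.get(key):
            return settings[key]
    return "POSIX"


def tally(responses):
    """Count how many responses end up with each effective locale."""
    return Counter(effective_ctype(parse_locale_output(r)) for r in responses)
```

Feeding it a list of pasted responses, e.g. `tally(['LANG=en_US.utf8\nLC_ALL=\n', 'LANG=\nLC_CTYPE="POSIX"\n'])`, gives a `Counter` whose `most_common()` is the ranking the survey is after.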

Servers are excluded because special things take place there that may
not actually line up with what the user prefers.

If it turns out that C or POSIX is the most common response, we should
then default the locale to en_US.UTF-8 if we really want to default to
a UTF-8 setting. The reason being it makes sense to have the default
locale set to the country of origin, which in our case is the United
States.

Yes, it may irk those whose native locale is not en_US.UTF-8, but like
I said, no one will be happy. Except for those whose native locale
happens to be the default.

Start at a default, doesn't really matter which as long as the default
is the lingua franca of international business, and instruct the user,
as we already do, how to change it during the setup.

-- 
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email: titanof...@gentoo.org
GnuPG FP : 2C00 7719 4F85 FB07 A49C  0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michał Górny
On Mon, 30 Jul 2012 10:50:29 -0400
Michael Orlitzky mich...@orlitzky.com wrote:

 On 07/30/12 10:41, Michał Górny wrote:
  On Mon, 30 Jul 2012 10:35:36 -0400
  Michael Orlitzky mich...@orlitzky.com wrote:
  
  On 07/27/12 16:16, Aaron W. Swenson wrote:
 
  No user will be happy with whatever we decide to use as a default.
 
  The defaults should be what's best for the most people, with a bias
  towards safety. Why don't we just take a survey and choose the most
  common utf8 response?
  
  How can you take a survey like that? How will you ensure it actually
  hits the majority? How will you define the majority?
  
 
 Considering that the alternative is to force everyone to change it
 manually, you can do it however you want and it'll be an improvement.

My point here is that you want the thing to change. So you first try to
convince people here to change. We practically did a small survey here
and in the result we didn't agree on doing the change.

So you're saying we should do another survey on another group, hoping
that this time the result will be on your side.

   1) Create a webpage with a bunch of options, count the results
 
   2) Ask the g.o mailing lists, count responses manually
 
   3) Use google docs like the website survey that went out a few days
  ago
 
 It won't hit everyone, but no survey ever does. As long as you get a
 large enough unbiased sample, it doesn't matter. And anything would be
 an improvement, so it doesn't matter anyway.

It depends on who the 'unbiased sample' is. Are you interested only in
opinion of Gentoo users who visit the website? Who sync once a day?
Once a week? Who follow Gentoo Planet? Who participate in the forums?

We can create the survey and announce it everywhere. But it still won't
catch many old-time Gentoo users who can actually have something
opposite to say. It won't be unbiased.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michael Mol
On Mon, Jul 30, 2012 at 12:28 PM, Michał Górny mgo...@gentoo.org wrote:
 On Mon, 30 Jul 2012 10:50:29 -0400
 Michael Orlitzky mich...@orlitzky.com wrote:

 On 07/30/12 10:41, Michał Górny wrote:
  On Mon, 30 Jul 2012 10:35:36 -0400
  Michael Orlitzky mich...@orlitzky.com wrote:
 
  On 07/27/12 16:16, Aaron W. Swenson wrote:
 
  No user will be happy with whatever we decide to use as a default.
 
  The defaults should be what's best for the most people, with a bias
  towards safety. Why don't we just take a survey and choose the most
  common utf8 response?
 
  How can you take a survey like that? How will you ensure it actually
  hits the majority? How will you define the majority?
 

 Considering that the alternative is to force everyone to change it
 manually, you can do it however you want and it'll be an improvement.

 My point here is that you want the thing to change. So you first try to
 convince people here to change. We practically did a small survey here
 and in the result we didn't agree on doing the change.

 So you're saying we should do another survey on another group, hoping
 that this time the result will be on your side.

   1) Create a webpage with a bunch of options, count the results

   2) Ask the g.o mailing lists, count responses manually

   3) Use google docs like the website survey that went out a few days
  ago

 It won't hit everyone, but no survey ever does. As long as you get a
 large enough unbiased sample, it doesn't matter. And anything would be
 an improvement, so it doesn't matter anyway.

 It depends on who the 'unbiased sample' is. Are you interested only in
 opinion of Gentoo users who visit the website? Who sync once a day?
 Once a week? Who follow Gentoo Planet? Who participate in the forums?

 We can create the survey and announce it everywhere. But it still won't
 catch many old-time Gentoo users who can actually have something
 opposite to say. It won't be unbiased.

I was thinking about this, and I suspect that a survey period of 1-2
months is likely fine. It should also be enough to scoop up people who
run servers and monitor those servers for security updates.

-- 
:wq



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Michael Orlitzky
On 07/30/12 12:28, Michał Górny wrote:
 
 My point here is that you want the thing to change. So you first try to
 convince people here to change. We practically did a small survey here
 and in the result we didn't agree on doing the change.
 
 So you're saying we should do another survey on another group, hoping
 that this time the result will be on your side.

We didn't do a survey; we asked:

  Is there a reason for not using at least en_US.UTF-8 as a sane
   default value?

Unsurprisingly, the responses contained reasons for not using
en_US.UTF-8 as the default.

Don't take my original reply out of context; I don't actually care what
we have as the default.


 
 It depends on who the 'unbiased sample' is. Are you interested only in
 opinion of Gentoo users who visit the website? Who sync once a day?
 Once a week? Who follow Gentoo Planet? Who participate in the forums?
 
 We can create the survey and announce it everywhere. But it still won't
 catch many old-time Gentoo users who can actually have something
 opposite to say. It won't be unbiased.

The technical objection to C.UTF-8 is that it's non-standard. OK. What
are the technical objections to LC_CTYPE=en_US.UTF-8? If every
alternative is an improvement over the current default, the survey
statistics are irrelevant.



Re: [gentoo-dev] UTF-8 locale by default

2012-07-30 Thread Walter Dnes
On Mon, Jul 30, 2012 at 01:33:48PM -0400, Michael Orlitzky wrote

 The technical objection to C.UTF-8 is that it's non-standard, Ok.
 What are the technical objections to LC_CTYPE=en_US.UTF-8? If the
 alternatives are all improvements, the statistics are irrelevant.

  I ran into a problem several months ago with xfreecell not running.
Turned out the ISO8859-1 fonts were not being generated, just UTF-8.
xfreecell needs ISO8859-1 fonts.  And it's not the only package.  I
modified xorg-2.eclass so that font packages would build ISO8859-1.  See
http://article.gmane.org/gmane.linux.gentoo.user/252316/ for the gory
details.  Would forcing UTF-8 cause problems for packages that expect
specific ISO encodings in X fonts?

  The important part of the eclass mod was to manually enable iso8859-1
and disable all other encodings...

# Packages whose configure script mentions disable-all-encodings:
# build ISO8859-1 explicitly and switch every other encoding off.
if grep -q -s "disable-all-encodings" "${ECONF_SOURCE:-.}/configure"; then
    FONT_OPTIONS+="
        --enable-iso8859-1
        --disable-iso10646
        --disable-iso10646-1
        --disable-iso8859-2
        --disable-iso8859-3
        --disable-iso8859-4
        --disable-iso8859-5
        --disable-iso8859-6
        --disable-iso8859-7
        --disable-iso8859-8
        --disable-iso8859-9
        --disable-iso8859-10
        --disable-iso8859-11
        --disable-iso8859-12
        --disable-iso8859-13
        --disable-iso8859-14
        --disable-iso8859-15
        --disable-iso8859-16
        --disable-jisx0201
        --disable-koi8-r"
else
    # Other packages: iso8859-1 stays at the configure default,
    # every other encoding is switched off.
    FONT_OPTIONS+="
        --disable-iso10646
        --disable-iso10646-1
        --disable-iso8859-2
        --disable-iso8859-3
        --disable-iso8859-4
        --disable-iso8859-5
        --disable-iso8859-6
        --disable-iso8859-7
        --disable-iso8859-8
        --disable-iso8859-9
        --disable-iso8859-10
        --disable-iso8859-11
        --disable-iso8859-12
        --disable-iso8859-13
        --disable-iso8859-14
        --disable-iso8859-15
        --disable-iso8859-16
        --disable-jisx0201
        --disable-koi8-r"
fi
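
The same option string could also be generated instead of written out by
hand. This is only a sketch: font_encoding_options is a hypothetical helper
(it is not part of xorg-2.eclass), and the encoding names are simply the ones
listed above. Unlike the else branch above, it always passes an explicit
--enable for the encoding that is kept.

```shell
# Hypothetical helper: print configure options that enable one X font
# encoding and disable all the others named in the list above.
font_encoding_options() {
    keep=$1
    opts=
    for enc in iso10646 iso10646-1 \
        iso8859-1 iso8859-2 iso8859-3 iso8859-4 iso8859-5 iso8859-6 \
        iso8859-7 iso8859-8 iso8859-9 iso8859-10 iso8859-11 iso8859-12 \
        iso8859-13 iso8859-14 iso8859-15 iso8859-16 \
        jisx0201 koi8-r; do
        if [ "$enc" = "$keep" ]; then
            opts="$opts --enable-$enc"
        else
            opts="$opts --disable-$enc"
        fi
    done
    # Strip the leading space left over from the first append.
    printf '%s\n' "${opts# }"
}

font_encoding_options iso8859-1
```

In an eclass the result could presumably be appended with something like
FONT_OPTIONS+=" $(font_encoding_options iso8859-1)", but that has not been
tried against the real eclass.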

-- 
Walter Dnes waltd...@waltdnes.org



Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Ben de Groot
On 20 July 2012 06:28, Ulrich Mueller u...@gentoo.org wrote:
 On Thu, 19 Jul 2012, Sascha Cunz wrote:

 Is there a reason for not using at least en_US.UTF-8 as a sane
 default value?

 Because there's no one-size-fits-all locale, but it is specific to
 every system so the user must configure it?

While this is understandable, the fact remains that not having a
UTF-8 locale by default in our stage3 environment is sub-optimal.

I understand why the council rejected Debian's C.UTF-8 option,
but is there really no better default that we can use?

Without any default locale set, in practically all cases that means
that the user is presented with English, and mostly the American
variant. So, in practice, we are defaulting to en_US, just not in a
unicode environment. Correct me if I'm wrong.

Also, in most other places (such as our website, GLEPs, ebuilds)
we default to en_US.UTF-8.

So let's upgrade to en_US.UTF-8, which is for most users more
desirable than the current situation. Of course we will still advise
them to set their desired locales in /etc/locale.gen. But at least
they will start with a unicode environment, as expected anno 2012.
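
For reference, the per-system setup that users are asked to do today looks
roughly like the fragment below. The locale names are examples only, not a
proposal for the default, and the file names are the usual Gentoo ones; check
the handbook section referenced in this thread before relying on them.

```
# /etc/locale.gen: locales to build (run locale-gen afterwards)
en_US.UTF-8 UTF-8
en_GB.UTF-8 UTF-8

# /etc/env.d/02locale: system-wide setting (run env-update afterwards)
LANG="en_US.UTF-8"
LC_COLLATE="C"
```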


 The matter was recently discussed in this mailing list [1] and also in
 the March 2012 council meeting [2], and as a result the docs team has
 amended the respective section [3] of the handbook.

 Ulrich

 [1] 
 http://archives.gentoo.org/gentoo-dev/msg_2ffb7ea72e6209439600c371f6fc071d.xml
 [2] http://www.gentoo.org/proj/en/council/meeting-logs/20120313.txt
 [3] http://www.gentoo.org/doc/en/handbook/handbook-x86.xml?part=1&chap=8


-- 
Cheers,

Ben | yngwin
Gentoo developer
Gentoo Qt project lead, Gentoo Wiki admin



Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Ulrich Mueller
 On Fri, 27 Jul 2012, Ben de Groot wrote:

 I understand why the council rejected Debian's C.UTF-8 option,
 but is there really no better default that we can use?

 Without any default locale set, in practically all cases that means
 that the user is presented with English, and mostly the American
 variant. So, in practice, we are defaulting to en_US, just not in a
 unicode environment. Correct me if I'm wrong.

See below. We're not defaulting to en_US for things like the number
format.

 Also, in most other places (such as our website, GLEPs, ebuilds)
 we default to en_US.UTF-8.

 So let's upgrade to en_US.UTF-8, which is for most users more
 desirable than the current situation. Of course we will still advise
 them to set their desired locales in /etc/locale.gen. But at least
 they will start with a unicode environment, as expected anno 2012.

As I had pointed out before [1], changing from POSIX to an en_US
locale will have undesirable side effects, like commas as thousands
separators in numbers (because of LC_NUMERIC). Also the defaults of
en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.

So if we change the default (but I still don't see the need), we
should go for a less intrusive setting like:

   LANG=POSIX
   LC_CTYPE=en_US.utf8

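To spell out what this pair of settings does: under the POSIX rules a
non-empty LC_ALL overrides everything, a category variable such as LC_CTYPE
overrides LANG, and LANG covers whatever is left. A minimal sketch of that
lookup follows. effective_locale is a hypothetical helper written for
illustration; it only inspects the variables and does not check that the
named locale is actually installed.

```shell
# Hypothetical helper: print the value that applies to one locale category.
# Order of precedence: LC_ALL, then the category variable, then LANG.
effective_locale() {
    category=$1
    # Indirect lookup of the variable whose name is stored in $category.
    eval "specific=\${$category:-}"
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$specific" ]; then
        printf '%s\n' "$specific"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        # Nothing set: the default is implementation-defined; POSIX is
        # printed here as a stand-in.
        printf '%s\n' POSIX
    fi
}

# Apply the suggested default in a subshell so the caller's environment is
# left alone. A shell may warn on stderr if en_US.utf8 is not installed.
(
    unset LC_ALL LC_NUMERIC
    LANG=POSIX
    LC_CTYPE=en_US.utf8
    effective_locale LC_CTYPE      # en_US.utf8
    effective_locale LC_NUMERIC    # POSIX
)
```

So only character handling changes; number formatting, paper size and the
rest stay on POSIX.
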
Ulrich

[1] 
http://archives.gentoo.org/gentoo-dev/msg_56a438adde8efebd467ada5f858048ba.xml



Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Rick Zero_Chaos Farina
On 07/27/2012 03:08 AM, Ulrich Mueller wrote:
 
 As I had pointed out before [1], changing from POSIX to an en_US
 locale will have undesirable side effects, like commas as thousands
 separators in numbers (because of LC_NUMERIC). Also the defaults of
 en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
 
 So if we change the default (but I still don't see the need), we
 should go for a less intrusive setting like:
 
LANG=POSIX
LC_CTYPE=en_US.utf8

I would love to see a utf8 default, if the above is agreeable then I say +1

-Zero



Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Dan Douglas
On Friday, July 27, 2012 09:08:36 AM Ulrich Mueller wrote:
  On Fri, 27 Jul 2012, Ben de Groot wrote:
 
  I understand why the council rejected Debian's C.UTF-8 option,
  but is there really no better default that we can use?
 
  Without any default locale set, in practically all cases that means
  that the user is presented with English, and mostly the American
  variant. So, in practice, we are defaulting to en_US, just not in a
  unicode environment. Correct me if I'm wrong.
 
 See below. We're not defaulting to en_US for things like the number
 format.
 
  Also, in most other places (such as our website, GLEPs, ebuilds)
  we default to en_US.UTF-8.
 
  So let's upgrade to en_US.UTF-8, which is for most users more
  desirable than the current situation. Of course we will still advise
  them to set their desired locales in /etc/locale.gen. But at least
  they will start with a unicode environment, as expected anno 2012.
 
 As I had pointed out before [1], changing from POSIX to an en_US
 locale will have undesirable side effects, like commas as thousands
 separators in numbers (because of LC_NUMERIC). Also the defaults of
 en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
 
 So if we change the default (but I still don't see the need), we
 should go for a less intrusive setting like:
 
LANG=POSIX
LC_CTYPE=en_US.utf8
 
 Ulrich
 

You're concerned about the commas breaking things? Given that you usually need
to ask for them specifically (e.g., with the printf ' flag), and that kind of
output is usually meant for human consumption only, that seems unlikely. If
anything does rely upon the format, can't tolerate different locales, and fails
to specify LC_NUMERIC, then it's broken anyway.
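
For a script that does depend on the number format, pinning it is a one-liner.
The sketch below uses a hypothetical wrapper, with_c_numeric, that is not
taken from any existing tool. It sets LC_NUMERIC=C for one command and empties
LC_ALL, since a non-empty LC_ALL would otherwise override LC_NUMERIC. Note
that emptying LC_ALL also lets the other categories fall back to their own
variables or LANG.

```shell
# Hypothetical wrapper: run one command with the C numeric format,
# whatever locale the caller has configured.
with_c_numeric() {
    LC_ALL= LC_NUMERIC=C "$@"
}

# The child process sees LC_NUMERIC=C and an empty LC_ALL.
with_c_numeric sh -c 'echo "LC_NUMERIC=$LC_NUMERIC LC_ALL=$LC_ALL"'
```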

LC_MONETARY / LC_MEASUREMENT as en_US are probably slightly more annoying 
defaults for some people. What do users of other distros think? Is this really 
a serious problem for anyone?

LC_CTYPE=en_US.utf8 would be a bare minimum. The important bit is getting utf8 
by default. I can live with LANG=POSIX.
-- 
Dan Douglas



Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Ben de Groot
On 27 July 2012 16:06, Dan Douglas orm...@gmail.com wrote:
 On Friday, July 27, 2012 09:08:36 AM Ulrich Mueller wrote:
  On Fri, 27 Jul 2012, Ben de Groot wrote:

  I understand why the council rejected Debian's C.UTF-8 option,
  but is there really no better default that we can use?

  Without any default locale set, in practically all cases that means
  that the user is presented with English, and mostly the American
  variant. So, in practice, we are defaulting to en_US, just not in a
  unicode environment. Correct me if I'm wrong.

 See below. We're not defaulting to en_US for things like the number
 format.

  Also, in most other places (such as our website, GLEPs, ebuilds)
  we default to en_US.UTF-8.

  So let's upgrade to en_US.UTF-8, which is for most users more
  desirable than the current situation. Of course we will still advise
  them to set their desired locales in /etc/locale.gen. But at least
  they will start with a unicode environment, as expected anno 2012.

 As I had pointed out before [1], changing from POSIX to an en_US
 locale will have undesirable side effects, like commas as thousands
 separators in numbers (because of LC_NUMERIC). Also the defaults of
 en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.

 So if we change the default (but I still don't see the need), we
 should go for a less intrusive setting like:

LANG=POSIX
LC_CTYPE=en_US.utf8

 Ulrich


 You're concerned about the commas breaking things? Given that you usually need
 to specifically ask for them (i.e., printf ' flag), and that kind of output is
 usually going to be for human consumption only that seems unlikely. If
 anything does rely upon the format, can't tolerate different locales, and 
 fails
 to specify LC_NUMERIC then it's broken anyway.

 LC_MONETARY / LC_MEASUREMENT as en_US are probably slightly more annoying
 defaults for some people. What do users of other distros think? Is this really
 a serious problem for anyone?

 LC_CTYPE=en_US.utf8 would be a bare minimum. The important bit is getting utf8
 by default. I can live with LANG=POSIX.
 --
 Dan Douglas

How about the below?

LANG=en_GB.utf8
LC_COLLATE=C
LC_CTYPE=en_GB.utf8

That will give us A4 paper size and the metric system. If LC_NUMERIC is
really a problem, we can set it to something more desirable.
-- 
Cheers,

Ben | yngwin
Gentoo developer
Gentoo Qt project lead, Gentoo Wiki admin



Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Cyprien Nicolas
Ulrich Mueller wrote:
 On Fri, 27 Jul 2012, Ben de Groot wrote:

 So let's upgrade to en_US.UTF-8, which is for most users more
 desirable than the current situation. Of course we will still advise
 them to set their desired locales in /etc/locale.gen. But at least
 they will start with a unicode environment, as expected anno 2012.
 
 As I had pointed out before [1], changing from POSIX to an en_US
 locale will have undesirable side effects, like commas as thousands
 separators in numbers (because of LC_NUMERIC). Also the defaults of
 en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.

For this very reason my system locale is en_IE.UTF-8. Still English but
using Euro Monetary, Metric units, A4 paper, etc.

It might suit needs for most European installs, but not for everyone.

-- 
Cyprien / Fulax
Gentoo Lisp Project contributor




Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Michał Górny
On Fri, 27 Jul 2012 10:38:30 +0200
Cyprien Nicolas c.nico...@gmail.com wrote:

 Ulrich Mueller wrote:
  On Fri, 27 Jul 2012, Ben de Groot wrote:
 
  So let's upgrade to en_US.UTF-8, which is for most users more
  desirable than the current situation. Of course we will still
  advise them to set their desired locales in /etc/locale.gen. But
  at least they will start with a unicode environment, as expected
  anno 2012.
  
  As I had pointed out before [1], changing from POSIX to an en_US
  locale will have undesirable side effects, like commas as thousands
  separators in numbers (because of LC_NUMERIC). Also the defaults of
  en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
 
 For this very reason my system locale is en_IE.UTF-8. Still English
 but using Euro Monetary, Metric units, A4 paper, etc.
 
 It might suit needs for most European installs, but not for everyone.

Still uses ',' for thousands sep.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Michał Górny
On Fri, 27 Jul 2012 16:34:01 +0800
Ben de Groot yng...@gentoo.org wrote:

 On 27 July 2012 16:06, Dan Douglas orm...@gmail.com wrote:
  On Friday, July 27, 2012 09:08:36 AM Ulrich Mueller wrote:
   On Fri, 27 Jul 2012, Ben de Groot wrote:
 
   I understand why the council rejected Debian's C.UTF-8 option,
   but is there really no better default that we can use?
 
   Without any default locale set, in practically all cases that
   means that the user is presented with English, and mostly the
   American variant. So, in practice, we are defaulting to en_US,
   just not in a unicode environment. Correct me if I'm wrong.
 
  See below. We're not defaulting to en_US for things like the number
  format.
 
   Also, in most other places (such as our website, GLEPs, ebuilds)
   we default to en_US.UTF-8.
 
   So let's upgrade to en_US.UTF-8, which is for most users more
   desirable than the current situation. Of course we will still
   advise them to set their desired locales in /etc/locale.gen. But
   at least they will start with a unicode environment, as expected
   anno 2012.
 
  As I had pointed out before [1], changing from POSIX to an en_US
  locale will have undesirable side effects, like commas as thousands
  separators in numbers (because of LC_NUMERIC). Also the defaults of
  en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
 
  So if we change the default (but I still don't see the need), we
  should go for a less intrusive setting like:
 
 LANG=POSIX
 LC_CTYPE=en_US.utf8
 
  Ulrich
 
 
  You're concerned about the commas breaking things? Given that you
  usually need to specifically ask for them (i.e., printf ' flag),
  and that kind of output is usually going to be for human
  consumption only that seems unlikely. If anything does rely upon
  the format, can't tolerate different locales, and fails to specify
  LC_NUMERIC then it's broken anyway.
 
  LC_MONETARY / LC_MEASUREMENT as en_US are probably slightly more
  annoying defaults for some people. What do users of other distros
  think? Is this really a serious problem for anyone?
 
  LC_CTYPE=en_US.utf8 would be a bare minimum. The important bit is
  getting utf8 by default. I can live with LANG=POSIX.
  --
  Dan Douglas
 
 How about the below?
 
 LANG=en_GB.utf8
 LC_COLLATE=C
 LC_CTYPE=en_GB.utf8
 
 That will give us A4 paper size and the metric system. If LC_NUMERIC
 is really a problem, we can set it to something more desirable.

LC_NUMERIC=pl_PL.utf8

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Chí-Thanh Christopher Nguyễn
Ulrich Mueller schrieb:
 As I had pointed out before [1], changing from POSIX to an en_US
 locale will have undesirable side effects, like commas as thousands
 separators in numbers (because of LC_NUMERIC). Also the defaults of
 en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.

 So if we change the default (but I still don't see the need), we
 should go for a less intrusive setting like:

LANG=POSIX
LC_CTYPE=en_US.utf8

This would be better than LANG=en_US.utf8 but I would still prefer not
to have any country/region attached to the locale. The C.UTF-8 locale
which Debian uses for this purpose (a UTF-8 locale without side effects)
appears more suitable to me.


Best regards,
Chí-Thanh Christopher Nguyễn




Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Mike Frysinger
On Friday 27 July 2012 08:13:16 Chí-Thanh Christopher Nguyễn wrote:
 Ulrich Mueller schrieb:
  As I had pointed out before [1], changing from POSIX to an en_US
  locale will have undesirable side effects, like commas as thousands
  separators in numbers (because of LC_NUMERIC). Also the defaults of
  en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
  
  So if we change the default (but I still don't see the need), we
  should go for a less intrusive setting like:
  
 LANG=POSIX
 LC_CTYPE=en_US.utf8
 
 This would be better than LANG=en_US.utf8 but I would still prefer not
 to have any country/region attached to the locale. The C.UTF-8 locale
 which Debian uses for this purpose (a UTF-8 locale without side effects)
 appears more suitable to me.

yes, and i'm waiting on the POSIX group to formalize C.UTF-8.  that's the only 
real option in my mind for making unicode the default.  any other 
amalgamations of various locales is ugly as sin.
-mike




Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Pacho Ramos
El vie, 27-07-2012 a las 13:24 -0400, Mike Frysinger escribió:
 On Friday 27 July 2012 08:13:16 Chí-Thanh Christopher Nguyễn wrote:
  Ulrich Mueller schrieb:
   As I had pointed out before [1], changing from POSIX to an en_US
   locale will have undesirable side effects, like commas as thousands
   separators in numbers (because of LC_NUMERIC). Also the defaults of
   en_US for LC_MEASUREMENT and LC_PAPER are only useful in the U.S.
   
   So if we change the default (but I still don't see the need), we
   should go for a less intrusive setting like:
   
  LANG=POSIX
  LC_CTYPE=en_US.utf8
  
  This would be better than LANG=en_US.utf8 but I would still prefer not
  to have any country/region attached to the locale. The C.UTF-8 locale
  which Debian uses for this purpose (a UTF-8 locale without side effects)
  appears more suitable to me.
 
 yes, and i'm waiting on the POSIX group to formalize C.UTF-8.  that's the 
 only 
 real option in my mind for making unicode the default.  any other 
 amalgamations of various locales is ugly as sin.
 -mike

Do you have any idea how long that formalization could take?
If it will take a long time, maybe we could go with one of those
amalgamations :-/




Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Aaron W. Swenson
On 07/27/2012 02:29 PM, Pacho Ramos wrote:
 El vie, 27-07-2012 a las 13:24 -0400, Mike Frysinger escribió:
 On Friday 27 July 2012 08:13:16 Chí-Thanh Christopher Nguyễn
 wrote:
 Ulrich Mueller schrieb:
 As I had pointed out before [1], changing from POSIX to an
 en_US locale will have undesirable side effects, like commas
 as thousands separators in numbers (because of LC_NUMERIC).
 Also the defaults of en_US for LC_MEASUREMENT and LC_PAPER
 are only useful in the U.S.
 
 So if we change the default (but I still don't see the need),
 we
 
 should go for a less intrusive setting like: LANG=POSIX 
 LC_CTYPE=en_US.utf8
 
 This would be better than LANG=en_US.utf8 but I would still
 prefer not to have any country/region attached to the locale.
 The C.UTF-8 locale which Debian uses for this purpose (a UTF-8
 locale without side effects) appears more suitable to me.
 
 yes, and i'm waiting on the POSIX group to formalize C.UTF-8.
 that's the only real option in my mind for making unicode the
 default.  any other amalgamations of various locales is ugly as
 sin. -mike
 
 Do you have any idea about how much time could that formalization
 take? If it will take a long time, maybe we could go to that
 amalgamations :-/
 

Really, how much of an inconvenience is it that we don't use UTF-8 as
a default?

In my mind, it is sufficient that we instruct users how to set the
locale in the handbook.

No user will be happy with whatever we decide to use as a default. I
will be especially upset if we use the metric system instead of the
*STANDARD* system. It has 'standard' in the name for a reason people.
(^_^)

-- 
Mr. Aaron W. Swenson
Gentoo Linux Developer
Email: titanof...@gentoo.org
GnuPG FP : 2C00 7719 4F85 FB07 A49C  0E31 5713 AA03 D1BB FDA0
GnuPG ID : D1BBFDA0



Re: [gentoo-dev] UTF-8 locale by default

2012-07-27 Thread Diego Elio Pettenò
Il 27/07/2012 13:16, Aaron W. Swenson ha scritto:
 Really, how much of an inconvenience is it that we don't use UTF-8 as
 a default?

Given that there are a ton and a half of Python packages that do not
work with a non-utf8 locale, I'd say it's quite a thing.

So either we go with a UTF-8 default or somebody has to fix the
packages that don't work without it.

-- 
Diego Elio Pettenò — Flameeyes
flamee...@flameeyes.eu — http://blog.flameeyes.eu/





[gentoo-dev] UTF-8 locale by default

2012-07-19 Thread Sascha Cunz
I recently discovered that, for some reason, I had overlooked the warning about
setting the locale to UTF-8 in the Gentoo handbook, apparently for several
years; thus I was still running all my systems in a POSIX locale, since I never
cared much about it.

However, since I noticed, I have talked to several people about it; all of them
stated as their first response: "Not shipping with a UTF-8 locale turned on by
default nowadays is probably a bug in your distro."

While thinking about this and recognizing that recent distributions do indeed
ship with some UTF-8 locale by default, I tend to agree with that statement.

Although Google brings up a lot of good documentation about how to change the
locale, I couldn't find anything that explains why stage3 is still delivered
with the POSIX locale set.

Is there a reason for not using at least en_US.UTF-8 as a sane default 
value?

BR,
SaCu



Re: [gentoo-dev] UTF-8 locale by default

2012-07-19 Thread Chí-Thanh Christopher Nguyễn
Sascha Cunz schrieb:
 Is there a reason for not using at least en_US.UTF-8 as a sane default 
 value?

This was already discussed some time ago. Setting LANG=en_US.UTF-8
would mess with collation rules, measurement and paper units, etc.,
which has the potential to make users outside the USA unhappy.

It might make sense to set LC_CTYPE=en_US.UTF8 but even so,
transliteration may give you unexpected results.

To illustrate this, try running

echo äå | LC_CTYPE=en_US.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8
echo äå | LC_CTYPE=da_DK.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8
echo äå | LC_CTYPE=de_DE.UTF-8 iconv -t ASCII//TRANSLIT -f UTF-8

and compare the output.
For the previous discussion, see this thread:
http://archives.gentoo.org/gentoo-dev/msg_2ffb7ea72e6209439600c371f6fc071d.xml
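
A side note on spelling, since this thread writes the codeset as UTF-8, utf8
and UTF8: as far as I know, glibc compares codeset names after lowercasing
them and dropping punctuation, so these all name the same locale. Treat that
as an assumption about glibc rather than something POSIX guarantees. The
sketch below mimics that normalization; locale_codeset is a hypothetical
helper for illustration only.

```shell
# Hypothetical helper: print the codeset part of a locale name, lowercased
# and with punctuation removed. Prints nothing if no codeset is given.
locale_codeset() {
    case $1 in
        *.*) ;;
        *) return 0 ;;          # e.g. "POSIX" or "en_US": no codeset part
    esac
    codeset=${1#*.}             # drop "language_TERRITORY."
    codeset=${codeset%%@*}      # drop "@modifier" if present
    printf '%s\n' "$codeset" |
        LC_ALL=C tr '[:upper:]' '[:lower:]' |
        LC_ALL=C tr -cd '[:alnum:]\n'
}

locale_codeset en_US.UTF-8    # utf8
locale_codeset en_US.utf8     # utf8
locale_codeset en_US.UTF8     # utf8
```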


Best regards,
Chí-Thanh Christopher Nguyễn



Re: [gentoo-dev] UTF-8 locale by default

2012-07-19 Thread Ulrich Mueller
 On Thu, 19 Jul 2012, Sascha Cunz wrote:

 Is there a reason for not using at least en_US.UTF-8 as a sane
 default value?

Because there's no one-size-fits-all locale, but it is specific to
every system so the user must configure it?

The matter was recently discussed in this mailing list [1] and also in
the March 2012 council meeting [2], and as a result the docs team has
amended the respective section [3] of the handbook.

Ulrich

[1] 
http://archives.gentoo.org/gentoo-dev/msg_2ffb7ea72e6209439600c371f6fc071d.xml
[2] http://www.gentoo.org/proj/en/council/meeting-logs/20120313.txt
[3] http://www.gentoo.org/doc/en/handbook/handbook-x86.xml?part=1&chap=8