[issue6203] 3.x locale does not default to C, contrary to the documentation and to 2.x behavior

2012-06-05 Thread STINNER Victor

STINNER Victor  added the comment:

> Either the code is incorrect in 3.1
> or the documentation should be updated.

Leaving LC_CTYPE unchanged (that is, using the "C" locale, which is ASCII
in most cases) at Python startup would be a major change in Python 3. I
don't want to change this. You would see a lot of mojibake in your GUIs and
get a lot of ugly surrogate characters in filenames (because of PEP 383) if
we no longer set LC_CTYPE to the user's preferred encoding at startup.

Setting LC_CTYPE to the user's preferred encoding is just very
convenient and helps Python to speak to the user through the console,
to the filesystem, to pass arguments on the command line of a
subprocess, etc. For example, you cannot pass non-ASCII characters
typed by the user in your GUI to a subprocess if your current
LC_CTYPE locale is C (ASCII): you get a Unicode encode error.
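
A minimal Python sketch of the failure mode described above. It assumes that the C locale corresponds to the ASCII codec, as the message says is true "in most cases"; `encode_argument` and `try_encode` are hypothetical helpers and do not reproduce what subprocess does internally.

```python
def encode_argument(text, encoding):
    """Encode a command-line argument with the given locale encoding.

    Hypothetical helper for illustration only.
    """
    return text.encode(encoding)


def try_encode(text, encoding):
    """Return the encoded bytes, or the exception if encoding fails."""
    try:
        return encode_argument(text, encoding)
    except UnicodeEncodeError as exc:
        return exc


# "ascii" stands in for the C locale, "utf-8" for a UTF-8 user locale.
ascii_result = try_encode("caf\u00e9", "ascii")
utf8_result = try_encode("caf\u00e9", "utf-8")
```

With the ASCII stand-in the call yields a UnicodeEncodeError, while the UTF-8 stand-in encodes the same text without complaint.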

So it's just a documentation issue: see my attached patch.

--
keywords: +patch
Added file: http://bugs.python.org/file25830/locale_doc.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com




2011-11-06 Thread Petri Lehtinen

Petri Lehtinen  added the comment:

If the thread safety of setlocale() is a problem, does anybody know how 
portable uselocale() is? It sets the locale of the current thread only, so it's 
safe to temporarily change the locale and then set it back.

--
nosy: +petri.lehtinen


2011-08-10 Thread R. David Murray

R. David Murray  added the comment:

Yes, a new issue would be more appropriate.

--


2011-08-10 Thread Alexis Metaireau

Alexis Metaireau  added the comment:

I see two different things here:

1) the fact that getlocale() doesn't return (None, None) on some Python
versions
2) the fact that having it return (None, None) by default is a bit
misleading, as users may think that getlocale() is tied to environment
variables. That is what was at the origin of #12699

My last remark is about the second point. Maybe I should start a new issue
for this?

--
title: 3.x locale does not default to C,contrary to the documentation 
and to 2.x behavior -> 3.x locale does not default to C, contrary to the 
documentation and to 2.x behavior


2011-08-09 Thread R. David Murray

R. David Murray  added the comment:

This issue is about the fact that it doesn't return (None, None).  We should 
probably decide what we are going to do about that before changing the docs if 
they need it.

--


2011-08-09 Thread Alexis Metaireau

Alexis Metaireau  added the comment:

Maybe it would be useful to specify in the documentation that getlocale() is
not intended to be used to find out what the locale of the system is?

This is not explained currently, and so it's a bit weird to have getlocale()
return (None, None) even if you have your locales set.

--
nosy: +alexis


2011-01-29 Thread Steffen Daode Nurpmeso

Steffen Daode Nurpmeso  added the comment:

User lemburg pointed me to this, but no, I've posted msg127416 to Issue 11022.

--
nosy: +sdaoden


2011-01-28 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

More likely, it's my email reader. Sorry about that.

--


2011-01-28 Thread Arfrever Frehtes Taifersar Arahesis

Arfrever Frehtes Taifersar Arahesis  added the comment:

Martin v. Löwis:
It seems that your web browser replaces ", " with ",\t" in the title (where 
"\t" is a tab character) each time you add a comment.

--


2011-01-28 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

>> That would be highly non-portable, and repeat the mistakes of
>> getdefaultlocale.
> 
> You say that often, but I don't really know why. It's certainly portable
> between various Unix platforms, perhaps not Windows, but then i18n
> on Windows is a different story altogether.

No, it's absolutely not portable across Unix platforms. Looking at
LANG or LC_ALL does *not* allow you to infer the region name, or
the locale's character set. For example, using glibc, in some
installations, /etc/locale.alias is considered to map a value of LANG
to the final locale name. As an option, glibc also considers a
LOCALE_ALIAS_PATH that may point to a (colon-separated) path of
files to search for locale aliases.

Other systems may use other databases to map a locale name to locale
properties.

Unless you know exactly what version of C library is running on
a system, parsing environment variables yourself is doomed to fail.
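
To make the objection concrete, here is a small sketch of what "parsing environment variables yourself" tends to look like. `naive_parse_lang` is a hypothetical helper, not code from Python or glibc, and the alias-style value `german` is only an example of a name that has to be resolved through a system database.

```python
def naive_parse_lang(value):
    """Split a LANG-style value into (language_region, codeset).

    Hypothetical helper: it only understands the "ll_CC.codeset" spelling
    and knows nothing about alias files or other system databases.
    """
    name, _, rest = value.partition(".")
    codeset = rest.partition("@")[0] or None  # drop any "@modifier" suffix
    return name, codeset


full = naive_parse_lang("en_US.UTF-8")
alias = naive_parse_lang("german")  # alias-style name: no codeset in the string
```

For the fully spelled-out value the split works, but for the alias the string itself carries neither a region nor a character set, so the parser has nothing to return without consulting the C library's own tables.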

--
title: 3.x locale does not default to C, contrary to the documentation and to 
2.x behavior -> 3.x locale does not default to C, contrary to the documentation 
and to 2.x behavior


2011-01-28 Thread Arfrever Frehtes Taifersar Arahesis

Changes by Arfrever Frehtes Taifersar Arahesis :


--
title: 3.x locale does not default to C,contrary to the documentation 
and to 2.x behavior -> 3.x locale does not default to C, contrary to the 
documentation and to 2.x behavior


2011-01-28 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Martin v. Löwis wrote:
> 
> Martin v. Löwis  added the comment:
> 
>> A clean alternative would be adding LC_* variable parsing code to
>> Python to avoid the setlocale() call altogether.
> 
> That would be highly non-portable, and repeat the mistakes of
> getdefaultlocale.

You say that often, but I don't really know why. It's certainly portable
between various Unix platforms, perhaps not Windows, but then i18n
on Windows is a different story altogether.

BTW: For Windows, you can adjust setlocale() to work thread-based
using: _configthreadlocale()
(http://msdn.microsoft.com/de-de/library/26c0tb7x(v=vs.80).aspx)

Perhaps we ought to expose this in _locale and use it in
getdefaultlocale() on Windows to query the locale settings
via the pseudocode I posted.

--


2011-01-28 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

> A clean alternative would be adding LC_* variable parsing code to
> Python to avoid the setlocale() call altogether.

That would be highly non-portable, and repeat the mistakes of
getdefaultlocale.

--
title: 3.x locale does not default to C, contrary to the documentation and to 
2.x behavior -> 3.x locale does not default to C, contrary to the documentation 
and to 2.x behavior


2011-01-28 Thread Marc-Andre Lemburg

Marc-Andre Lemburg  added the comment:

Python can be embedded into other applications and unconditionally
changing the locale (esp. the LC_CTYPE) is not good practice, since
it's not thread-safe and affects the entire process. An application
may have set LC_CTYPE (or the locale) to something completely
different.

If at all, Python should be more careful using this call (pseudo
code):

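/* Needs <locale.h>, <stdlib.h>, <string.h>. Copy each result: a later
   setlocale() call may overwrite the string returned by an earlier one. */
char *lc_ctype = setlocale(LC_CTYPE, NULL);
lc_ctype = lc_ctype ? strdup(lc_ctype) : NULL;
if (lc_ctype == NULL || strcmp(lc_ctype, "") == 0 || strcmp(lc_ctype, "C") == 0) {
    char *env_lc_ctype = setlocale(LC_CTYPE, "");  /* read the environment */
    env_lc_ctype = env_lc_ctype ? strdup(env_lc_ctype) : NULL;
    if (lc_ctype != NULL)
        setlocale(LC_CTYPE, lc_ctype);             /* restore the original */
    free(lc_ctype);
    lc_ctype = env_lc_ctype;
}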

Then use lc_ctype to figure out encodings, etc.

While this is not thread-safe, it at least reverts the change back
to the original setting and only applies the change if needed. That's
still not optimal, but better than nothing.
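
The save, change, and restore pattern described above can also be sketched at the Python level. This is only an illustration under stated assumptions: `temporary_ctype` is a hypothetical helper, it is no more thread-safe than the C version, and the usage below switches to "C" simply because that locale is always available.

```python
import contextlib
import locale


@contextlib.contextmanager
def temporary_ctype(name=""):
    """Temporarily switch LC_CTYPE, then restore the previous setting.

    Hypothetical helper. Passing "" asks for the locale named by the
    environment; the change still affects the whole process while active.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # query only, no change
    try:
        yield locale.setlocale(locale.LC_CTYPE, name)
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)


before = locale.setlocale(locale.LC_CTYPE)
with temporary_ctype("C") as active:
    inside = active
after = locale.setlocale(locale.LC_CTYPE)
```

Calling `locale.setlocale()` with only a category queries the current setting, which is what makes the restore step possible.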

A clean alternative would be adding LC_* variable parsing code to
Python to avoid the setlocale() call altogether.

--
nosy: +lemburg


2011-01-27 Thread Arfrever Frehtes Taifersar Arahesis

Changes by Arfrever Frehtes Taifersar Arahesis :


--
nosy: +Arfrever


2011-01-27 Thread STINNER Victor

STINNER Victor  added the comment:

> To add a little bit more analysis: posix.device_encoding requires that
> the LC_CTYPE is set. Setting it just in this function would not be
> possible, as setlocale is not thread-safe.

open() indirectly (through locale.getpreferredencoding()) changes the locale
temporarily (it sets LC_CTYPE to "") if the file is not a TTY. If it is a TTY,
device_encoding() calls nl_langinfo(CODESET) without changing the current
locale. If setlocale() is not thread-safe, we (maybe?) have a problem here. See
also #11022: a report from a user who did not understand why setlocale() doesn't
affect the encoding used by open() (TextIOWrapper). A quick solution is to call
locale.getpreferredencoding(False), which doesn't change the locale.
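
A short sketch of the "quick solution" mentioned above, checking that the call leaves LC_CTYPE alone. It only uses `locale.setlocale()` in its query form (category argument only) to observe the setting before and after; the variable names are illustrative.

```python
import locale

# Query the current LC_CTYPE setting without modifying it.
ctype_before = locale.setlocale(locale.LC_CTYPE)

# Passing False asks for the encoding of the locale that is currently
# active, without calling setlocale(LC_CTYPE, "") first.
encoding = locale.getpreferredencoding(False)

ctype_after = locale.setlocale(locale.LC_CTYPE)
```

The value of `encoding` depends on the platform and on the active locale, so no particular result is shown here.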

Do you really need os.device_encoding()? If we change TextIOWrapper to call
locale.getpreferredencoding(False), os.device_encoding() and
locale.getpreferredencoding(False) will give the same result, except on
Windows: there os.device_encoding() uses GetConsoleCP() if fd==0 and
GetConsoleOutputCP() if fd in (1, 2). But we can use GetConsoleCP() and
GetConsoleOutputCP() directly in initstdio(). If someone closes sys.std* and
recreates them later, os.device_encoding() can be used explicitly to keep the
previous behaviour.
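
For reference, a minimal sketch of calling os.device_encoding() explicitly. It relies on the documented behaviour that the function returns None when the descriptor is not connected to a terminal; the pipe is used only as a convenient non-terminal descriptor.

```python
import os

read_fd, write_fd = os.pipe()
try:
    # A pipe is not a terminal, so no device encoding is reported.
    pipe_encoding = os.device_encoding(read_fd)
finally:
    os.close(read_fd)
    os.close(write_fd)

# For standard output this is an encoding name when attached to a
# terminal, and None otherwise (for example when output is redirected).
stdout_encoding = os.device_encoding(1)
```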

> It would still be better if it was unset afterwards. Third-party
> extensions could have LC_CTYPE-dependent behaviour.

If Python is embedded, it should not change the locale. Even if it is not
embedded, it may be better never to set LC_CTYPE.

It is too late to touch such a critical point in Python 3.2, but we may change
it in Python 3.3.

--
nosy: +haypo
versions: +Python 3.3 -Python 3.2


2009-12-29 Thread R. David Murray

Changes by R. David Murray :


--
versions: +Python 3.2 -Python 3.1


2009-06-09 Thread Antoine Pitrou

Changes by Antoine Pitrou :


--
assignee:  -> georg.brandl


2009-06-09 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

To add a little bit more analysis: posix.device_encoding requires that
the LC_CTYPE is set. Setting it just in this function would not be
possible, as setlocale is not thread-safe.

So for 3.1, it seems that Python must set LC_CTYPE. If somebody can
propose a patch that avoids that for 3.2, I'd be certainly in favor.

--
assignee: loewis -> 


2009-06-08 Thread R. David Murray

R. David Murray  added the comment:

Since it controls what is considered to be whitespace, it is possible
this will lead to subtle bugs, but I agree that it seems relatively
benign, especially considering 3.x's unicode orientation.  So, this
becomes a doc bug...

--
components:  -Library (Lib)
priority: release blocker -> high


2009-06-08 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

> In principle, they could, yes - but what specific behavior might that
> be? What will change is character classification, which I consider
> fairly harmless. Also, multi-byte conversion routines will change, which
> is the primary reason for leaving it modified.

Ok, so I suppose we could leave the code as-is.

--
title: 3.x locale does not default to C,contrary to the documentation 
and to 2.x behavior -> 3.x locale does not default to C, contrary to the 
documentation and to 2.x behavior


2009-06-08 Thread Martin v . Löwis

Martin v. Löwis  added the comment:

> It would still be better if it was unset afterwards. Third-party
> extensions could have LC_CTYPE-dependent behaviour.

In principle, they could, yes - but what specific behavior might that
be? What will change is character classification, which I consider
fairly harmless. Also, multi-byte conversion routines will change, which
is the primary reason for leaving it modified.

--
title: 3.x locale does not default to C, contrary to the documentation and to 
2.x behavior -> 3.x locale does not default to C, contrary to the documentation 
and to 2.x behavior


2009-06-08 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

It would still be better if it was unset afterwards. Third-party
extensions could have LC_CTYPE-dependent behaviour.

--


2009-06-08 Thread R. David Murray

R. David Murray  added the comment:

Ah, I can tell you exactly why that is, then.  I noticed this in
pythonrun.c while grepping the source:

#ifdef HAVE_SETLOCALE
/* Set up the LC_CTYPE locale, so we can obtain
   the locale's charset without having to switch
   locales. */
setlocale(LC_CTYPE, "");
#endif

SVN blames Martin in r56922, so this case is assigned appropriately. 
Perhaps changing only LC_CTYPE is safe?  I must admit to ignorance as to
what all the LC variables mean/control.

--


2009-06-08 Thread Antoine Pitrou

Antoine Pitrou  added the comment:

For some reason only LC_CTYPE is affected:

>>> locale.getlocale(locale.LC_CTYPE)
('fr_FR', 'UTF8')
>>> locale.getlocale(locale.LC_MESSAGES)
(None, None)
>>> locale.getlocale(locale.LC_TIME)
(None, None)
>>> locale.getlocale(locale.LC_NUMERIC)
(None, None)
>>> locale.getlocale(locale.LC_COLLATE)
(None, None)
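
The same per-category check can be scripted. This sketch uses the query form of `locale.setlocale()` (category argument only), which returns the raw setting string rather than the parsed tuple shown above; `current_settings` and `CATEGORY_NAMES` are hypothetical names, and the output depends on the interpreter version and environment.

```python
import locale

CATEGORY_NAMES = ["LC_CTYPE", "LC_MESSAGES", "LC_TIME", "LC_NUMERIC", "LC_COLLATE"]


def current_settings():
    """Return {category name: current setting string} without changing anything."""
    settings = {}
    for name in CATEGORY_NAMES:
        # LC_MESSAGES is not defined on every platform, so look it up defensively.
        category = getattr(locale, name, None)
        if category is not None:
            settings[name] = locale.setlocale(category)  # query only
    return settings


settings = current_settings()
for name, value in settings.items():
    print(name, value)
```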

--
nosy: +pitrou


2009-06-08 Thread R. David Murray

R. David Murray  added the comment:

This is definitely a bug in 3.1, for the same reason that a C program
uses the C locale until an explicit setlocale is done: otherwise, a
non-locale-aware program can run into bugs resulting from locale issues
when run under a different locale than that of the program author.

I have a memory of this being reported before somewhere and someone
tracking it down to a change in Python initialization, but I can't find
a bug report and my google-fu is failing me.

--
nosy: +r.david.murray
priority: normal -> release blocker
stage:  -> needs patch


2009-06-08 Thread Georg Brandl

Georg Brandl  added the comment:

Deferring to Martin which one is correct :)

--
assignee: georg.brandl -> loewis
nosy: +loewis


2009-06-06 Thread Ezio Melotti

Ezio Melotti  added the comment:

Confirmed for 3.1, 3.0 still returns (None, None).

--
components: +Library (Lib)
nosy: +ezio.melotti
priority:  -> normal


2009-06-05 Thread Ned Deily

New submission from Ned Deily :

In the Library Reference section 22.2.1 for locale, it states:

"Initially, when a program is started, the locale is the C locale, no 
matter what the user’s preferred locale is. The program must explicitly 
say that it wants the user’s preferred locale settings by calling 
setlocale(LC_ALL, '')."

This is the case for python2.x:

$ export LANG=en_US.UTF-8
$ python2.5
Python 2.5.4 (r254:67916, Feb 17 2009, 20:16:45) 
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale; locale.getlocale()
(None, None)
>>> locale.getdefaultlocale()
('en_US', 'UTF8')
>>> 

but not for 3.1:
$ python3.1
Python 3.1a1+ (py3k, Mar 23 2009, 00:12:12) 
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale; locale.getlocale()
('en_US', 'UTF8')
>>> locale.getdefaultlocale()
('en_US', 'UTF8')
>>> 

Either the code is incorrect in 3.1 or the documentation should be 
updated.

--
assignee: georg.brandl
components: Documentation
messages: 88932
nosy: georg.brandl, nad
severity: normal
status: open
title: 3.x locale does not default to C, contrary to the documentation and to 
2.x behavior
type: behavior
versions: Python 3.1
