[issue23425] Windows getlocale unix-like with french, german, portuguese, spanish
albertjan added the comment:

Hi,

Thanks for your replies. Eryksun (nice to meet you here too!), your function seems very useful, thank you very much. I had indeed already switched to your 'getrawlocale' approach.

Perhaps off-topic (because I have never seen this happen on Windows), but locale.getlocale() sometimes returns (None, None), *even if* locale.setlocale(locale.LC_ALL, "") has been called at the start of the program. For some reason, LANG, LC_ALL and possibly other variables are sometimes not set correctly (I know this is not Python's fault, but...). Would it be a good idea to have a 'failsafe' parameter in getlocale? Something like:

    def safe_getlocale(failsafe=False):
        current_locale = locale.getlocale()
        if failsafe and current_locale[0] is None and not sys.platform.startswith("win"):
            os.environ["LANG"] = "en_US.UTF-8"
            os.environ["LC_ALL"] = "en_US.UTF-8"
            current_locale = locale.getlocale()
        return current_locale

(sorry for squeezing this into the current issue!)

Albert-Jan

--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23425
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
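A variant of the failsafe idea, as a rough sketch only. It assumes that the environment variables are read by the C runtime only when setlocale() is called with an empty string, so the retry calls setlocale() again before asking getlocale(). The `fallback` parameter is hypothetical and not part of the locale module, and the function also swallows the ValueError that getlocale() is reported to raise in issue 20999.

```python
import locale
import os
import sys


def safe_getlocale(failsafe=False, fallback="en_US.UTF-8"):
    """Return locale.getlocale(), optionally retrying with a fallback locale.

    `fallback` is a hypothetical parameter used for illustration only.
    """
    try:
        current = locale.getlocale()
    except ValueError:  # e.g. "unknown locale: UTF-8"
        current = (None, None)

    if failsafe and current[0] is None and not sys.platform.startswith("win"):
        os.environ["LANG"] = fallback
        os.environ["LC_ALL"] = fallback
        try:
            # The environment is only consulted when setlocale() is
            # called with an empty locale string.
            locale.setlocale(locale.LC_ALL, "")
            current = locale.getlocale()
        except (locale.Error, ValueError):
            pass  # fallback locale unavailable or unparseable; keep old value
    return current
```

Note that this changes the process-wide locale and environment as a side effect, which may or may not be acceptable inside a library.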
[issue23425] Windows getlocale unix-like with french, german, portuguese, spanish
albertjan added the comment:

I agree that the two issues are related, but I don't see how they could be duplicates. But maybe that's because I do not know the underlying code.

Issue 10466 is mostly about getdefaultlocale() and whether or not it is desirable that its return value is always unix-esque, including on Windows. The failed call to locale.py*) as a script would demonstrate that the getdefaultlocale() return value ought to be platform-specific and ready for consumption by setlocale(). That's how I read that issue. I personally find it useful that getdefaultlocale() returns a nice, harmonized locale string.

With getlocale on Windows, however, the return value is sometimes unix-like, sometimes Windows-specific. Until a couple of days ago I thought getlocale was entirely platform-specific. Why should locale.setlocale(locale.LC_ALL, ".".join(locale.getlocale())) succeed on my Dutch Windows system, but fail on my neighbour's German Windows system?

In my humble opinion:
- setlocale should return nothing. It's a setter.
- getlocale should return a platform-specific locale specification, probably what is currently returned by setlocale. The output should be ready for consumption by setlocale.
- getdefaultlocale should ALWAYS return a harmonized/unix-like locale specification. On Unix, but not on Windows, it could be used as an argument for setlocale.

My two cents.

Best wishes,
Albert-Jan

*) which also fails on Python 2.7 and 3.4 on my Dutch Windows 7 64, btw.
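The round-trip question raised above (can the getlocale() result be fed back into setlocale()?) can be checked with a small helper. This is only a sketch: `roundtrips` is a made-up name, it tests a single category rather than LC_ALL, and it restores the original setting afterwards using the raw string that setlocale(category, None) reports.

```python
import locale


def roundtrips(category=locale.LC_CTYPE):
    """Return True if getlocale()'s result is accepted by setlocale()."""
    raw = locale.setlocale(category, None)  # query only, changes nothing
    try:
        lang, enc = locale.getlocale(category)
    except ValueError:
        return False  # getlocale() could not parse the current setting
    if lang is None:
        return False  # nothing to round-trip (e.g. the "C" locale)
    name = lang if enc is None else lang + "." + enc
    try:
        locale.setlocale(category, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(category, raw)  # put the original setting back
```

On the systems described in this issue one would expect this to return False after setlocale(LC_ALL, "german") on Windows, but that expectation comes from the report above and has not been verified here.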
[issue23425] Windows getlocale unix-like with french, german, portuguese, spanish
New submission from albertjan:

getlocale() is supposed to (?) return a locale two-tuple in a platform-specific notation. However, in *Windows* 7 64, with Python 3.4, 3.3 and 2.7, a *unix-like*, abbreviated, lang_territory notation is used for french, german, portuguese, spanish. In other words: in these four cases, the output of setlocale is not equal to ".".join(locale.getlocale())

## Code that demonstrates the differences

    from __future__ import print_function
    import locale
    import collections
    import pprint

    languages = ("chinese czech danish dutch english finnish french german greek "
                 "hungarian icelandic italian japanese korean norwegian polish "
                 "portuguese russian slovak spanish swedish turkish")
    d = collections.defaultdict(list)
    t = collections.namedtuple("Locale", "lang setlocale getlocale")
    for language in languages.split():
        sloc = locale.setlocale(locale.LC_ALL, language)
        gloc = locale.getlocale()
        record = t(language, sloc, gloc)
        if gloc[0][2] == "_":
            d["unix-like"].append(record)
        else:
            d["windows-like"].append(record)
    pprint.pprint(dict(d))

## output

    n:\>C:\Miniconda3\python.exe N:\temp\loc.py
    {'unix-like': [
         Locale(lang='french', setlocale='French_France.1252', getlocale=('fr_FR', 'cp1252')),
         Locale(lang='german', setlocale='German_Germany.1252', getlocale=('de_DE', 'cp1252')),
         Locale(lang='portuguese', setlocale='Portuguese_Brazil.1252', getlocale=('pt_BR', 'cp1252')),
         Locale(lang='spanish', setlocale='Spanish_Spain.1252', getlocale=('es_ES', 'cp1252'))],
     'windows-like': [
         Locale(lang='chinese', setlocale="Chinese (Simplified)_People's Republic of China.936", getlocale=("Chinese (Simplified)_People's Republic of China", '936')),
         Locale(lang='czech', setlocale='Czech_Czech Republic.1250', getlocale=('Czech_Czech Republic', '1250')),
         Locale(lang='danish', setlocale='Danish_Denmark.1252', getlocale=('Danish_Denmark', '1252')),
         Locale(lang='dutch', setlocale='Dutch_Netherlands.1252', getlocale=('Dutch_Netherlands', '1252')),
         Locale(lang='english', setlocale='English_United States.1252', getlocale=('English_United States', '1252')),
         Locale(lang='finnish', setlocale='Finnish_Finland.1252', getlocale=('Finnish_Finland', '1252')),
         Locale(lang='greek', setlocale='Greek_Greece.1253', getlocale=('Greek_Greece', '1253')),
         Locale(lang='hungarian', setlocale='Hungarian_Hungary.1250', getlocale=('Hungarian_Hungary', '1250')),
         Locale(lang='icelandic', setlocale='Icelandic_Iceland.1252', getlocale=('Icelandic_Iceland', '1252')),
         Locale(lang='italian', setlocale='Italian_Italy.1252', getlocale=('Italian_Italy', '1252')),
         Locale(lang='japanese', setlocale='Japanese_Japan.932', getlocale=('Japanese_Japan', '932')),
         Locale(lang='korean', setlocale='Korean_Korea.949', getlocale=('Korean_Korea', '949')),
         Locale(lang='norwegian', setlocale='Norwegian (Bokmål)_Norway.1252', getlocale=('Norwegian (Bokmål)_Norway', '1252')),
         Locale(lang='polish', setlocale='Polish_Poland.1250', getlocale=('Polish_Poland', '1250')),
         Locale(lang='russian', setlocale='Russian_Russia.1251', getlocale=('Russian_Russia', '1251')),
         Locale(lang='slovak', setlocale='Slovak_Slovakia.1250', getlocale=('Slovak_Slovakia', '1250')),
         Locale(lang='swedish', setlocale='Swedish_Sweden.1252', getlocale=('Swedish_Sweden', '1252')),
         Locale(lang='turkish', setlocale='Turkish_Turkey.1254', getlocale=('Turkish_Turkey', '1254'))]}

--
components: Library (Lib)
messages: 235630
nosy: fo...@yahoo.com
priority: normal
severity: normal
status: open
title: Windows getlocale unix-like with french, german, portuguese, spanish
type: behavior
versions: Python 2.7, Python 3.3, Python 3.4
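The demonstration script separates the two notations by looking at the third character of the language code. A slightly stricter check, offered purely as an illustration, matches the whole code against a lang_TERRITORY pattern and also copes with None. The name `looks_posix` and the pattern are assumptions made for this sketch; the pattern is deliberately narrow and does not cover every valid POSIX locale name.

```python
import re

# Two or three lowercase letters, underscore, two uppercase letters,
# e.g. "fr_FR" or "pt_BR". Deliberately narrow; an illustrative assumption.
_POSIX_NAME = re.compile(r"^[a-z]{2,3}_[A-Z]{2}$")


def looks_posix(language_code):
    """Return True if language_code looks like 'fr_FR' rather than 'French_France'."""
    return language_code is not None and bool(_POSIX_NAME.match(language_code))


for code in ("fr_FR", "French_France", "Dutch_Netherlands", None):
    print(code, looks_posix(code))
```

Unlike the index-based test, this does not raise when getlocale() returns (None, None).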
[issue20999] setlocale, getlocale succession -- ValueError or (None, None)
albertjan added the comment:

Ok, I know this is closed as a duplicate, but I am pasting some additional info here for reference. All info is about the FIRST system of the original message.

## The locale settings

    fomcls-Mac-Pro:Desktop fomcl$ locale
    LANG=
    LC_COLLATE=C
    LC_CTYPE=UTF-8
    LC_MESSAGES=C
    LC_MONETARY=C
    LC_NUMERIC=C
    LC_TIME=C
    LC_ALL=

## export LANG=en_US.UTF-8 does not fix it.

    fomcls-Mac-Pro:Desktop fomcl$ export LANG=en_US.UTF-8
    fomcl-Mac-Pro:Desktop fomcl$ python
    Python 2.7.2 (default, Jun 20 2012, 16:23:33)
    [GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
    >>> import locale
    >>> locale.getdefaultlocale()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/locale.py", line 496, in getdefaultlocale
        return _parse_localename(localename)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/locale.py", line 428, in _parse_localename
        raise ValueError, 'unknown locale: %s' % localename
    ValueError: unknown locale: UTF-8
    >>> locale.setlocale(locale.LC_ALL, "")
    'en_US.UTF-8/UTF-8/en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/en_US.UTF-8'
    >>> locale.getlocale()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/locale.py", line 515, in getlocale
        return _parse_localename(localename)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/locale.py", line 428, in _parse_localename
        raise ValueError, 'unknown locale: %s' % localename
    ValueError: unknown locale: UTF-8

## export LC_ALL=en_US.UTF-8 makes it all work as expected/desired.

    fomcls-Mac-Pro:Desktop fomcl$ export LC_ALL=en_US.UTF-8
    fomcls-Mac-Pro:Desktop fomcl$ python
    Python 2.7.2 (default, Jun 20 2012, 16:23:33)
    [GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
    >>> import locale
    >>> locale.getdefaultlocale()
    ('en_US', 'UTF-8')
    >>> locale.getlocale()
    (None, None)
    >>> locale.setlocale(locale.LC_ALL, "")
    'en_US.UTF-8'
    >>> locale.getlocale()
    ('en_US', 'UTF-8')

--
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20999
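When diagnosing a setup like the one above, it can help to see, per category, both the raw string the C runtime reports and what getlocale() makes of it, without the first ValueError aborting the session. The following is a sketch under the assumption that querying with setlocale(category, None) is side-effect free; `locale_report` and `CATEGORIES` are names invented for this example.

```python
import locale

# LC_MESSAGES is left out because it is not available on every platform.
CATEGORIES = ("LC_COLLATE", "LC_CTYPE", "LC_MONETARY", "LC_NUMERIC", "LC_TIME")


def locale_report():
    """Map each category name to (raw setting, parsed tuple or error text)."""
    report = {}
    for name in CATEGORIES:
        category = getattr(locale, name)
        raw = locale.setlocale(category, None)  # query only
        try:
            parsed = locale.getlocale(category)
        except ValueError as exc:  # e.g. "unknown locale: UTF-8"
            parsed = "unparseable: %s" % exc
        report[name] = (raw, parsed)
    return report


for name, (raw, parsed) in sorted(locale_report().items()):
    print(name, repr(raw), parsed)
```

On the first system in this report one would presumably see LC_CTYPE flagged as unparseable while the other categories parse, which matches the 'C/UTF-8/C/C/C/C' string shown in the original submission.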
[issue20999] setlocale, getlocale succession -- ValueError or (None, None)
New submission from albertjan:

See also issue #18378.

# Result applies to Python 2.7.2 and Python 3.3.4
# Mac OS X Mountain Lion 10.9.1 on Virtualbox with a Linux Debian AMD-64 host

    fomcls-Mac-Pro:~ fomcl$ uname -a
    Darwin fomcls-Mac-Pro.local 12.2.0 Darwin Kernel Version 12.2.0: Sat Aug 25 00:48:52 PDT 2012; root:xnu-2050.18.24~1/RELEASE_X86_64 x86_6
    >>> import locale
    >>> locale.setlocale(locale.LC_ALL, "")
    'C/UTF-8/C/C/C/C'
    >>> locale.getlocale()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/locale.py", line 515, in getlocale
        return _parse_localename(localename)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/locale.py", line 428, in _parse_localename
        raise ValueError, 'unknown locale: %s' % localename
    ValueError: unknown locale: UTF-8

# below another configuration (no hackintosh)

conda 2.7:

    Python 2.7.6 |Continuum Analytics, Inc.| (default, Jan 10 2014, 11:23:15)
    [GCC 4.0.1 (Apple Inc. build 5493)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import locale
    >>> locale.setlocale(locale.LC_ALL, "")
    'C'
    >>> locale.getlocale()
    (None, None)

conda 3.3:

    Python 3.3.5 |Continuum Analytics, Inc.| (default, Mar 10 2014, 11:22:25)
    [GCC 4.0.1 (Apple Inc. build 5493)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import locale
    >>> locale.setlocale(locale.LC_ALL, "")
    'C'
    >>> locale.getlocale()
    (None, None)

Regular 2.7:

    Python 2.7.5 (default, Aug 25 2013, 00:04:04)
    [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import locale
    >>> locale.setlocale(locale.LC_ALL, "")
    'C'
    >>> locale.getlocale()
    (None, None)

Regular 3.3 (broken installation??):

    Python 3.3.2 (v3.3.2:d047928ae3f6, May 13 2013, 13:52:24)
    [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import locale
    >>> locale.setlocale(locale.LC_ALL, "")
    Segmentation fault: 11

### finally, the expected result (on Linux)

    antonia@antonia-HP-2133 ~ $ uname -a
    Linux antonia-HP-2133 3.5.0-17-generic #28-Ubuntu SMP Tue Oct 9 19:32:08 UTC 2012 i686 i686 i686 GNU/Linux
    Python 2.7.3 (default, Feb 27 2014, 19:39:10)
    [GCC 4.7.2] on linux2
    >>> import locale
    >>> locale.setlocale(locale.LC_ALL, "")
    'en_US.UTF-8'
    >>> locale.getlocale()
    ('en_US', 'UTF-8')

--
assignee: ronaldoussoren
components: Macintosh
messages: 214260
nosy: albertjan, ronaldoussoren
priority: normal
severity: normal
status: open
title: setlocale, getlocale succession -- ValueError or (None, None)
type: behavior
versions: Python 2.7, Python 3.3
[issue17254] add thai encoding aliases to encodings.aliases
New submission from albertjan:

This is almost identical to: http://bugs.python.org/issue854511

However, tis602, which is mentioned in the original bug report, is not an alias to cp874. Therefore, I propose the following:

    import encodings
    aliases = encodings.aliases.aliases
    more_aliases = {'ibm874'     : 'cp874',
                    'iso_8859_11': 'cp874',
                    'iso8859_11' : 'cp874',
                    'windows_874': 'cp874',
                    }
    aliases.update(more_aliases)

--
messages: 182489
nosy: fo...@yahoo.com
priority: normal
severity: normal
status: open
title: add thai encoding aliases to encodings.aliases

Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17254
[issue17254] add thai encoding aliases to encodings.aliases
albertjan added the comment:

Hi,

I found this report that includes your name:
http://mail.python.org/pipermail/python-bugs-list/2004-August/024564.html

Other relevant websites:
http://en.wikipedia.org/wiki/ISO/IEC_8859-11 # is wikipedia 'proof'?
http://code.ohloh.net/file?fid=dhX2dJrRWGISzQAijawMU6qzWJQcid=YD58Y-grdtEs=browser=Default
http://msdn.microsoft.com/en-us/goglobal/cc305142.aspx
http://www.iso.org/iso/catalogue_detail?csnumber=28263 # non-free

Regards,
Albert-Jan

~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~

- Original Message -
From: Marc-Andre Lemburg rep...@bugs.python.org
To: fo...@yahoo.com
Cc:
Sent: Wednesday, February 20, 2013 1:22 PM
Subject: [issue17254] add thai encoding aliases to encodings.aliases

Marc-Andre Lemburg added the comment:

On 20.02.2013 12:48, albertjan wrote:
> New submission from albertjan:
>
> This is almost identical to: http://bugs.python.org/issue854511
> However, tis602, which is mentioned in the orginal bug report, is not an alias to cp874. Therefore, I propose the following:
> import encodings
> aliases = encodings.aliases.aliases
> more_aliases = {'ibm874' : 'cp874',
>                 'iso_8859_11': 'cp874',
>                 'iso8859_11' : 'cp874',
>                 'windows_874': 'cp874',
>                 }
> aliases.update(more_aliases)

Please provide evidence that those encodings are indeed the same.

Thanks,
--
Marc-Andre Lemburg
eGenix.com

--
nosy: +lemburg
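One way to gather the kind of evidence asked for here is to decode every single byte value with both codecs and list where the results disagree. This is a sketch for Python 3 only; `differing_bytes` is a name made up for the example, and it compares the codecs as shipped with the running interpreter rather than any external standard.

```python
def differing_bytes(codec_a, codec_b):
    """Return (byte value, decoded by codec_a, decoded by codec_b) where they differ.

    An undecodable byte is recorded as None.
    """
    diffs = []
    for value in range(256):
        byte = bytes([value])
        decoded = []
        for codec in (codec_a, codec_b):
            try:
                decoded.append(byte.decode(codec))
            except UnicodeDecodeError:
                decoded.append(None)
        if decoded[0] != decoded[1]:
            diffs.append((value, decoded[0], decoded[1]))
    return diffs


diffs = differing_bytes("cp874", "iso8859_11")
for value, a, b in diffs:
    print(hex(value), repr(a), repr(b))
```

A non-empty result means the two codecs cannot simply be aliases of each other; byte 0x80, for instance, decodes to the euro sign under cp874 but to a control character under iso8859_11.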
[issue17254] add thai encoding aliases to encodings.aliases
albertjan added the comment:

Sent: Wednesday, February 20, 2013 4:25 PM
Subject: [issue17254] add thai encoding aliases to encodings.aliases

Thanks. Something is wrong with your request, though:

* we already have an iso8859_11 codec, so aliasing it to some other name is not possible
* we already have a cp874 codec, so aliasing it to some other name is not possible
* cp874 differs from iso8859_11 in a few places, so aliasing cp874 is not possible (see http://en.wikipedia.org/wiki/ISO/IEC_8859-11#Code_page_874)

Sorry about that.

What we could do is add aliases 'x-ibm874' and 'windows_874' to 'cp874'. I'm not sure whether 'ibm874' and 'x-ibm874' are the same thing. The references only mention 'x-ibm874'.

The following documents say that these are aliases: x-IBM874, cp874, ibm874, ibm-874, 874
http://www.java2s.com/Tutorial/Java/0180__File/DisplaysAvailableCharsetsandaliases.htm
http://www.fileformat.info/info/charset/x-IBM874/index.htm

In addition, it seems that 'windows_874' is used (that's the one that raised this issue for me), but I've also seen references to windows-874, windows874 and WIN874:
http://doxygen.postgresql.org/encnames_8c_source.html
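Until such an alias exists in the standard library, an application can register it for itself at start-up. The sketch below adds only 'windows_874', which is the spelling discussed above as a candidate alias for cp874; it is a local workaround shown for illustration, not the change that was eventually made to encodings.aliases. It should run before the name is first looked up, because the encodings package caches lookups, including failed ones.

```python
import codecs
import encodings.aliases

# Keys in the alias table use the normalized form (hyphens become
# underscores). setdefault() leaves an existing entry untouched.
encodings.aliases.aliases.setdefault("windows_874", "cp874")

info = codecs.lookup("windows-874")  # hyphenated spelling is normalized
print(info.name)

thai_ko_kai = "\u0e01"  # THAI CHARACTER KO KAI
print(thai_ko_kai.encode("windows_874"))
```

Mutating the shared alias table affects the whole process, so a library would probably be better off passing "cp874" explicitly.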