[issue10187] exec encode unicode to utf-8 str automatically in GBK environment
Martin v. Löwis mar...@v.loewis.de added the comment:

This is not a bug, but intentional. a is a Unicode string; it does not have an encoding internally (not GBK, not UTF-8). Then, the string being exec'ed also becomes a Unicode string.

exec'ing Unicode strings is confusing; try to avoid this. The semantics of exec'ing a Unicode string are that all str (but not unicode) literals get encoded as UTF-8.

To see the result you expect, write a = 麓贸

--
nosy: +loewis
resolution: -> invalid
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10187
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue10187] exec encode unicode to utf-8 str automatically in GBK environment
Martin v. Löwis mar...@v.loewis.de added the comment: Oops, I meant a = 大
[issue10187] exec encode unicode to utf-8 str automatically in GBK environment
wjm251 wjm...@gmail.com added the comment:

But why is it forced to be encoded as UTF-8? I think it should be encoded with the locale's encoding, not always UTF-8. For example, in a GBK locale, it should use GBK to encode the unicode object, right?
[issue10187] exec encode unicode to utf-8 str automatically in GBK environment
Martin v. Löwis mar...@v.loewis.de added the comment:

> but why it is forced to encoded to utf-8, I think it should be encoded by the locale related encodings,not always utf-8, for example,in GBK locale,it should use GBK to encode the unicode object,right?

Wrong. Exec'ing Unicode strings has been specified to encode all strings as UTF-8. This cannot be changed anymore.

Even if this was possible to change, it should *not* use the locale encoding. The source encoding and the locale encoding are independent; the source encoding is normally determined from PEP 263 declarations. So if anything, exec'ing Unicode strings should use an encoding declaration that you have in that string. However, you don't have one, and they are unsupported for Unicode strings, anyway.

--
title: exec encode unicode to utf-8 str automatically in GBK environment -> exec encode unicode to utf-8 str automatically in GBK environment
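The independence of the source encoding and the locale encoding can be illustrated with a minimal sketch. It is written for Python 3 rather than the Python 2.6 discussed in this thread, and the two sample sources and the helper name declared_source_encoding are made up for illustration. It reads a PEP 263 declaration with the standard library's tokenize.detect_encoding and reads the locale encoding with locale.getpreferredencoding; the two values come from unrelated places.

```python
import io
import locale
import tokenize


def declared_source_encoding(source_bytes):
    """Return the encoding that a PEP 263 declaration (or BOM) gives for source_bytes.

    In Python 3, tokenize.detect_encoding falls back to 'utf-8' when the
    source carries no declaration.
    """
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# Hypothetical sources: one with a declaration like the one in test.py, one without.
with_declaration = b"#coding=GBK\nprint('hi')\n"
without_declaration = b"print('hi')\n"

print(declared_source_encoding(with_declaration))
print(declared_source_encoding(without_declaration))

# The locale encoding comes from the environment, not from the source text.
print(locale.getpreferredencoding(False))
```

Changing the locale does not change the first two results, and editing the declaration does not change the third.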
[issue10187] exec encode unicode to utf-8 str automatically in GBK environment
wjm251 wjm...@gmail.com added the comment:

Oh, you mentioned PEP 263, but I already set a header like this (you can see the attached test.py):

#coding=GBK

Why does exec choose UTF-8 and not GBK? GBK is a valid Chinese character set in Python 2.6.
[issue10187] exec encode unicode to utf-8 str automatically in GBK environment
Martin v. Löwis mar...@v.loewis.de added the comment:

> oh,you mentioned the PEP 263 but I already set a header like this,you can see the attached test.py
> #coding=GBK

You have that in test.py, but not in the string you are giving to exec. This is really a separate source code. So you could have written

    exec '''#coding:GBK
    print hi('%s')
    ''' % a

You didn't, hence the code you pass to exec has no declared source encoding.

> why exec choose to use utf-8 not GBK?

exec always chooses UTF-8 when exec'ing Unicode strings. The source encoding of the file that has the exec statement must be irrelevant: the string being exec'ed may have been received from a different source file.
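The point that the exec'ed string is separate source code with its own declaration can be sketched as follows. This is Python 3, not the Python 2.6 of this thread, so it is only an analogue: in Python 3 a coding declaration is honoured when the source is passed to exec as bytes, and the str/unicode literal split discussed above no longer exists. All variable names are made up for illustration.

```python
# Python 3 analogue (assumption: the thread itself concerns Python 2.6).
# U+5927 is the character from the report, held here as GBK bytes,
# as it would be stored in a GBK-encoded file.
gbk_literal = "\u5927".encode("gbk")

# Source code passed as bytes, carrying its own PEP 263 declaration.
declared = b"#coding:GBK\na = '" + gbk_literal + b"'\n"
namespace = {}
exec(declared, namespace)
print(namespace["a"] == "\u5927")

# The same bytes without a declaration are read as UTF-8 (the Python 3 default)
# and are rejected, whatever encoding the surrounding file declares.
undeclared = b"a = '" + gbk_literal + b"'\n"
try:
    exec(undeclared, {})
    failed = False
except (SyntaxError, UnicodeDecodeError):
    failed = True
print(failed)
```

In both cases the encoding declared at the top of the file that calls exec plays no part; only what is inside the exec'ed source counts.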
[issue10187] exec encode unicode to utf-8 str automatically in GBK environment
wjm251 wjm...@gmail.com added the comment: Got that, thank you.
[issue10187] exec encode unicode to utf-8 str automatically in GBK environment
New submission from wjm251 wjm...@gmail.com:

Windows XP, Chinese version. See the attached file: the header was set to GBK and the file is GBK encoded, but why was the output '\xe5\xa4\xa7' (the UTF-8 encoding of the Chinese character 大)?

--
components: Library (Lib)
files: test.py
messages: 119482
nosy: wjm251
priority: normal
severity: normal
status: open
title: exec encode unicode to utf-8 str automatically in GBK environment
type: behavior
versions: Python 2.6
Added file: http://bugs.python.org/file19349/test.py
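The byte values in this report can be checked with a short sketch. It is written for Python 3 (an assumption; the report concerns Python 2.6) and only compares the two encodings of the character. It does not reproduce the exec behaviour itself.

```python
# U+5927 is the character from the report.
char = "\u5927"

utf8_bytes = char.encode("utf-8")
gbk_bytes = char.encode("gbk")

print(utf8_bytes)  # b'\xe5\xa4\xa7', the bytes seen in the report
print(gbk_bytes)   # b'\xb4\xf3', what a GBK-encoded str literal would hold

# Decoding the UTF-8 bytes with the GBK codec does not give the character back.
roundtrip_ok = utf8_bytes.decode("gbk", errors="replace") == char
print(roundtrip_ok)
```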
[issue10187] exec encode unicode to utf-8 str automatically in GBK environment
wjm251 wjm...@gmail.com added the comment:

On Windows (English version) and Ubuntu 10.04 (where the locale is UTF-8), the behavior is the same. Am I wrong?