Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-11 Thread Nick Coghlan
On 10 May 2015 at 23:28, Adam Bartoš dre...@gmail.com wrote: Glenn Linderman wrote: Is this going to get released in 3.5, I hope? Python 3 is pretty limited without some solution for Unicode on the console... probably the biggest deficiency I have found in Python 3, since its introduction. It

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-11 Thread Glenn Linderman
On 5/11/2015 1:09 AM, Nick Coghlan wrote: On 10 May 2015 at 23:28, Adam Bartoš dre...@gmail.com wrote: Glenn Linderman wrote: Is this going to get released in 3.5, I hope? Python 3 is pretty limited without some solution for Unicode on the console... probably the biggest deficiency I have

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-11 Thread Nick Coghlan
On 12 May 2015 at 06:38, Glenn Linderman v+pyt...@g.nevcal.com wrote: On 5/11/2015 1:09 AM, Nick Coghlan wrote: On 10 May 2015 at 23:28, Adam Bartoš dre...@gmail.com wrote: I'd love to see it included in 3.5, but I doubt that will happen. For one thing, it's only two weeks till beta 1, which

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-10 Thread Adam Bartoš
Glenn Linderman wrote: Is this going to get released in 3.5, I hope? Python 3 is pretty limited without some solution for Unicode on the console... probably the biggest deficiency I have found in Python 3, since its introduction. It has great Unicode support for files and processing, which

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-09 Thread Adam Bartoš
I already have a solution in Python 3 (see https://github.com/Drekin/win-unicode-console, https://pypi.python.org/pypi/win_unicode_console); I was just considering adding support for Python 2 as well. I think I have a working example in Python 2 using ctypes. On Thu, May 7, 2015 at 9:23 PM,

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-09 Thread Glenn Linderman
On 5/9/2015 5:39 AM, Adam Bartoš wrote: I already have a solution in Python 3 (see https://github.com/Drekin/win-unicode-console, https://pypi.python.org/pypi/win_unicode_console); I was just considering adding support for Python 2 as well. I think I have a working example in Python 2 using

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-07 Thread Martin v. Löwis
On 02.05.15 at 21:57, Adam Bartoš wrote: Even if sys.stdin contained a file-like object with a proper encoding attribute, it wouldn't work since sys.stdin has to be an instance of type 'file'. So the question is whether it is possible to make a file instance in Python that is also customizable so

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-02 Thread Stephen J. Turnbull
Adam Bartoš writes: I'll describe my picture of the situation, which might be terribly wrong. On Linux, in a typical situation, we have a UTF-8 terminal, PYTHONIOENCODING=utf-8, and GNU readline is used. When the REPL wants input from a user the tokenizer calls PyOS_Readline, which calls

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-02 Thread Adam Bartoš
I think I have found out where the problem is. In fact, the encoding of the interactive input is determined by sys.stdin.encoding, but only in the case that it is a file object (see https://hg.python.org/cpython/file/d356e68de236/Parser/tokenizer.c#l890 and the implementation of tok_stdin_decode).

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-01 Thread Adam Bartoš
On Fri, May 1, 2015 at 6:14 AM, Stephen J. Turnbull step...@xemacs.org wrote: Adam Bartoš writes: Unfortunately, it doesn't work. With PYTHONIOENCODING=utf-8, the sys.std* streams are created with utf-8 encoding (which doesn't help on Windows since they still don't use ReadConsoleW and

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-30 Thread Stephen J. Turnbull
Chris Angelico writes: It's legal Unicode, but it doesn't mean what he typed in. Of course, that's obvious. My point is "Welcome to the wild wacky world of soi-disant 'internationalized' software, where what you see is what you get regardless of what you type."

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-30 Thread Alexander Walters
Does this not work for you? from __future__ import unicode_literals On 4/28/2015 16:20, Adam Bartoš wrote: Hello, is it possible to somehow tell Python 2.7 to compile code entered in the interactive session with the flag PyCF_SOURCE_IS_UTF8 set? I'm considering adding support for Python

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-30 Thread Stephen J. Turnbull
Adam Bartoš writes: Unfortunately, it doesn't work. With PYTHONIOENCODING=utf-8, the sys.std* streams are created with utf-8 encoding (which doesn't help on Windows since they still don't use ReadConsoleW and WriteConsoleW to communicate with the terminal) and after changing the
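
A minimal sketch of the effect discussed above, assuming a Python 3 interpreter is available as sys.executable: it starts a child interpreter with PYTHONIOENCODING set and reports the encodings of the standard streams the child creates. The helper name child_stream_encodings is hypothetical. As the message notes, this only changes how the stream objects encode and decode; it does not make the Windows console use ReadConsoleW or WriteConsoleW.

```python
import os
import subprocess
import sys


def child_stream_encodings(io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set and
    return the encodings of its sys.stdin and sys.stdout.

    Hypothetical helper for illustration only.
    """
    # Copy the current environment and override just one variable.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    probe = "import sys; print(sys.stdin.encoding, sys.stdout.encoding)"
    out = subprocess.check_output(
        [sys.executable, "-c", probe],
        env=env,
        stdin=subprocess.DEVNULL,
    )
    return out.decode("ascii").split()


print(child_stream_encodings("utf-8"))
```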

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-30 Thread Adam Bartoš
does this not work for you? from __future__ import unicode_literals No, with unicode_literals I just don't have to use the u'' prefix, but the wrong interpretation persists. On Thu, Apr 30, 2015 at 3:03 AM, Stephen J. Turnbull step...@xemacs.org wrote: IIRC, on the Linux console and in an

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Nick Coghlan
On 29 April 2015 at 06:20, Adam Bartoš dre...@gmail.com wrote: Hello, is it possible to somehow tell Python 2.7 to compile code entered in the interactive session with the flag PyCF_SOURCE_IS_UTF8 set? I'm considering adding support for Python 2 in my package

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Adam Bartoš
This situation is a bit different from coding cookies. They are used when we have bytes from a source file, but we don't know its encoding. During an interactive session the tokenizer always knows the encoding of the bytes. I would think that in the case of an interactive session the PyCF_SOURCE_IS_UTF8

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Guido van Rossum
I suspect the interactive session is *not* always in UTF8. It probably depends on the keyboard mapping of your terminal emulator. I imagine in Windows it's the current code page. On Wed, Apr 29, 2015 at 9:19 AM, Adam Bartoš dre...@gmail.com wrote: Yes, that works for eval. But I want it for

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Adam Bartoš
Yes, that works for eval. But I want it for code entered during an interactive session. >>> u'α' u'\xce\xb1' The tokenizer gets b"u'\xce\xb1'" by calling PyOS_Readline and it knows it's utf-8 encoded. But the result of evaluation is u'\xce\xb1'. Because of how eval works, I believe that it would work
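
The symptom reported here can be modelled in a few lines. This is only a sketch of the observed result, not the tokenizer's actual code path: it assumes that, without PyCF_SOURCE_IS_UTF8, the bytes of the literal are interpreted one byte per code point (as Latin-1 would do), which matches the u'\xce\xb1' shown in the message. The variable names are illustrative.

```python
# What the user typed at the prompt: GREEK SMALL LETTER ALPHA.
typed = u"\u03b1"

# The bytes handed to the tokenizer by a UTF-8 readline hook.
raw = typed.encode("utf-8")
assert raw == b"\xce\xb1"

# Assumed model of the wrong interpretation: every byte becomes
# its own code point, giving two characters instead of one.
misread = raw.decode("latin-1")
assert misread == u"\xce\xb1"
assert len(misread) == 2

# The interpretation the poster wants: decode the bytes as UTF-8.
wanted = raw.decode("utf-8")
assert wanted == typed
```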

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Adam Bartoš
I am on Windows and my terminal isn't utf-8 at the beginning, but I install custom sys.std* objects at runtime and I also install a custom readline hook, so the interactive loop gets the input from my stream objects via PyOS_Readline. So when I enter u'α', the tokenizer gets b"u'\xce\xb1'", which is

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Oleg Broytman
On Wed, Apr 29, 2015 at 09:40:43AM -0700, Guido van Rossum gu...@python.org wrote: I suspect the interactive session is *not* always in UTF8. It probably depends on the keyboard mapping of your terminal emulator. I imagine in Windows it's the current code page. Even worse: in w32 it can be

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Stephen J. Turnbull
Adam Bartoš writes: I am on Windows and my terminal isn't utf-8 at the beginning, but I install custom sys.std* objects at runtime and I also install a custom readline hook, IIRC, on the Linux console and in a uxterm, PYTHONIOENCODING=utf-8 in the environment does what you want. (Can't

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Chris Angelico
On Thu, Apr 30, 2015 at 11:03 AM, Stephen J. Turnbull step...@xemacs.org wrote: Note that even if you have a UTF-8 input source, some users are likely to be surprised because IIRC Python doesn't canonicalize in its codecs; that is left for higher-level libraries. Linux UTF-8 is usually NFC
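
The canonicalization point can be illustrated with the standard unicodedata module. The sketch below uses "é" as an example character, which is not taken from the thread: the composed and decomposed spellings look the same but differ as strings and as UTF-8 bytes, and the utf-8 codec passes both through unchanged; normalization has to be requested explicitly.

```python
import unicodedata

composed = u"\u00e9"      # "é" as one code point (NFC form)
decomposed = u"e\u0301"   # "e" + COMBINING ACUTE ACCENT (NFD form)

# Same rendering, different strings and different UTF-8 bytes.
assert composed != decomposed
assert composed.encode("utf-8") == b"\xc3\xa9"
assert decomposed.encode("utf-8") == b"e\xcc\x81"

# A codec round trip does not normalize anything.
round_tripped = decomposed.encode("utf-8").decode("utf-8")
assert round_tripped == decomposed

# Normalization is an explicit, separate step.
assert unicodedata.normalize("NFC", decomposed) == composed
assert unicodedata.normalize("NFD", composed) == decomposed
```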

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Victor Stinner
On 29 Apr 2015 10:36, Adam Bartoš dre...@gmail.com wrote: Why am I talking about PyCF_SOURCE_IS_UTF8? eval(u"u'\u03b1'") -> u'\u03b1' but eval(u"u'\u03b1'".encode('utf-8')) -> u'\xce\xb1'. There is a simple option to get this flag: call eval() with unicode, not with encoded bytes. Victor

[Python-Dev] Unicode literals in Python 2.7

2015-04-28 Thread Adam Bartoš
Hello, is it possible to somehow tell Python 2.7 to compile code entered in the interactive session with the flag PyCF_SOURCE_IS_UTF8 set? I'm considering adding support for Python 2 in my package ( https://github.com/Drekin/win-unicode-console) and I have run into the fact that when u'α' is