Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-11 Thread Nick Coghlan
On 12 May 2015 at 06:38, Glenn Linderman wrote: > On 5/11/2015 1:09 AM, Nick Coghlan wrote: > On 10 May 2015 at 23:28, Adam Bartoš wrote: > I'd love to see it included in 3.5, but I doubt that will happen. For one > thing, it's only two weeks till beta 1, which is feature freeze. And mainly, > my

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-11 Thread Glenn Linderman
On 5/11/2015 1:09 AM, Nick Coghlan wrote: On 10 May 2015 at 23:28, Adam Bartoš wrote: Glenn Linderman wrote: Is this going to get released in 3.5, I hope? Python 3 is pretty limited without some solution for Unicode on the console... probably the biggest deficiency I have found in Python 3, s

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-11 Thread Nick Coghlan
On 10 May 2015 at 23:28, Adam Bartoš wrote: > Glenn Linderman wrote: >> Is this going to get released in 3.5, I hope? Python 3 is pretty >> limited without some solution for Unicode on the console... probably the >> biggest deficiency I have found in Python 3, since its introduction. It >> has gr

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-10 Thread Adam Bartoš
Glenn Linderman wrote: > Is this going to get released in 3.5, I hope? Python 3 is pretty > limited without some solution for Unicode on the console... probably the > biggest deficiency I have found in Python 3, since its introduction. It > has great Unicode support for files and processing, which

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-09 Thread Glenn Linderman
On 5/9/2015 5:39 AM, Adam Bartoš wrote: I already have a solution in Python 3 (see https://github.com/Drekin/win-unicode-console, https://pypi.python.org/pypi/win_unicode_console), I was just considering adding support for Python 2 as well. I think I have a working example in Python 2 using c

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-09 Thread Adam Bartoš
I already have a solution in Python 3 (see https://github.com/Drekin/win-unicode-console, https://pypi.python.org/pypi/win_unicode_console), I was just considering adding support for Python 2 as well. I think I have a working example in Python 2 using ctypes. On Thu, May 7, 2015 at 9:23 PM, "Mart

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-07 Thread Martin v. Löwis
On 02.05.15 at 21:57, Adam Bartoš wrote: > Even if sys.stdin contained a file-like object with proper encoding > attribute, it wouldn't work since sys.stdin has to be an instance of <type 'file'>. So the question is, whether it is possible to make a file instance > in Python that is also customizable so i

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-02 Thread Adam Bartoš
I think I have found out where the problem is. In fact, the encoding of the interactive input is determined by sys.stdin.encoding, but only in the case that it is a file object (see https://hg.python.org/cpython/file/d356e68de236/Parser/tokenizer.c#l890 and the implementation of tok_stdin_decode).

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-02 Thread Stephen J. Turnbull
Adam Bartoš writes: > I'll describe my picture of the situation, which might be terribly wrong. > On Linux, in a typical situation, we have a UTF-8 terminal, > PYTHONIOENCODING=utf-8, GNU readline is used. When the REPL wants input > from a user the tokenizer calls PyOS_Readline, which calls

Re: [Python-Dev] Unicode literals in Python 2.7

2015-05-01 Thread Adam Bartoš
On Fri, May 1, 2015 at 6:14 AM, Stephen J. Turnbull wrote: > Adam Bartoš writes: > > > Unfortunately, it doesn't work. With PYTHONIOENCODING=utf-8, the > > sys.std* streams are created with utf-8 encoding (which doesn't > > help on Windows since they still don't use ReadConsoleW and > > Write

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-30 Thread Stephen J. Turnbull
Adam Bartoš writes: > Unfortunately, it doesn't work. With PYTHONIOENCODING=utf-8, the > sys.std* streams are created with utf-8 encoding (which doesn't > help on Windows since they still don't use ReadConsoleW and > WriteConsoleW to communicate with the terminal) and after changing > the sys

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-30 Thread Adam Bartoš
> does this not work for you? > > from __future__ import unicode_literals No, with unicode_literals I just don't have to use the u'' prefix, but the wrong interpretation persists. On Thu, Apr 30, 2015 at 3:03 AM, Stephen J. Turnbull wrote: > > IIRC, on the Linux console and in an uxterm, PYTHO

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-30 Thread Stephen J. Turnbull
Chris Angelico writes: > It's legal Unicode, but it doesn't mean what he typed in. Of course, that's obvious. My point is "Welcome to the wild wacky world of soi-disant 'internationalized' software, where what you see is what you get regardless of what you type."

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-30 Thread Alexander Walters
Does this not work for you? from __future__ import unicode_literals On 4/28/2015 16:20, Adam Bartoš wrote: Hello, is it possible to somehow tell Python 2.7 to compile code entered in the interactive session with the flag PyCF_SOURCE_IS_UTF8 set? I'm considering adding support for Python 2

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Chris Angelico
On Thu, Apr 30, 2015 at 11:03 AM, Stephen J. Turnbull wrote: > Note that even if you have a UTF-8 input source, some users are likely > to be surprised because IIRC Python doesn't canonicalize in its > codecs; that is left for higher-level libraries. Linux UTF-8 is > usually NFC normalized, while
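The normalization point quoted above can be checked directly. The following is a minimal sketch in Python 3 (not the Python 2.7 under discussion); the sample character and variable names are chosen purely for illustration. It shows that a UTF-8 round trip preserves code points as given and that normalization is a separate, explicit step.

```python
import unicodedata

# The same visible character "é" in two canonical forms.
composed = unicodedata.normalize("NFC", "e\u0301")    # single code point U+00E9
decomposed = unicodedata.normalize("NFD", "\u00e9")   # U+0065 followed by U+0301

# The UTF-8 codec does not normalize: the round trip keeps the code points.
round_tripped = decomposed.encode("utf-8").decode("utf-8")

print(composed == decomposed)            # the two forms compare unequal
print(len(composed), len(decomposed))    # one code point versus two
print(round_tripped == decomposed)       # codec left the decomposed form alone
print(unicodedata.normalize("NFC", round_tripped) == composed)
```

Comparing text that may come from differently normalized sources therefore needs an explicit unicodedata.normalize call on both sides.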

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Stephen J. Turnbull
Adam Bartoš writes: > I am in Windows and my terminal isn't utf-8 at the beginning, but I > install custom sys.std* objects at runtime and I also install > custom readline hook, IIRC, on the Linux console and in an uxterm, PYTHONIOENCODING=utf-8 in the environment does what you want. (Can't t
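To see what PYTHONIOENCODING does to the standard streams, one can start a child interpreter with the variable set and ask it what it thinks its stdout encoding is. This is only a sketch, written for Python 3 rather than 2.7; the helper name child_stdout_encoding is hypothetical, and the sketch says nothing about whether the console itself can display the characters.

```python
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set; return its stdout encoding."""
    env = dict(os.environ)
    env["PYTHONIOENCODING"] = io_encoding
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    return out.decode("ascii").strip()


reported = child_stdout_encoding("utf-8")
print(reported)
```

The variable only changes how the interpreter encodes and decodes the byte streams; it does not change which API is used to talk to the terminal.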

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Adam Bartoš
I am in Windows and my terminal isn't utf-8 at the beginning, but I install custom sys.std* objects at runtime and I also install custom readline hook, so the interactive loop gets the input from my stream objects via PyOS_Readline. So when I enter u'α', the tokenizer gets b"u'\xce\xb1'", which is

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Oleg Broytman
On Wed, Apr 29, 2015 at 09:40:43AM -0700, Guido van Rossum wrote: > I suspect the interactive session is *not* always in UTF8. It probably > depends on the keyboard mapping of your terminal emulator. I imagine in > Windows it's the current code page. Even worse: in w32 it can be an OEM codepa
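The code page concern can be illustrated with a single byte. In this small sketch, cp1252 and cp437 are picked as examples of an ANSI and an OEM code page respectively; which code pages a given Windows console actually uses is an assumption that depends on the system's configuration.

```python
raw = b"\xe1"

as_ansi = raw.decode("cp1252")   # 'á' (U+00E1) under Windows-1252
as_oem = raw.decode("cp437")     # 'ß' (U+00DF) under the original IBM PC code page

print(repr(as_ansi), repr(as_oem))
print(as_ansi == as_oem)
```

The same byte yields different characters, so input bytes cannot be interpreted correctly without knowing which code page produced them.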

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Guido van Rossum
I suspect the interactive session is *not* always in UTF8. It probably depends on the keyboard mapping of your terminal emulator. I imagine in Windows it's the current code page. On Wed, Apr 29, 2015 at 9:19 AM, Adam Bartoš wrote: > Yes, that works for eval. But I want it for code entered during

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Adam Bartoš
Yes, that works for eval. But I want it for code entered during an interactive session. >>> u'α' u'\xce\xb1' The tokenizer gets b"u'\xce\xb1'" by calling PyOS_Readline and it knows it's utf-8 encoded. But the result of evaluation is u'\xce\xb1'. Because of how eval works, I believe that it would

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Victor Stinner
On 29 Apr 2015 10:36, "Adam Bartoš" wrote: > Why I'm talking about PyCF_SOURCE_IS_UTF8? eval(u"u'\u03b1'") -> u'\u03b1' but eval(u"u'\u03b1'".encode('utf-8')) -> u'\xce\xb1'. There is a simple option to get this flag: call eval() with unicode, not with encoded bytes. Victor
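The difference between the two eval() results can be mimicked outside Python 2.7. The sketch below is Python 3 and only an analogy: it does not claim to reproduce the tokenizer's internal mechanism, it just shows that u'\xce\xb1' is what one gets when each UTF-8 byte of α is treated as its own code point (equivalent to a Latin-1 decode), while a UTF-8 decode gives back the single intended character.

```python
alpha = "\u03b1"                          # Greek small letter alpha

utf8_bytes = alpha.encode("utf-8")        # b'\xce\xb1', what the tokenizer receives
misread = utf8_bytes.decode("latin-1")    # one code point per byte: '\xce\xb1'
correct = utf8_bytes.decode("utf-8")      # bytes interpreted as UTF-8: '\u03b1'

print(utf8_bytes)
print(ascii(misread), len(misread))
print(ascii(correct), len(correct))
```

Passing already decoded text to eval(), as suggested above, sidesteps the question of how the bytes should be interpreted.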

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Adam Bartoš
This situation is a bit different from coding cookies. They are used when we have bytes from a source file, but we don't know its encoding. During interactive session the tokenizer always knows the encoding of the bytes. I would think that in the case of interactive session the PyCF_SOURCE_IS_UTF8

Re: [Python-Dev] Unicode literals in Python 2.7

2015-04-29 Thread Nick Coghlan
On 29 April 2015 at 06:20, Adam Bartoš wrote: > Hello, > > is it possible to somehow tell Python 2.7 to compile code entered in the > interactive session with the flag PyCF_SOURCE_IS_UTF8 set? I'm considering > adding support for Python 2 in my package > (https://github.com/Drekin/win-unicode-co

[Python-Dev] Unicode literals in Python 2.7

2015-04-28 Thread Adam Bartoš
Hello, is it possible to somehow tell Python 2.7 to compile code entered in the interactive session with the flag PyCF_SOURCE_IS_UTF8 set? I'm considering adding support for Python 2 in my package (https://github.com/Drekin/win-unicode-console) and I have run into the fact that when u"α" is ent