I'm approaching this from the premise that we would like to avoid
needless surprises for users not versed in text encoding. I did a simple
experiment with notepad on Windows 7 as if a naïve user. If I write the
one-line program:
print("Hello world.") # by Jeff
It runs, no surprise.
We may
In a message of Sun, 15 Nov 2015 12:56:18 +, Paul Moore writes:
>On 15 November 2015 at 07:23, Stephen J. Turnbull wrote:
>> I don't see any good reason for allowing non-ASCII-compatible
>> encodings in the reference CPython interpreter.
>
>From PEP 263:
>
> Any
"Stephen J. Turnbull" writes:
> I don't see any good reason for allowing non-ASCII-compatible
> encodings in the reference CPython interpreter.
There might be a case for having the tokenizer not care about encodings
at all and just operate on a stream of unicode characters
On 15 November 2015 at 07:23, Stephen J. Turnbull wrote:
> I don't see any good reason for allowing non-ASCII-compatible
> encodings in the reference CPython interpreter.
From PEP 263:
Any encoding which allows processing the first two lines in the
way
On 15 November 2015 at 16:40, Stephen J. Turnbull wrote:
> What PEP 263 did do was to specify that non-ASCII-compatible encodings
> are not supported by the PEP 263 mechanism for declaring the encoding
> of a Python source program. That's because it looks for a "magic
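
To illustrate the ASCII-compatibility point: a PEP 263 style declaration can only be found by scanning raw bytes if the file's encoding stores ASCII characters as single ASCII bytes. The sketch below is not CPython's tokenizer. `has_ascii_cookie` is a hypothetical helper, and its regular expression is a simplification of the one given in PEP 263.

```python
import re

# Simplified byte-level pattern for a PEP 263 declaration.
# Assumption: the real pattern in PEP 263 is stricter about what may
# precede the declaration on the line.
COOKIE_RE = re.compile(rb"coding[:=]\s*([-\w.]+)")


def has_ascii_cookie(source: str, encoding: str) -> bool:
    """Encode `source`, then report whether the declaration is visible
    as plain ASCII bytes somewhere in the first two lines."""
    first_two_lines = source.encode(encoding).split(b"\n")[:2]
    return any(COOKIE_RE.search(line) for line in first_two_lines)


source = '# -*- coding: latin-1 -*-\nprint("Hello world.")\n'
for enc in ("ascii", "latin-1", "utf-8", "utf-16", "utf-32"):
    print(enc, has_ascii_cookie(source, enc))
```

The three ASCII-compatible encodings report True. UTF-16 and UTF-32 report False, because every ASCII character is padded with zero bytes and the byte sequence `coding` never appears contiguously.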
>
> On Nov 15, 2015, at 9:34 AM, Guido van Rossum wrote:
>
> Let me just unilaterally end this discussion. It's fine to disregard
> the future possibility of using UTF-16 or -32 for Python source code.
> Serhiy can happily rip out any comments or dead code dealing with that
>
Random832 writes:
> "Stephen J. Turnbull" writes:
> > I don't see any good reason for allowing non-ASCII-compatible
> > encodings in the reference CPython interpreter.
>
> There might be a case for having the tokenizer not care about encodings
> at all and just operate
On 14.11.2015 23:56, Victor Stinner wrote:
> These encodings are rarely used. I don't think that any text editor use
> them. Editors use ascii, latin1, utf8 and... all locale encoding. But I
> don't know any OS using UTF-16 as a locale encoding. UTF-32 wastes disk
> space.
UTF-16 is used a lot
Let me just unilaterally end this discussion. It's fine to disregard
the future possibility of using UTF-16 or -32 for Python source code.
Serhiy can happily rip out any comments or dead code dealing with that
possibility.
--
--Guido van Rossum (python.org/~guido)
Laura Creighton writes:
> Steve Turnbull, who lives in Japan, and speaks and writes Japanese
> is saying that "he cannot see any reason for allowing non-ASCII
> compatible encodings in CPython".
>
> This makes me wonder.
>
> Is this along the lines of 'even in Japan we do not want such
On Sun, Nov 15, 2015 at 12:47 PM, Glenn Linderman wrote:
> On 11/14/2015 5:37 PM, Chris Angelico wrote:
>
> On Sun, Nov 15, 2015 at 12:27 PM, Glenn Linderman
> wrote:
>
> Notepad defaults to ANSI encoding, as I think it always has. UTF-8 is an
>
On Sat, Nov 14, 2015 at 7:06 PM, Steve Dower wrote:
> The native encoding on Windows has been UTF-16 since Windows NT. Obviously
> we've survived without Python tokenization support for a long time, but
> every API uses it.
Windows 2000 was the first version to have broad
On Sat, Nov 14, 2015 at 7:15 PM, Chris Angelico wrote:
> Can the py.exe launcher handle a UTF-16 shebang? (I'm pretty sure Unix
> program loaders won't.) That alone might be a reason for strongly
> encouraging ASCII-compat encodings.
The launcher supports shebangs encoded as
Steve Dower writes:
> Saying [UTF-16] is rarely used is rather exposing your own
> unawareness though - it could arguably be the most commonly used
> encoding (depending on how you define "used").
Because we're discussing the storage of .py files, the relevant
definition is the one used by
Glenn Linderman writes:
> On 11/14/2015 5:37 PM, Chris Angelico wrote:
> > Thanks. Is "ANSI" always an eight-bit ASCII-compatible encoding?
>
> I wouldn't trust an answer to this question that didn't come from
> someone that used Windows with Chinese, Japanese, or Korean,
On Sat, Nov 14, 2015 at 09:19:37PM +0200, Serhiy Storchaka wrote:
> If the support of UTF-16 and UTF-32 is planned, I'll take this to
> attention during refactoring. But in many places besides the tokenizer
> the ASCII compatible encoding of source files is expected.
Perhaps another way of
For now UTF-16 and UTF-32 source encodings are not supported. There is a
comment in Parser/tokenizer.c:
/* Disable support for UTF-16 BOMs until a decision
is made whether this needs to be supported. */
Can we make a decision whether this support will be added in foreseeable
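
For readers unfamiliar with BOMs, here is a minimal sketch of the kind of byte-order-mark check the quoted comment refers to. It is written in Python rather than C and is not the code in Parser/tokenizer.c. `sniff_bom` is a hypothetical name; the BOM constants come from the standard `codecs` module.

```python
import codecs

# Check longer BOMs first: the UTF-32-LE BOM (FF FE 00 00) starts with
# the UTF-16-LE BOM (FF FE), so the order matters.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


print(sniff_bom(codecs.BOM_UTF8 + b'print("Hello world.")'))
print(sniff_bom(codecs.BOM_UTF16_LE + 'print("Hello world.")'.encode("utf-16-le")))
print(sniff_bom(b'print("Hello world.")'))
```

A check like this only identifies the encoding. Whether the rest of the tokenizer can then cope with non-ASCII-compatible bytes is the separate question raised in this thread.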
These encodings are rarely used. I don't think that any text editor use
them. Editors use ascii, latin1, utf8 and... all locale encoding. But I
don't know any OS using UTF-16 as a locale encoding. UTF-32 wastes disk
space.
Ok, even if it exists, Python already accepts a very wide range of
I agree that supporting UTF-16 doesn't seem terribly useful. Also, thank
you for giving the tokenizer some love!
On Sat, Nov 14, 2015, at 11:19, Serhiy Storchaka wrote:
> For now UTF-16 and UTF-32 source encodings are not supported. There is a
> comment in Parser/tokenizer.c:
>
> /*
On 15.11.15 00:56, Victor Stinner wrote:
> These encodings are rarely used. I don't think that any text editor use
> them. Editors use ascii, latin1, utf8 and... all locale encoding. But I
> don't know any OS using UTF-16 as a locale encoding. UTF-32 wastes disk
> space.
AFAIK the standard Windows
On 11/14/2015 3:21 PM, Serhiy Storchaka wrote:
> On 15.11.15 00:56, Victor Stinner wrote:
> > These encodings are rarely used. I don't think that any text editor use
> > them. Editors use ascii, latin1, utf8 and... all locale encoding. But I
> > don't know any OS using UTF-16 as a locale encoding. UTF-32
Victor Stinner writes:
> These encodings are rarely used. I don't think that any text editor
> use them.
MS Windows' Notepad can be made to use UTF-16.
Sent: 11/14/2015 14:58
To: "Serhiy Storchaka" <storch...@gmail.com>
Cc: "python-dev@python.org" <python-dev@python.org>
Subject: Re: [Python-Dev] Support of UTF-16 and UTF-32 source encodings
These encodings are rarely used. I don't think that
On Sun, Nov 15, 2015 at 12:06 PM, Steve Dower wrote:
> The native encoding on Windows has been UTF-16 since Windows NT. Obviously
> we've survived without Python tokenization support for a long time, but
> every API uses it.
>
> I've hit a few cases where it would have
On 11/14/2015 5:15 PM, Chris Angelico wrote:
> Can the py.exe launcher handle a UTF-16 shebang? (I'm pretty sure Unix
> program loaders won't.) That alone might be a reason for strongly
> encouraging ASCII-compat encodings.
That raises an interesting question about if py.exe can handle a leading
Chris Angelico writes:
> Can the py.exe launcher handle a UTF-16 shebang? (I'm pretty sure Unix
> program loaders won't.)
A lot of them can't handle UTF-8 with a BOM, either.
> That alone might be a reason for strongly encouraging ASCII-compat
> encodings.
A "python" or
On 11/14/2015 5:15 PM, Chris Angelico wrote:
> I think even Notepad defaults to UTF-8 for
> files, now.
Just installed Windows 10 on a new machine, and upgraded to the latest
Windows 10 release, 1511.
Notepad defaults to ANSI encoding, as I think it always has. UTF-8 is
an option, and it does
On Sun, Nov 15, 2015 at 12:27 PM, Glenn Linderman wrote:
> Notepad defaults to ANSI encoding, as I think it always has. UTF-8 is an
> option, and it does seem to try to notice the original encoding of the file,
> when editing old files, but when creating a new one
On 11/14/2015 5:37 PM, Chris Angelico wrote:
> On Sun, Nov 15, 2015 at 12:27 PM, Glenn Linderman wrote:
> > Notepad defaults to ANSI encoding, as I think it always has. UTF-8 is an
> > option, and it does seem to try to notice the original encoding of the file,
> > when editing old