On 3/19/2016 8:19 AM, Serhiy Storchaka wrote:
On 16.03.16 08:03, Serhiy Storchaka wrote:
On 15.03.16 22:30, Guido van Rossum wrote:
I came across a file that had two different coding cookies -- one on
the first line and one on the second. CPython uses the first, but mypy
happens to use the second. I couldn't find anything in the spec or
docs ruling out the second interpretation. Does anyone have a
suggestion (apart from following CPython)?
There is a similar question. If a file has two different coding cookies on
the same line, which should win? Currently the last cookie wins in the
CPython parser, in the tokenize module, in IDLE, and in a number of other
places. I think this is a bug.
I just tested with Emacs, and it looks like when you specify different
codings on two different lines, the first coding wins, but when you
specify different codings on the same line, the last coding wins.
Therefore current CPython behavior can be correct, and the regular
expression in PEP 263 should be changed to use greedy repetition.
Just because emacs works that way (and even though I'm an emacs user),
that doesn't mean CPython should act like emacs.
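For reference, the two-line case from the original question can be checked against the standard library's tokenize.detect_encoding(). This is only a minimal sketch: the sample source bytes are made up for illustration, and it shows what the tokenize module reports, not what the C parser does internally.

```python
import io
import tokenize

# Hypothetical sample: two conflicting cookies on two separate lines.
source = b"# coding: utf-8\n# coding: latin-1\nx = 1\n"

# detect_encoding() takes a readline callable and returns the encoding
# name together with the raw lines it had to read to decide.
encoding, lines_read = tokenize.detect_encoding(io.BytesIO(source).readline)

print(encoding)         # encoding chosen by the tokenize module
print(len(lines_read))  # how many lines were consumed before deciding
```

Once a cookie is found on the first line, the function returns without looking at the second line, so the first cookie is the one reported.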
(1) CPython should not necessarily act like emacs unless its coding
syntax exactly matches emacs's. Instead, CPython interprets a generic
coding pattern that matches emacs syntax, vim syntax, and other similar
things that both emacs and vim would ignore.
(1a) Maybe if a similar test were run on vim with its syntax, and it
also works the same way, then one might think it is a trend worth
following, but it is not clear to this non-vim user that vim syntax
allows more than one coding specification per line.
(2) emacs has no requirement that the coding be placed on the first two
lines. It specifically looks at the second line only if the first line
has a “ #! ” or a “ '\" ” (for troff). (according to docs, not
(3) emacs also allows for Local Variables to be specified at the end of
the file. If CPython were really to act like emacs, then it would need
to allow for that too.
(4) there is no benefit to specifying the coding twice on a line; it
only adds confusion, whether in CPython, emacs, or vim.
(4a) Here's an untested line that emacs would interpret as utf-8, and
CPython with the greedy regular expression would interpret as latin-1,
because emacs looks only between the -*- pair, and CPython ignores that.
# -*- coding: utf-8 -*- this file does not use coding: latin-1
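The difference on that line can be sketched with Python's re module. The two patterns below are modeled on the PEP 263 style of expression and differ only in ".*?" versus ".*"; they are assumptions for illustration, not the exact expressions used by any CPython release. The emacs_like() helper is likewise a rough, hypothetical approximation of "look only between the -*- pair", not a description of how emacs is implemented.

```python
import re

line = "# -*- coding: utf-8 -*- this file does not use coding: latin-1"

# Assumed patterns, identical except for the repetition before "coding".
NON_GREEDY = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")
GREEDY = re.compile(r"^[ \t\f]*#.*coding[:=][ \t]*([-\w.]+)")

# Hypothetical approximation: text between the first "-*-" pair only.
EMACS_BLOCK = re.compile(r"-\*-(.*?)-\*-")
COOKIE = re.compile(r"coding[:=][ \t]*([-\w.]+)")


def first_cookie(text):
    """Return the coding name found by the non-greedy pattern, or None."""
    m = NON_GREEDY.match(text)
    return m.group(1) if m else None


def last_cookie(text):
    """Return the coding name found by the greedy pattern, or None."""
    m = GREEDY.match(text)
    return m.group(1) if m else None


def emacs_like(text):
    """Search for a cookie only inside the first -*- ... -*- block."""
    block = EMACS_BLOCK.search(text)
    if block is None:
        return None
    m = COOKIE.search(block.group(1))
    return m.group(1) if m else None


print(first_cookie(line))  # utf-8
print(last_cookie(line))   # latin-1
print(emacs_like(line))    # utf-8
```

With the non-greedy repetition the first cookie on the line is picked, which agrees with the -*- delimited reading; with the greedy repetition the trailing prose is picked up instead.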
Python-Dev mailing list