On 3/19/2016 8:19 AM, Serhiy Storchaka wrote:
On 16.03.16 08:03, Serhiy Storchaka wrote:
On 15.03.16 22:30, Guido van Rossum wrote:
I came across a file that had two different coding cookies -- one on
the first line and one on the second. CPython uses the first, but mypy
happens to use the second. I couldn't find anything in the spec or
docs ruling out the second interpretation. Does anyone have a
suggestion (apart from following CPython)?

Reference: https://github.com/python/mypy/issues/1281
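
For anyone who wants to check which cookie Python's own tooling picks when the first two lines disagree, the standard library's tokenize.detect_encoding() can be asked directly. This is only a sketch: the two-line source below is made up for illustration, and tokenize is the pure-Python counterpart of the parser, not the parser itself.

```python
import io
import tokenize

# Hypothetical file contents: the cookies on lines 1 and 2 disagree.
source = b"# coding: latin-1\n# coding: utf-8\nx = 1\n"

# detect_encoding() takes a readline callable that returns bytes.
encoding, lines_read = tokenize.detect_encoding(io.BytesIO(source).readline)
print(encoding, len(lines_read))
```

detect_encoding() returns as soon as it finds a cookie on the first line, so the second line is not consulted here. It also reports a normalized codec name instead of the literal spelling in the cookie.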

There is a similar question. If a file has two different coding cookies on
the same line, which should win? Currently the last cookie wins in the
CPython parser, in the tokenize module, in IDLE, and in a number of other
places. I think this is a bug.

I just tested with Emacs, and it looks like when different codings are specified on two different lines, the first coding wins, but when different codings are specified on the same line, the last coding wins.

Therefore the current CPython behavior may be correct, and the regular expression in PEP 263 should be changed to use greedy repetition.
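
The difference between the two kinds of repetition can be sketched with a pair of patterns. Both are modeled on the general shape of the PEP 263 expression and are assumptions for illustration, not quotations from the PEP or from CPython. The sample line and the cookie() helper are likewise made up.

```python
import re

# Assumed patterns: identical except for ".*?" (non-greedy) versus ".*" (greedy).
NON_GREEDY = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")
GREEDY = re.compile(r"^[ \t\f]*#.*coding[:=][ \t]*([-_.a-zA-Z0-9]+)")

# Hypothetical comment line carrying two different cookies.
line = "# coding: utf-8 coding: latin-1"

def cookie(pattern, text):
    """Return the encoding name captured by pattern, or None if no match."""
    match = pattern.match(text)
    return match.group(1) if match else None

first = cookie(NON_GREEDY, line)  # stops at the earliest "coding"
last = cookie(GREEDY, line)       # backtracks from the end to the latest "coding"
print(first, last)
```

With the non-greedy form the first cookie on the line wins. With the greedy form the last one wins, which is the same-line behavior observed in Emacs.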

Just because emacs works that way (and even though I'm an emacs user), that doesn't mean CPython should act like emacs.

(1) CPython should not necessarily act like emacs unless its coding syntax exactly matches the emacs syntax. It does not: CPython interprets a generic coding pattern that matches emacs syntax, vim syntax, and other similar things that both emacs and vim would ignore. (1a) Maybe if a similar test were run on vim with its syntax, and it worked the same way, then one might think it is a trend worth following, but it is not clear to this non-vim user that vim syntax allows more than one coding specification per line.

(2) emacs has no requirement that the coding be placed on the first two lines. It looks at the second line only if the first line has a “ #! ” or a “ '\" ” (for troff). (This is according to the docs, not experimentation.)

(3) emacs also allows for Local Variables to be specified at the end of the file. If CPython were really to act like emacs, then it would need to allow for that too.

(4) There is no benefit to specifying the coding twice on a line. It only adds confusion, whether in CPython, emacs, or vim. (4a) Here's an untested line that emacs would interpret as utf-8, and that CPython with the greedy regular expression would interpret as latin-1, because emacs looks only between the -*- pair, and CPython ignores that:
  # -*- coding: utf-8 -*- this file does not use coding: latin-1
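
The mismatch on that line can be sketched in Python. Both the greedy pattern and the emacs_like_coding() helper are hypothetical stand-ins: the pattern is an assumed greedy variant of the cookie expression, and the helper only imitates the documented "look between the -*- pair" rule; it is not how emacs is implemented.

```python
import re

line = "# -*- coding: utf-8 -*- this file does not use coding: latin-1"

# Assumed greedy cookie pattern, for illustration only.
GREEDY = re.compile(r"^[ \t\f]*#.*coding[:=][ \t]*([-_.a-zA-Z0-9]+)")

def emacs_like_coding(text):
    """Hypothetical helper: search for coding only inside the first -*- pair."""
    block = re.search(r"-\*-(.*?)-\*-", text)
    if block is None:
        return None
    found = re.search(r"coding:[ \t]*([-_.a-zA-Z0-9]+)", block.group(1))
    return found.group(1) if found else None

greedy_result = GREEDY.match(line).group(1)  # ignores the -*- delimiters
emacs_result = emacs_like_coding(line)       # confined to the -*- block
print(greedy_result, emacs_result)
```

Under these assumptions the two readings disagree: the greedy pattern picks up the trailing latin-1, and the delimiter-aware helper sees only utf-8.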
Python-Dev mailing list
