Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Serhiy Storchaka

On 17.03.16 15:14, M.-A. Lemburg wrote:

On 17.03.2016 01:29, Guido van Rossum wrote:

Should we recommend that everyone use tokenize.detect_encoding()?


I'd prefer a separate utility for this somewhere, since
tokenize.detect_encoding() is not available in Python 2.

I've attached an example implementation with tests, which works
in Python 2.7 and 3.


Sorry, but this code doesn't match the behaviour of the Python interpreter, 
nor of other tools. I suggest backporting tokenize.detect_encoding() (but 
be aware that the default encoding in Python 2 is ASCII, not UTF-8).





[Python-Dev] PEP 484 update: add Type[T]

2016-03-19 Thread Guido van Rossum
There's a more fundamental PEP 484 update that I'd like to add. The
discussion is in https://github.com/python/typing/issues/107.

Currently we don't have a way to talk about arguments and variables
whose type is itself a type or class. The only annotation you can use
for this is 'type' which says "this argument/variable is a type
object" (or a class). But it's often useful to be able to say "this is
a class and it must be a subclass of X".

In fact this was proposed in the original rounds of discussion about
PEP 484, but at the time it felt too far removed from practice to know
quite how it should be used, so I just put it off. But it's been one
of the features that's been requested most by the early adopters of
PEP 484 at Dropbox. So I'd like to add it now.

At runtime this shouldn't do much; Type would be just a generic class
of one parameter that records its one type parameter. The real magic
would happen in the type checker, which will be able to use types
involving Type. It should also be possible to use this with type
variables, so we could write e.g.

T = TypeVar('T', bound=int)
def factory(c: Type[T]) -> T:


This would define factory() as a function whose argument must be a
subclass of int and whose return value is an instance of that subclass.
(The bound= option to TypeVar() is already described in PEP 484,
although mypy hasn't implemented it yet.)

(If I screwed up this example, hopefully Jukka will correct me. :-)
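
For illustration, here is a minimal sketch of how this could be used,
assuming the proposed Type[T] is available from typing (the MyInt class
is made up for the example):

from typing import Type, TypeVar

T = TypeVar('T', bound=int)

class MyInt(int):
    pass

def factory(c: Type[T]) -> T:
    # c is a class object; the checker knows it is a subclass of int
    return c()

x = factory(MyInt)  # the checker infers x: MyInt
y = factory(int)    # the checker infers y: int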

Again, I'd like this to go out with 3.5.2, because it requires adding
something to typing.py (and again, that's allowed because PEP 484 is
provisional -- see PEP 411 for an explanation).

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 484: updates to Python 2.7 signature syntax

2016-03-19 Thread Guido van Rossum
Heh. I could add an example with a long list of parameters with long
names, but apart from showing by example what the motivation is it
wouldn't really add anything, and it's more to type. :-)

On Sat, Mar 19, 2016 at 6:43 PM, Andrew Barnert  wrote:
> On Mar 19, 2016, at 18:18, Guido van Rossum  wrote:
>>
>> Second, https://github.com/python/typing/issues/186. This builds on
>> the previous syntax but deals with the other annoyance of long
>> argument lists, this time in case you *do* care about the types. The
>> proposal is to allow writing the arguments one per line with a type
>> comment on each line. This has been implemented in PyCharm but not yet
>> in mypy. Example:
>>
>>def gcd(
>>a,  # type: int
>>b,  # type: int
>>):
>># type: (...) -> int
>>
>
> This is a lot nicer than what you were originally discussing (at #1101? I 
> forget...). Even more so given how trivial it will be to mechanically convert 
> these to annotations if/when you switch an app to pure Python 3.
>
> But one thing: in the PEP and the docs, I think it would be better to pick an 
> example with longer parameter names. This example shows that even in the 
> worst case it isn't that bad, but a better example would show that in the 
> typical case it's actually pretty nice. (Also, I don't see why you wouldn't 
> just use the "old" comment form for this example, since it all fits on one 
> line and isn't at all confusing.)
>



-- 
--Guido van Rossum (python.org/~guido)


[Python-Dev] PEP 484 update: allow @overload in regular module files

2016-03-19 Thread Guido van Rossum
Here's another proposal for a change to PEP 484.

In https://github.com/python/typing/issues/72 there's a long
discussion ending with a reasonable argument to allow @overload in
(non-stub) modules after all.

This proposal does *not* sneak in a syntax for multi-dispatch -- the
@overload versions are only for the benefit of type checkers while a
single non-@overload implementation must follow that handles all
cases. In fact, I expect that if we ever end up adding multi-dispatch
to the language or library, it will neither replace nor compete with
@overload; the two will most likely be orthogonal to each other,
with @overload aiming at a type checker and some other multi-dispatch
aiming at the interpreter. (The needs of the two cases are just too
different -- e.g. it's hard to imagine multi-dispatch in Python using
type variables.) More details are in the issue (that's also where I'd
like to get feedback if possible).

I want to settle this before 3.5.2 goes out, because it requires a
change to typing.py in the stdlib. Fortunately the change will be
backward compatible (even though this isn't strictly required for a
provisional module). In the original typing module, any use of
@overload outside a stub is an error (it raises as soon as it's used).
In the new proposal, you can decorate a function with @overload, but
any attempt to call such a decorated function raises an error. This
should catch cases early where you forget to provide an
implementation.
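
For illustration, a minimal sketch of what this would look like in a
regular (non-stub) module under the proposal; the double() function and
its signatures are made up for the example:

from typing import overload

@overload
def double(x: int) -> int: ...
@overload
def double(x: str) -> str: ...

def double(x):
    # the single real implementation handles all the overloaded cases
    return x * 2

double(2)      # a type checker picks the int -> int overload
double("ab")   # a type checker picks the str -> str overload
# calling one of the @overload-decorated stubs directly would raise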

(Reference for provisional modules: https://www.python.org/dev/peps/pep-0411/)

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 484: updates to Python 2.7 signature syntax

2016-03-19 Thread Andrew Barnert via Python-Dev
On Mar 19, 2016, at 18:18, Guido van Rossum  wrote:
> 
> Second, https://github.com/python/typing/issues/186. This builds on
> the previous syntax but deals with the other annoyance of long
> argument lists, this time in case you *do* care about the types. The
> proposal is to allow writing the arguments one per line with a type
> comment on each line. This has been implemented in PyCharm but not yet
> in mypy. Example:
> 
>def gcd(
>a,  # type: int
>b,  # type: int
>):
># type: (...) -> int
>

This is a lot nicer than what you were originally discussing (at #1101? I 
forget...). Even more so given how trivial it will be to mechanically convert 
these to annotations if/when you switch an app to pure Python 3.

But one thing: in the PEP and the docs, I think it would be better to pick an 
example with longer parameter names. This example shows that even in the worst 
case it isn't that bad, but a better example would show that in the typical 
case it's actually pretty nice. (Also, I don't see why you wouldn't just use 
the "old" comment form for this example, since it all fits on one line and 
isn't at all confusing.)



Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Glenn Linderman

On 3/16/2016 12:59 AM, Serhiy Storchaka wrote:

On 16.03.16 09:46, Glenn Linderman wrote:

On 3/16/2016 12:09 AM, Serhiy Storchaka wrote:

On 16.03.16 08:34, Glenn Linderman wrote:

 From the PEP 263:


More precisely, the first or second line must match the regular
expression "coding[:=]\s*([-\w.]+)". The first group of this
expression is then interpreted as encoding name. If the encoding
is unknown to Python, an error is raised during compilation. There
must not be any Python statement on the line that contains the
encoding declaration.


Clearly the regular expression would only match the first of multiple
cookies on the same line, so the first one should always win... but
there should only be one, from the first PEP quote "a magic comment".


"The first group of this expression" means the first regular
expression group. Only the part between parenthesis "([-\w.]+)" is
interpreted as encoding name, not all expression.


Sure.  But there is no mention anywhere in the PEP of more than one
being legal: just more than one position for it, EITHER line 1 or line
2. So while the regular expression mentioned is not anchored, to allow
variation in syntax between emacs and vim, "must match the regular
expression" doesn't imply "several times", and when searching for a
regular expression that might not be anchored, one typically expects to
find the first.


Actually "must match the regular expression" is not correct, because 
re.match() implies anchoring at the start. I have proposed more 
correct regular expression in other branch of this thread.


"match" doesn't imply anchoring at the start.  "re.match()" does (and as 
a result is very confusing to newbies to Python re, that have used other 
regexp systems).
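
For illustration, a small sketch of the match()/search() distinction
being discussed, using the PEP 263 pattern:

import re

cookie = re.compile(r"coding[:=]\s*([-\w.]+)")
line = "# -*- coding: utf-8 -*-"

print(cookie.match(line))            # None: re.match() anchors at the start
print(cookie.search(line).group(1))  # 'utf-8': re.search() scans the line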


[Python-Dev] PEP 484: updates to Python 2.7 signature syntax

2016-03-19 Thread Guido van Rossum
PEP 484 was already updated to support signatures as type comments in
Python 2.7. I'd like to add two more variations to this spec, both of
which have already come up through users.

First, https://github.com/python/typing/issues/188. This augments the
format of signature type comments to allow (...) instead of an
argument list. This is useful to avoid having to write (Any, Any, Any,
..., Any) for a long argument list if you don't care about the
argument types but do want to specify the return type. It's already
implemented by mypy (and presumably by PyCharm). Example:

def gcd(a, b):
# type: (...) -> int


Second, https://github.com/python/typing/issues/186. This builds on
the previous syntax but deals with the other annoyance of long
argument lists, this time in case you *do* care about the types. The
proposal is to allow writing the arguments one per line with a type
comment on each line. This has been implemented in PyCharm but not yet
in mypy. Example:

def gcd(
a,  # type: int
b,  # type: int
):
# type: (...) -> int


In both cases we've considered a few alternatives and ended up
agreeing on the best course forward. If you have questions or feedback
on either proposal it's probably best to just add a comment to the
GitHub tracker issues.

A clarification of the status of PEP 484: it was provisionally
accepted in May 2015. Having spent close to a year pondering it, and
the last several months actively using it at Dropbox, I'm now ready to
move forward with some improvements based on these experiences (and
those of others who have started to use it). We already added the basic
Python 2.7 compatible syntax (see the thread starting at
https://mail.python.org/pipermail/python-ideas/2016-January/037704.html),
and having used that for a few months, the two proposals mentioned
above handle a few corner cases that were possible but a bit awkward
in our experience.

-- 
--Guido van Rossum (python.org/~guido)


[Python-Dev] Make the warnings module extensible

2016-03-19 Thread Victor Stinner
Hi,

I have an API question for you.

I would like to add a new parameter to the showwarning() function of
the warnings module. Problem: it's not possible to do that without
breaking backward compatibility (an application can replace
warnings.showwarning(), and the warnings module allows and promotes that).

I proposed a patch to add a new showmsg() function which takes a
warnings.WarningMessage object:
https://bugs.python.org/issue26568

The design is inspired by the logging module and its logging.LogRecord
class. The warnings.WarningMessage class already exists. Since it's a
class, it's easy to add new attributes without breaking the API.

- If warnings.showwarning() is replaced by an application, this
function will be called in practice to log the warning.
- If warnings.showmsg() is replaced, again, this function will be
called in practice.
- If both functions are replaced, showmsg() will be called (the
replaced showwarning() is ignored).

I'm not sure about function names: showmsg() and formatmsg(). Maybe:
showwarnmsg() and formatwarnmsg()? Bikeshedding fight!
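
For illustration, a rough sketch of how an application might hook the
proposed function; the showwarnmsg() name and the WarningMessage
attributes used here follow the proposal and are assumptions, not a
final API:

import warnings

def my_showwarnmsg(msg):
    # msg would be a warnings.WarningMessage instance; new attributes
    # can be added to it later without changing this signature
    print("%s:%s: %s: %s" % (msg.filename, msg.lineno,
                             msg.category.__name__, msg.message))

warnings.showwarnmsg = my_showwarnmsg  # assumed replacement point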

The final goal is to log the traceback where the destroyed object was
allocated when a ResourceWarning warning is logged:
https://bugs.python.org/issue26567

Adding a new parameter to warnings makes the implementation much
simpler and gives the logger more freedom to decide how to format
the warning.

Victor


Re: [Python-Dev] bitfields - short - and xlc compiler

2016-03-19 Thread Andrew Barnert via Python-Dev
On Mar 17, 2016, at 18:35, MRAB  wrote:
> 
>> On 2016-03-18 00:56, Michael Felt wrote:
>> Update:
>> Is this going to be impossible?
> From what I've been able to find out, the C89 standard limits bitfields to 
> int, signed int and unsigned int, and the C99 standard added _Bool, although 
> some compilers allow other integer types too. It looks like your compiler 
> doesn't allow those additional types.

Yeah, C99 (6.7.2.1) allows "a qualified or unqualified version of _Bool, signed 
int, unsigned int, or some other implementation-defined type", and same for 
C11. This means that a compiler could easily allow an implementation-defined 
type that's identical to and interconvertible with short, say "i16", to be used 
in bitfields, but not short itself.

And yet, gcc still allows short "even in strictly conforming mode" (4.9), and 
it looks like Clang and Intel do the same. 

Meanwhile, MSVC specifically says it's illegal ("The type-specifier for the 
declarator must be unsigned int, signed int, or int") but then defines the 
semantics (you can't have a 17-bit short, bit fields act as the underlying type 
when accessed, alignment is forced to a boundary appropriate for the underlying 
type). They do mention that allowing char and long types is a Microsoft 
extension, but still nothing about short, even though it's used in most of the 
examples on the page.

Anyway, is the question what ctypes should do? If a platform's compiler allows 
"short M: 1", especially if it has potentially different alignment than "int M: 
1", ctypes on that platform had better make ("M", c_short, 1) match the former, 
right?

So it sounds like you need some configure switch to test that your compiler 
doesn't allow short bit fields, so your ctypes build at least skips that part 
of _ctypes_test.c and test_bitfields.py, and maybe even doesn't allow them in 
Python code.
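
For reference, a hedged sketch of the ctypes side in question; whether
the C compiler accepts the matching "short M: 1" declaration is exactly
the platform-dependent issue above:

from ctypes import Structure, c_int, c_short

class BITS(Structure):
    # a 1-bit field backed by short next to a 1-bit field backed by int;
    # the layout of the short-backed field is what may differ per compiler
    _fields_ = [("M", c_short, 1),
                ("A", c_int, 1)]

b = BITS()
b.M = 1
print(b.M)  # typically -1 where 1-bit short fields are signed (cf. below)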


>> test_short fails on AIX when using xlC in any case. How terrible is this?
>> 
>> ==
>> FAIL: test_shorts (ctypes.test.test_bitfields.C_Test)
>> --
>> Traceback (most recent call last):
>>File
>> "/data/prj/aixtools/python/python-2.7.11.2/Lib/ctypes/test/test_bitfields.py",
>> line 48, in test_shorts
>>  self.assertEqual((name, i, getattr(b, name)), (name, i,
>> func(byref(b), name)))
>> AssertionError: Tuples differ: ('M', 1, -1) != ('M', 1, 1)
>> 
>> First differing element 2:
>> -1
>> 1
>> 
>> - ('M', 1, -1)
>> ?  -
>> 
>> + ('M', 1, 1)
>> 
>> --
>> Ran 440 tests in 1.538s
>> 
>> FAILED (failures=1, skipped=91)
>> Traceback (most recent call last):
>>File "./Lib/test/test_ctypes.py", line 15, in 
>>  test_main()
>>File "./Lib/test/test_ctypes.py", line 12, in test_main
>>  run_unittest(unittest.TestSuite(suites))
>>File
>> "/data/prj/aixtools/python/python-2.7.11.2/Lib/test/test_support.py",
>> line 1428, in run_unittest
>>  _run_suite(suite)
>>File
>> "/data/prj/aixtools/python/python-2.7.11.2/Lib/test/test_support.py",
>> line 1411, in _run_suite
>>  raise TestFailed(err)
>> test.test_support.TestFailed: Traceback (most recent call last):
>>File
>> "/data/prj/aixtools/python/python-2.7.11.2/Lib/ctypes/test/test_bitfields.py",
>> line 48, in test_shorts
>>  self.assertEqual((name, i, getattr(b, name)), (name, i,
>> func(byref(b), name)))
>> AssertionError: Tuples differ: ('M', 1, -1) != ('M', 1, 1)
>> 
>> First differing element 2:
>> -1
>> 1
>> 
>> - ('M', 1, -1)
>> ?  -
>> 
>> + ('M', 1, 1)
>> 
>> 
>> 
>> 
>>> On 17-Mar-16 23:31, Michael Felt wrote:
>>> a) hope this is not something you expect to be on -list, if so - my
>>> apologies!
>>> 
>>> Getting this message (here using c99 as compiler name, but same issue
>>> with xlc as compiler name)
>>> c99 -qarch=pwr4 -qbitfields=signed -DNDEBUG -O -I. -IInclude
>>> -I./Include -I/data/prj/aixtools/python/python-2.7.11.2/Include
>>> -I/data/prj/aixtools/python/python-2.7.11.2 -c
>>> /data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c
>>> -o
>>> build/temp.aix-5.3-2.7/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.o
>>> "/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
>>> line 387.5: 1506-009 (S) Bit field M must be of type signed int,
>>> unsigned int or int.
>>> "/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
>>> line 387.5: 1506-009 (S) Bit field N must be of type signed int,
>>> unsigned int or int.
>>> "/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
>>> line 387.5: 1506-009 (S) Bit field O must be of type signed int,
>>> unsigned int or int.
>>> "/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
>>> line 387.5: 1506-009 (S) Bit field P must be of type signed int,
>>> 

Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Glenn Linderman

On 3/19/2016 2:37 PM, Serhiy Storchaka wrote:

On 19.03.16 19:36, Glenn Linderman wrote:

On 3/19/2016 8:19 AM, Serhiy Storchaka wrote:

On 16.03.16 08:03, Serhiy Storchaka wrote:
I just tested with Emacs, and it looks like when you specify different
codings on two different lines, the first coding wins, but when you
specify different codings on the same line, the last coding wins.

Therefore current CPython behavior can be correct, and the regular
expression in PEP 263 should be changed to use greedy repetition.


Just because emacs works that way (and even though I'm an emacs user),
that doesn't mean CPython should act like emacs.


Yes. But current CPython works that way. The behavior of Emacs is the 
argument that maybe this is not a bug.


If CPython properly handles the following line as having only one proper 
coding declaration (utf-8), then I might reluctantly agree that the 
behavior of Emacs might be a relevant argument.  Otherwise, vehemently 
not relevant.


  # -*- coding: utf-8 -*- this file does not use coding: latin-1





(4) there is no benefit to specifying the coding twice on a line, it
only adds confusion, whether in CPython, emacs, or vim.
(4a) Here's an untested line that emacs would interpret as utf-8, and
CPython with the greedy regular expression would interpret as latin-1,
because emacs looks only between the -*- pair, and CPython ignores that.
   # -*- coding: utf-8 -*- this file does not use coding: latin-1


Since Emacs allows specifying the coding twice on a line, and this can 
be ambiguous, and CPython already detects some ambiguous situations 
(UTF-8 BOM and non-UTF-8 coding cookie), it may be worth adding a 
check that the coding is specified only once on a line.


Diagnosing ambiguous conditions, even including my example above, might 
be useful... for a few files... is it worth the effort? What % of .py 
sources have coding specifications? What % of those have two?


Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Serhiy Storchaka

On 19.03.16 19:36, Glenn Linderman wrote:

On 3/19/2016 8:19 AM, Serhiy Storchaka wrote:

On 16.03.16 08:03, Serhiy Storchaka wrote:
I just tested with Emacs, and it looks like when you specify different
codings on two different lines, the first coding wins, but when you
specify different codings on the same line, the last coding wins.

Therefore current CPython behavior can be correct, and the regular
expression in PEP 263 should be changed to use greedy repetition.


Just because emacs works that way (and even though I'm an emacs user),
that doesn't mean CPython should act like emacs.


Yes. But current CPython works that way. The behavior of Emacs is the 
argument that maybe this is not a bug.



(4) there is no benefit to specifying the coding twice on a line, it
only adds confusion, whether in CPython, emacs, or vim.
(4a) Here's an untested line that emacs would interpret as utf-8, and
CPython with the greedy regular expression would interpret as latin-1,
because emacs looks only between the -*- pair, and CPython ignores that.
   # -*- coding: utf-8 -*- this file does not use coding: latin-1


Since Emacs allows specifying the coding twice on a line, and this can 
be ambiguous, and CPython already detects some ambiguous situations 
(UTF-8 BOM and non-UTF-8 coding cookie), it may be worth adding a check 
that the coding is specified only once on a line.
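
A minimal sketch of such a check (illustrative only, not the CPython
implementation):

import re

cookie = re.compile(r"coding[:=]\s*([-\w.]+)")
line = "# -*- coding: utf-8 -*- this file does not use coding: latin-1"
found = cookie.findall(line)
if len(found) > 1:
    print("ambiguous coding declarations:", found)  # ['utf-8', 'latin-1']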





[Python-Dev] Summary of Python tracker Issues

2016-03-19 Thread Python tracker

ACTIVITY SUMMARY (2016-03-11 - 2016-03-18)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open5459 ( +5)
  closed 32885 (+43)
  total  38344 (+48)

Open issues with patches: 2375 


Issues opened (36)
==

#15660: Clarify 0 prefix for width specifier in str.format doc,
http://bugs.python.org/issue15660  reopened by terry.reedy

#22758: Regression in Python 3.2 cookie parsing
http://bugs.python.org/issue22758  reopened by berker.peksag

#25934: ICC compiler: ICC treats denormal floating point numbers as 0.
http://bugs.python.org/issue25934  reopened by zach.ware

#26270: Support for read()/write()/select() on asyncio
http://bugs.python.org/issue26270  reopened by gvanrossum

#26481: unittest discovery process not working without .py source file
http://bugs.python.org/issue26481  reopened by rbcollins

#26541: Add stop_after parameter to setup()
http://bugs.python.org/issue26541  opened by memeplex

#26543: imaplib noop Debug
http://bugs.python.org/issue26543  opened by Stephen.Evans

#26544: platform.libc_ver() returns incorrect version number
http://bugs.python.org/issue26544  opened by Thomas.Waldmann

#26545: os.walk is limited by python's recursion limit
http://bugs.python.org/issue26545  opened by Thomas.Waldmann

#26546: Provide translated french translation on docs.python.org
http://bugs.python.org/issue26546  opened by sizeof

#26547: Undocumented use of the term dictproxy in vars() documentation
http://bugs.python.org/issue26547  opened by sizeof

#26549: co_stacksize is calculated from unoptimized code
http://bugs.python.org/issue26549  opened by ztane

#26550: documentation minor issue : "Step back: WSGI" section from "HO
http://bugs.python.org/issue26550  opened by Alejandro Soini

#26552: Failing ensure_future still creates a Task
http://bugs.python.org/issue26552  opened by gordon

#26553: Write HTTP in uppercase
http://bugs.python.org/issue26553  opened by Sudheer Satyanarayana

#26554: PC\bdist_wininst\install.c: Missing call to fclose()
http://bugs.python.org/issue26554  opened by maddin200

#26556: Update expat to 2.2.1
http://bugs.python.org/issue26556  opened by christian.heimes

#26557: dictviews methods not present on shelve objects
http://bugs.python.org/issue26557  opened by Michael Crouch

#26559: logging.handlers.MemoryHandler flushes on shutdown but not rem
http://bugs.python.org/issue26559  opened by David Escott

#26560: Error in assertion in wsgiref.handlers.BaseHandler.start_respo
http://bugs.python.org/issue26560  opened by inglesp

#26565: [ctypes] Add value attribute to non basic pointers.
http://bugs.python.org/issue26565  opened by memeplex

#26566: Failures on FreeBSD CURRENT buildbot
http://bugs.python.org/issue26566  opened by haypo

#26567: ResourceWarning: Use tracemalloc to display the traceback wher
http://bugs.python.org/issue26567  opened by haypo

#26568: Add a new warnings.showmsg() function taking a warnings.Warnin
http://bugs.python.org/issue26568  opened by haypo

#26571: turtle regression in 3.5
http://bugs.python.org/issue26571  opened by Ellison Marks

#26574: replace_interleave can be optimized for single character byte 
http://bugs.python.org/issue26574  opened by Josh Snider

#26576: Tweak wording of decorator docos
http://bugs.python.org/issue26576  opened by Rosuav

#26577: inspect.getclosurevars returns incorrect variable when using c
http://bugs.python.org/issue26577  opened by Ryan Fox

#26578: Bad BaseHTTPRequestHandler response when using HTTP/0.9
http://bugs.python.org/issue26578  opened by xiang.zhang

#26579: Support pickling slots in subclasses of common classes
http://bugs.python.org/issue26579  opened by serhiy.storchaka

#26581: Double coding cookie
http://bugs.python.org/issue26581  opened by serhiy.storchaka

#26582: asyncio documentation links to wrong CancelledError
http://bugs.python.org/issue26582  opened by awilfox

#26584: pyclbr module needs to be more flexible on loader support
http://bugs.python.org/issue26584  opened by eric.snow

#26585: Use html.escape to replace _quote_html in http.server
http://bugs.python.org/issue26585  opened by xiang.zhang

#26586: Simple enhancement to BaseHTTPRequestHandler
http://bugs.python.org/issue26586  opened by xiang.zhang

#26587: Possible duplicate entries in sys.path if .pth files are used 
http://bugs.python.org/issue26587  opened by tds333



Most recent 15 issues with no replies (15)
==

#26584: pyclbr module needs to be more flexible on loader support
http://bugs.python.org/issue26584

#26582: asyncio documentation links to wrong CancelledError
http://bugs.python.org/issue26582

#26581: Double coding cookie
http://bugs.python.org/issue26581

#26579: Support pickling slots in subclasses of common classes
http://bugs.python.org/issue26579

#26577: inspect.getclosurevars returns incorrect variable 

Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread M.-A. Lemburg
On 17.03.2016 18:53, Serhiy Storchaka wrote:
> On 17.03.16 19:23, M.-A. Lemburg wrote:
>> On 17.03.2016 15:02, Serhiy Storchaka wrote:
>>> On 17.03.16 15:14, M.-A. Lemburg wrote:
 On 17.03.2016 01:29, Guido van Rossum wrote:
> Should we recommend that everyone use tokenize.detect_encoding()?

 I'd prefer a separate utility for this somewhere, since
 tokenize.detect_encoding() is not available in Python 2.

 I've attached an example implementation with tests, which works
 in Python 2.7 and 3.
>>>
>>> Sorry, but this code doesn't match the behaviour of the Python interpreter,
>>> nor of other tools. I suggest backporting tokenize.detect_encoding() (but
>>> be aware that the default encoding in Python 2 is ASCII, not UTF-8).
>>
>> Yes, I got the default for Python 3 wrong. I'll fix that. Thanks
>> for the note.
>>
>> What other aspects are different than what Python implements ?
> 
> 1. If there is a BOM and coding cookie, the source encoding is "utf-8-sig".

Ok, that makes sense (even though it's not mandated by the PEP;
the utf-8-sig codec didn't exist yet).

> 2. If there is a BOM and coding cookie is not 'utf-8', this is an error.

It's an error for Python, but why should a detection function
always raise an error for this case? It would probably be a good
idea to have an errors parameter to leave this to the user to decide.

Same for unknown encodings.

> 3. If the first line is not blank or comment line, the coding cookie is
> not searched in the second line.

Hmm, the PEP does allow having the coding cookie on the
second line, even if the first line is not a comment. Perhaps
that's not really needed.

> 4. The encoding name should be canonicalized. "UTF8", "utf8", "utf_8" and
> "utf-8" are the same encoding (and all are changed to "utf-8-sig" with a BOM).

Well, that's cosmetics :-) The codec system will take care of
this when needed.

> 5. There isn't a limit of 400 bytes. Actually there is a bug with
> handling long lines in the current code, but even with this bug the
> limit is larger.

I think it's a reasonable limit, since shebang lines may only be
127 bytes long on at least Linux (and probably several other Unix
systems as well).

But just in case, I made this configurable :-)

> 6. I made a mistake in the regular expression, missed the underscore.

I added it.

> tokenize.detect_encoding() is the closest imitation of the behavior of
> the Python interpreter.

Probably, but that doesn't help us on Python 2, right?

I'll upload the script to github later today or tomorrow to
continue development.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Mar 17 2016)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...   http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...   http://zope.egenix.com/

2016-03-07: Released eGenix pyOpenSSL 0.13.14 ... http://egenix.com/go89
2016-02-19: Released eGenix PyRun 2.1.2 ...   http://egenix.com/go88

::: We implement business ideas - efficiently in both time and costs :::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/
  http://www.malemburg.com/



Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-03-19 Thread Guido van Rossum
All that sounds fine!

On Sat, Mar 19, 2016 at 11:28 AM, Stefan Krah  wrote:
> Guido van Rossum  python.org> writes:
>> So should the preprocessing step just be s.replace('_', ''), or should
>> it reject underscores that don't follow the rules from the PEP
>> (perhaps augmented so they follow the spirit of the PEP and the letter
>> of the IBM spec)?
>>
>> Honestly I think it's also fine if specifying this exactly is left out
>> of the PEP, and handled by whoever adds this to Decimal. Having a PEP
>> to work from for the language spec and core builtins (int(), float()
>> complex()) is more important.
>
> I'd keep it simple for Decimal: Remove left and right whitespace (we're
> already doing this), then remove underscores from the remaining string
> (which must not contain any further whitespace), then use the IBM grammar.
>
>
> We could add a clause to the PEP that only those strings that follow
> the spirit of the PEP are guaranteed to be accepted in the future.
>
>
> One reason for keeping it simple is that I would not like to slow down
> string conversion, but thinking about two grammars is also a problem --
> part of the string conversion in libmpdec is modeled in ACL2, which
> would be invalidated or at least complicated with two grammars.
>
>
>
> Stefan Krah
>



-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-03-19 Thread Stefan Krah
Guido van Rossum  python.org> writes:
> So should the preprocessing step just be s.replace('_', ''), or should
> it reject underscores that don't follow the rules from the PEP
> (perhaps augmented so they follow the spirit of the PEP and the letter
> of the IBM spec)?
> 
> Honestly I think it's also fine if specifying this exactly is left out
> of the PEP, and handled by whoever adds this to Decimal. Having a PEP
> to work from for the language spec and core builtins (int(), float(),
> complex()) is more important.

I'd keep it simple for Decimal: Remove left and right whitespace (we're
already doing this), then remove underscores from the remaining string
(which must not contain any further whitespace), then use the IBM grammar.
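
As an illustration only (not the libmpdec implementation), the
preprocessing step described above could look like:

def preprocess(s):
    # strip leading/trailing whitespace, reject embedded whitespace,
    # then drop underscores before handing the string to the IBM grammar
    s = s.strip()
    if any(c.isspace() for c in s):
        raise ValueError("embedded whitespace not allowed")
    return s.replace("_", "")

print(preprocess("  1_000_000.123_456  "))  # '1000000.123456'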


We could add a clause to the PEP that only those strings that follow
the spirit of the PEP are guaranteed to be accepted in the future.


One reason for keeping it simple is that I would not like to slow down
string conversion, but thinking about two grammars is also a problem --
part of the string conversion in libmpdec is modeled in ACL2, which
would be invalidated or at least complicated with two grammars.



Stefan Krah



Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Stephen J. Turnbull
Glenn Linderman writes:
 > On 3/19/2016 8:19 AM, Serhiy Storchaka wrote:

 > > Therefore current CPython behavior can be correct, and the regular 
 > > expression in PEP 263 should be changed to use greedy repetition.
 > 
 > Just because emacs works that way (and even though I'm an emacs user), 
 > that doesn't mean CPython should act like emacs.
 > 
 > (1) CPython should not necessarily act like emacs,

We can't treat Emacs as a spec, because Emacs doesn't follow specs,
doesn't respect standards, and above a certain level of inconvenience
to developers doesn't respect backward compatibility.  There's never
any guarantee that Emacs will do the same thing tomorrow that it does
today, although inertia has mostly the same effect.

In this case, there's a reason why Emacs behaves the way it does,
which is that you can put an arbitrary sequence of variable
assignments in "-*- ... -*-" and they will be executed in order.  So
it makes sense that "last coding wins".  But pragmas are severely
deprecated in Python; cookies got a very special exception.  So that
rationale can't apply to Python.

 > (4) there is no benefit to specifying the coding twice on a line, it 
 > only adds confusion, whether in CPython, emacs, or vim.

Indeed.  I see no point in reading past the first cookie found
(whether a valid codec or not), unless an error would be raised.  That
might be a good idea, but I doubt it's worth the implementation
complexity.


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-03-19 Thread Guido van Rossum
So should the preprocessing step just be s.replace('_', ''), or should
it reject underscores that don't follow the rules from the PEP
(perhaps augmented so they follow the spirit of the PEP and the letter
of the IBM spec)?

Honestly I think it's also fine if specifying this exactly is left out
of the PEP, and handled by whoever adds this to Decimal. Having a PEP
to work from for the language spec and core builtins (int(), float(),
complex()) is more important.

On Sat, Mar 19, 2016 at 10:24 AM, Stefan Krah  wrote:
>
> Guido van Rossum  python.org> writes:
>> I don't care too much either way, but I think passing underscores to the
>> constructor shouldn't be affected by the context -- the underscores are just
>> removed before parsing the number. But if it's too complicated to implement
>> I'm fine with punting.
>
> Just removing the underscores would be fine. The problem is that per
> the PEP the conversion should happen according to the Python float grammar
> but the actual decimal grammar is the one from the IBM specification.
>
> I'd much rather express the problem like you did above: A preprocessing
> step followed by the IBM specification grammar.
>
>
>
> Stefan Krah
>
>
>
>
>
>



-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Glenn Linderman

On 3/19/2016 8:19 AM, Serhiy Storchaka wrote:

On 16.03.16 08:03, Serhiy Storchaka wrote:

On 15.03.16 22:30, Guido van Rossum wrote:

I came across a file that had two different coding cookies -- one on
the first line and one on the second. CPython uses the first, but mypy
happens to use the second. I couldn't find anything in the spec or
docs ruling out the second interpretation. Does anyone have a
suggestion (apart from following CPython)?

Reference: https://github.com/python/mypy/issues/1281


There is similar question. If a file has two different coding cookies on
the same line, what should win? Currently the last cookie wins, in
CPython parser, in the tokenize module, in IDLE, and in number of other
code. I think this is a bug.


I just tested with Emacs, and it looks like when you specify different 
codings on two different lines, the first coding wins, but when you 
specify different codings on the same line, the last coding wins.


Therefore current CPython behavior can be correct, and the regular 
expression in PEP 263 should be changed to use greedy repetition.


Just because emacs works that way (and even though I'm an emacs user), 
that doesn't mean CPython should act like emacs.


(1) CPython should not necessarily act like emacs, unless the coding 
syntax exactly matches the emacs form, rather than the generic coding 
declaration that CPython interprets, which matches emacs, vim, and other 
similar forms that both emacs and vim would ignore.
(1a) Maybe if a similar test were run on vim with its syntax, and it 
also works the same way, then one might think it is a trend worth 
following, but it is not clear to this non-vim user that vim syntax 
allows more than one coding specification per line.


(2) emacs has no requirement that the coding be placed on the first two 
lines. It specifically looks at the second line only if the first line 
has a "#!" or a "'\"" (for troff) (according to docs, not 
experimentation).


(3) emacs also allows for Local Variables to be specified at the end of 
the file.  If CPython were really to act like emacs, then it would need 
to allow for that too.


(4) there is no benefit to specifying the coding twice on a line, it 
only adds confusion, whether in CPython, emacs, or vim.
(4a) Here's an untested line that emacs would interpret as utf-8, and 
CPython with the greedy regular expression would interpret as latin-1, 
because emacs looks only between the -*- pair, and CPython ignores that.

  # -*- coding: utf-8 -*- this file does not use coding: latin-1


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-03-19 Thread Stefan Krah

Guido van Rossum  python.org> writes:
> I don't care too much either way, but I think passing underscores to the
> constructor shouldn't be affected by the context -- the underscores are just
> removed before parsing the number. But if it's too complicated to implement
> I'm fine with punting.

Just removing the underscores would be fine. The problem is that per
the PEP the conversion should happen according to the Python float grammar 
but the actual decimal grammar is the one from the IBM specification.

I'd much rather express the problem like you did above: A preprocessing
step followed by the IBM specification grammar.



Stefan Krah








Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-03-19 Thread Guido van Rossum
I don't care too much either way, but I think passing underscores to the
constructor shouldn't be affected by the context -- the underscores are
just removed before parsing the number. But if it's too complicated to
implement I'm fine with punting.

--Guido (mobile)
On Mar 19, 2016 6:24 AM, "Nick Coghlan"  wrote:

> On 19 March 2016 at 16:44, Georg Brandl  wrote:
> > On the other hand, assuming decimal literals are introduced at some
> > point, they would almost definitely need to support underscores.
> > Of course, the decision whether to modify the Decimal constructor
> > can be postponed until that time.
>
> The idea of Decimal literals is complicated significantly by their
> current context dependent behaviour (especially when it comes to
> rounding), so I'd suggest leaving them alone in the context of this
> PEP.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread M.-A. Lemburg
On 17.03.2016 01:29, Guido van Rossum wrote:
> I've updated the PEP. Please review. I decided not to update the
> Unicode howto (the thing is too obscure). Serhiy, you're probably in a
> better position to fix the code looking for cookies to pick the first
> one if there are two on the same line (or do whatever you think should
> be done there).

Thanks, will do.

> Should we recommend that everyone use tokenize.detect_encoding()?

I'd prefer a separate utility for this somewhere, since
tokenize.detect_encoding() is not available in Python 2.

I've attached an example implementation with tests, which works
in Python 2.7 and 3.

> On Wed, Mar 16, 2016 at 5:05 PM, Guido van Rossum  wrote:
>> On Wed, Mar 16, 2016 at 12:59 AM, M.-A. Lemburg  wrote:
>>> The only reason to read up to two lines was to address the use of
>>> the shebang on Unix, not to be able to define two competing
>>> source code encodings :-)
>>
>> I know. I was just surprised that the PEP was sufficiently vague about
>> it that when I found that mypy picked the second if there were two, I
>> couldn't prove to myself that it was violating the PEP. I'd rather
>> clarify the PEP than rely on the reasoning presented earlier here.

I suppose it's a rather rare case, since it's the first time
that I heard about anyone thinking that a possible second line
could be picked - after 15 years :-)

>> I don't like erroring out when there are two different cookies on two
>> lines; I feel that the spirit of the PEP is to read up to two lines
>> until a cookie is found, whichever comes first.
>>
>> I will update the regex in the PEP too (or change the wording to avoid 
>> "match").
>>
>> I'm not sure what to do if there are two cookies on one line. If
>> CPython currently picks the latter we may want to preserve that
>> behavior.
>>
>> Should we recommend that everyone use tokenize.detect_encoding()?
>>
>> --
>> --Guido van Rossum (python.org/~guido)
> 
> 
> 

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Mar 17 2016)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...   http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...   http://zope.egenix.com/

2016-03-07: Released eGenix pyOpenSSL 0.13.14 ... http://egenix.com/go89
2016-02-19: Released eGenix PyRun 2.1.2 ...   http://egenix.com/go88

::: We implement business ideas - efficiently in both time and costs :::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/
  http://www.malemburg.com/

#!/usr/bin/python
"""
Utility to detect the source code encoding of a Python file.

Marc-Andre Lemburg, 2016.

Supports Python 2.7 and 3.

"""
import sys
import re
import codecs

# Debug output ?
_debug = True

# PEP 263 RE
PEP263 = re.compile(b'^[ \t]*#.*?coding[:=][ \t]*([-.a-zA-Z0-9]+)',
                    re.MULTILINE)

###

def detect_source_encoding(code, buffer_size=400):

    """ Detect and return the source code encoding of the Python code
        given in code.

        code must be given as bytes.

        The function uses a buffer to determine the first two code lines
        with a default size of 400 bytes/code points.  This can be adjusted
        using the buffer_size parameter.

    """
    # Get the first two lines
    first_two_lines = b'\n'.join(code[:buffer_size].splitlines()[:2])
    # BOMs override any source code encoding comments
    if first_two_lines.startswith(codecs.BOM):
        return 'utf-8'
    # .search() picks the first occurrence
    m = PEP263.search(first_two_lines)
    if m is None:
        return 'ascii'
    return m.group(1).decode('ascii')

# Tests

def _test():

    l = (
  (b"""\
# No encoding
""", 'ascii'),
  (b"""\
# coding: latin-1
""", 'latin-1'),
  (b"""\
#!/usr/bin/python
# coding: utf-8
""", 'utf-8'),
  (b"""\
coding=123
# The above could be detected as source code encoding
""", 'ascii'),
  (b"""\
# coding: latin-1
# coding: utf-8
""", 'latin-1'),
  (b"""\
# No encoding on first line
# No encoding on second line
# coding: utf-8
""", 'ascii'),
  (codecs.BOM + b"""\
# No encoding
""", 'utf-8'),
  (codecs.BOM + b"""\
# BOM and encoding
# coding: latin-1
""", 'utf-8'),
    )
    for code, encoding in l:
        if _debug:
            print ('=' * 72)
            print ('Checking:')
            print ('-' * 72)
            print (code.decode('latin-1'))
            print ('-' * 72)
        detected_encoding = detect_source_encoding(code)
        if _debug:
            print ('detected: %s, expected: %s' %
                   (detected_encoding, encoding))
        assert detected_encoding == encoding

Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Serhiy Storchaka

On 16.03.16 08:03, Serhiy Storchaka wrote:

On 15.03.16 22:30, Guido van Rossum wrote:

I came across a file that had two different coding cookies -- one on
the first line and one on the second. CPython uses the first, but mypy
happens to use the second. I couldn't find anything in the spec or
docs ruling out the second interpretation. Does anyone have a
suggestion (apart from following CPython)?

Reference: https://github.com/python/mypy/issues/1281


There is similar question. If a file has two different coding cookies on
the same line, what should win? Currently the last cookie wins, in
CPython parser, in the tokenize module, in IDLE, and in number of other
code. I think this is a bug.


I just tested with Emacs, and it looks like when you specify different 
codings on two different lines, the first coding wins, but when you 
specify different codings on the same line, the last coding wins.


Therefore current CPython behavior can be correct, and the regular 
expression in PEP 263 should be changed to use greedy repetition.





Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread M.-A. Lemburg
On 17.03.2016 15:55, Guido van Rossum wrote:
> On Thu, Mar 17, 2016 at 5:04 AM, Serhiy Storchaka  wrote:
>>> Should we recommend that everyone use tokenize.detect_encoding()?
>>
>> Likely. However the interface of tokenize.detect_encoding() is not very
>> simple.
> 
> I just found that out yesterday. You have to give it a readline()
> function, which is cumbersome if all you have is a (byte) string and
> you don't want to split it on lines just yet. And the readline()
> function raises SyntaxError when the encoding isn't right. I wish
> there were a lower-level helper that just took a line and told you
> what the encoding in it was, if any. Then the rest of the logic can be
> handled by the caller (including the logic of trying up to two lines).
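
For reference, a small sketch of feeding a byte string to
tokenize.detect_encoding() through io.BytesIO on Python 3; the
SyntaxError caveat above still applies for bad cookies:

import io
import tokenize

source = b"# -*- coding: latin-1 -*-\nx = 1\n"
encoding, lines = tokenize.detect_encoding(io.BytesIO(source).readline)
print(encoding)  # 'iso-8859-1' (normalized name)
print(lines)     # the raw line(s) consumed while detecting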

I've uploaded the code I posted yesterday, modified to address
some of the issues it had to github:

https://github.com/malemburg/python-snippets/blob/master/detect_source_encoding.py

I'm pretty sure the two-lines read can be optimized away and
put straight into the regular expression used for matching.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Mar 18 2016)
>>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
>>> Python Database Interfaces ...   http://products.egenix.com/
>>> Plone/Zope Database Interfaces ...   http://zope.egenix.com/

2016-03-07: Released eGenix pyOpenSSL 0.13.14 ... http://egenix.com/go89
2016-02-19: Released eGenix PyRun 2.1.2 ...   http://egenix.com/go88

::: We implement business ideas - efficiently in both time and costs :::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/
  http://www.malemburg.com/



Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Serhiy Storchaka

On 17.03.16 21:11, Guido van Rossum wrote:

I tried this and it was too painful, so now I've just
changed the regex that mypy uses to use non-eager matching
(https://github.com/python/mypy/commit/b291998a46d580df412ed28af1ba1658446b9fe5).


\s* matches newlines.

{0,1}? is the same as ??.
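
A small illustration of both remarks:

import re

# \s* also consumes newlines:
print(re.match(r"#\s*coding", "#\n\ncoding"))  # matches across lines

# {0,1}? is just the lazy spelling of ?, i.e. the same as ??:
print(re.match(r"ab{0,1}?", "ab").group())     # 'a'
print(re.match(r"ab??", "ab").group())         # 'a'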




Re: [Python-Dev] bitfields - short - and xlc compiler

2016-03-19 Thread Michael Felt

Update:
Is this going to be impossible?

test_short fails on AIX when using xlC in any case. How terrible is this?

==
FAIL: test_shorts (ctypes.test.test_bitfields.C_Test)
--
Traceback (most recent call last):
  File 
"/data/prj/aixtools/python/python-2.7.11.2/Lib/ctypes/test/test_bitfields.py", 
line 48, in test_shorts
self.assertEqual((name, i, getattr(b, name)), (name, i, 
func(byref(b), name)))

AssertionError: Tuples differ: ('M', 1, -1) != ('M', 1, 1)

First differing element 2:
-1
1

- ('M', 1, -1)
?  -

+ ('M', 1, 1)

--
Ran 440 tests in 1.538s

FAILED (failures=1, skipped=91)
Traceback (most recent call last):
  File "./Lib/test/test_ctypes.py", line 15, in 
test_main()
  File "./Lib/test/test_ctypes.py", line 12, in test_main
run_unittest(unittest.TestSuite(suites))
  File 
"/data/prj/aixtools/python/python-2.7.11.2/Lib/test/test_support.py", 
line 1428, in run_unittest

_run_suite(suite)
  File 
"/data/prj/aixtools/python/python-2.7.11.2/Lib/test/test_support.py", 
line 1411, in _run_suite

raise TestFailed(err)
test.test_support.TestFailed: Traceback (most recent call last):
  File 
"/data/prj/aixtools/python/python-2.7.11.2/Lib/ctypes/test/test_bitfields.py", 
line 48, in test_shorts
self.assertEqual((name, i, getattr(b, name)), (name, i, 
func(byref(b), name)))

AssertionError: Tuples differ: ('M', 1, -1) != ('M', 1, 1)

First differing element 2:
-1
1

- ('M', 1, -1)
?  -

+ ('M', 1, 1)




On 17-Mar-16 23:31, Michael Felt wrote:
a) hope this is not something you expect to be on -list, if so - my 
apologies!


Getting this message (here using c99 as compiler name, but same issue 
with xlc as compiler name)
c99 -qarch=pwr4 -qbitfields=signed -DNDEBUG -O -I. -IInclude 
-I./Include -I/data/prj/aixtools/python/python-2.7.11.2/Include 
-I/data/prj/aixtools/python/python-2.7.11.2 -c 
/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c 
-o 
build/temp.aix-5.3-2.7/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.o
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c", 
line 387.5: 1506-009 (S) Bit field M must be of type signed int, 
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c", 
line 387.5: 1506-009 (S) Bit field N must be of type signed int, 
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c", 
line 387.5: 1506-009 (S) Bit field O must be of type signed int, 
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c", 
line 387.5: 1506-009 (S) Bit field P must be of type signed int, 
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c", 
line 387.5: 1506-009 (S) Bit field Q must be of type signed int, 
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c", 
line 387.5: 1506-009 (S) Bit field R must be of type signed int, 
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c", 
line 387.5: 1506-009 (S) Bit field S must be of type signed int, 
unsigned int or int.


for:

struct BITS {
int A: 1, B:2, C:3, D:4, E: 5, F: 6, G: 7, H: 8, I: 9;
short M: 1, N: 2, O: 3, P: 4, Q: 5, R: 6, S: 7;
};

In short, xlC v11 does not like short (xlC v7 might have accepted it, 
but "32-bit machines were common then"). I am guessing that 16-bit is 
not well liked on 64-bit hardware now.


reference for xlC v7, where short was (apparently) still accepted: 
http://www.serc.iisc.ernet.in/facilities/ComputingFacilities/systems/cluster/vac-7.0/html/language/ref/clrc03defbitf.htm 



I am taking this from the xlC v7 documentation at the URL above, not 
because I know it personally.


So - my question: if "short" is unacceptable for POWER, or maybe only 
for xlC (not tried with gcc) - how terrible is this, and is it possible to 
adjust the test so that the test is accurate?


I am going to modify the test code so it is
struct BITS {
   signed  int A: 1, B:2, C:3, D:4, E: 5, F: 6, G: 7, H: 8, I: 9;
   unsigned int M: 1, N: 2, O: 3, P: 4, Q: 5, R: 6, S: 7;
};

And see what happens - BUT - what impact does this have on Python, 
assuming that "short" bitfields are not supported?
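
For context, the ctypes side of that test declares roughly the following kind 
of structure (a sketch, not the exact field list from 
Lib/ctypes/test/test_bitfields.py), which is where "short" bitfields show up 
from Python:

    from ctypes import Structure, c_int, c_short

    # Sketch of the kind of structure the test exercises; the real field
    # list in Lib/ctypes/test/test_bitfields.py is longer.
    class BITS(Structure):
        _fields_ = [("A", c_int, 1), ("B", c_int, 2), ("C", c_int, 3),
                    ("M", c_short, 1), ("N", c_short, 2), ("O", c_short, 3)]

    b = BITS()
    b.M = 1
    print(b.M)   # -1 on the ctypes side (signed 1-bit field); the AIX
                 # failure above shows the C accessor returning 1 instead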


p.s. not submitting this as a bug (now), as it may just be that "you" 
consider it a bug in xlC to not support (signed) short bit fields.


p.p.s. Note: xlc, by default, considers bitfields to be unsigned. I 
was trying to force them to signed with -qbitfields=signed - and I 
still got messages. So, going back to defaults.


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 

[Python-Dev] GSoC: looking for a student to help on FAT Python

2016-03-19 Thread Victor Stinner
Hi,

I am now looking for a Google Summer of Code (GSoC) student to help me
with my FAT Python project, a new static optimizer for CPython 3.6 using
specialization with guards.

The FAT Python project is already fully functional, the code is
written and tested. I need help to implement new efficient
optimizations to "finish" the project and prove that my design allows
to really run applications faster.

FAT Python project:
https://faster-cpython.readthedocs.org/fat_python.html

fatoptimizer module:
https://fatoptimizer.readthedocs.org/

Slides of my talk at FOSDEM:
https://github.com/haypo/conf/raw/master/2016-FOSDEM/fat_python.pdf

The "fatoptimizer" optimizer is written in pure Python. I'm looking
for a student who knows compilers, especially static optimizations like
loop unrolling and function inlining.

For concrete tasks, take a look at the TODO list:
https://fatoptimizer.readthedocs.org/en/latest/todo.html

Hurry up students! The deadline is in 1 week! (Sorry, I'm late for my
project...)

--

PSF GSoC, Python core projects:
https://wiki.python.org/moin/SummerOfCode/2016/python-core

All PSF GSoC projects:
https://wiki.python.org/moin/SummerOfCode/2016

GSOC:
https://developers.google.com/open-source/gsoc/

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] bitfields - short - and xlc compiler

2016-03-19 Thread Michael Felt
a) hope this is not something you expect to be on -list, if so - my 
apologies!


Getting this message (here using c99 as compiler name, but same issue 
with xlc as compiler name)
c99 -qarch=pwr4 -qbitfields=signed -DNDEBUG -O -I. -IInclude -I./Include 
-I/data/prj/aixtools/python/python-2.7.11.2/Include 
-I/data/prj/aixtools/python/python-2.7.11.2 -c 
/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c 
-o 
build/temp.aix-5.3-2.7/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.o
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c", 
line 387.5: 1506-009 (S) Bit field M must be of type signed int, 
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c", 
line 387.5: 1506-009 (S) Bit field N must be of type signed int, 
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c", 
line 387.5: 1506-009 (S) Bit field O must be of type signed int, 
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c", 
line 387.5: 1506-009 (S) Bit field P must be of type signed int, 
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c", 
line 387.5: 1506-009 (S) Bit field Q must be of type signed int, 
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c", 
line 387.5: 1506-009 (S) Bit field R must be of type signed int, 
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c", 
line 387.5: 1506-009 (S) Bit field S must be of type signed int, 
unsigned int or int.


for:

struct BITS {
int A: 1, B:2, C:3, D:4, E: 5, F: 6, G: 7, H: 8, I: 9;
short M: 1, N: 2, O: 3, P: 4, Q: 5, R: 6, S: 7;
};

In short, xlC v11 does not like short (xlC v7 might have accepted it, but 
"32-bit machines were common then"). I am guessing that 16-bit is not 
well liked on 64-bit hardware now.


reference for xlC v7, where short was (apparently) still accepted: 
http://www.serc.iisc.ernet.in/facilities/ComputingFacilities/systems/cluster/vac-7.0/html/language/ref/clrc03defbitf.htm


I am taking this from the xlC v7 documentation at the URL above, not because 
I know it personally.


So - my question: if "short" is unacceptable for POWER, or maybe only 
for xlC (not tried with gcc) - how terrible is this, and is it possible to 
adjust the test so that the test is accurate?


I am going to modify the test code so it is
struct BITS {
   signed  int A: 1, B:2, C:3, D:4, E: 5, F: 6, G: 7, H: 8, I: 9;
   unsigned int M: 1, N: 2, O: 3, P: 4, Q: 5, R: 6, S: 7;
};

And see what happens - BUT - what impact does this have on Python, 
assuming that "short" bitfields are not supported?


p.s. not submitting this as a bug (now), as it may just be that "you" 
consider it a bug in xlC to not support (signed) short bit fields.


p.p.s. Note: xlc, by default, considers bitfields to be unsigned. I was 
trying to force them to signed with -qbitfields=signed - and I still got 
messages. So, going back to defaults.


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Terry Reedy

On 3/16/2016 3:14 AM, Serhiy Storchaka wrote:

On 16.03.16 02:28, Guido van Rossum wrote:

I agree that the spirit of the PEP is to stop at the first coding
cookie found. Would it be okay if I updated the PEP to clarify this?
I'll definitely also update the docs.


Could you please also update the regular expression in PEP 263 to
"^[ \t\v]*#.*?coding[:=][ \t]*([-.a-zA-Z0-9]+)"?

The coding cookie must be in a comment, only the first occurrence on the line
must be taken into account (there is a bug in CPython here), the encoding name
must be ASCII, and there must not be any Python statement on the line that
contains the encoding declaration. [1]

[1] https://bugs.python.org/issue18873


Also, I think there should be one 'official' function somewhere in the 
stdlib to get and return the encoding declaration. The patch for the 
issue above had to make the same change in four places other than tests, 
a violent violation of DRY.
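
For illustration, a minimal sketch of what such a helper could look like, 
built around the regex proposed above (the name get_declared_encoding is made 
up, and it deliberately skips BOM handling and the other interpreter rules 
discussed elsewhere in this thread):

    import re

    # The regex proposed above, with the underscore added to the name class
    # (see the correction later in this thread).  Helper name is made up.
    COOKIE_RE = re.compile(r'^[ \t\v]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)')

    def get_declared_encoding(source_bytes):
        """Return the encoding declared in the first two lines, or None.

        Simplified: no BOM handling, no name normalization, and no check
        that the second line only counts when the first is blank or a comment.
        """
        for line in source_bytes.splitlines()[:2]:
            match = COOKIE_RE.match(line.decode('latin-1'))
            if match:
                return match.group(1)
        return None

    print(get_declared_encoding(b'#!/usr/bin/env python\n# -*- coding: latin-1 -*-\n'))
    # -> 'latin-1'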


--
Terry Jan Reedy

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-03-19 Thread Nick Coghlan
On 19 March 2016 at 16:44, Georg Brandl  wrote:
> On the other hand, assuming decimal literals are introduced at some
> point, they would almost definitely need to support underscores.
> Of course, the decision whether to modify the Decimal constructor
> can be postponed until that time.

The idea of Decimal literals is complicated significantly by their
current context dependent behaviour (especially when it comes to
rounding), so I'd suggest leaving them alone in the context of this
PEP.
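
For example (just an illustration, not text from the PEP): the constructor is 
exact while arithmetic rounds to the current context, and a Decimal literal 
would have to pick one of those behaviours.

    from decimal import Decimal, getcontext

    getcontext().prec = 4
    x = Decimal('1.23456')   # the constructor is exact: no rounding happens here
    print(x)                 # 1.23456
    print(x * 1)             # 1.235 -- arithmetic rounds to the current context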

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Guido van Rossum
On Thu, Mar 17, 2016 at 5:04 AM, Serhiy Storchaka  wrote:
>> Should we recommend that everyone use tokenize.detect_encoding()?
>
> Likely. However the interface of tokenize.detect_encoding() is not very
> simple.

I just found that out yesterday. You have to give it a readline()
function, which is cumbersome if all you have is a (byte) string and
you don't want to split it on lines just yet. And the readline()
function raises SyntaxError when the encoding isn't right. I wish
there were a lower-level helper that just took a line and told you
what the encoding in it was, if any. Then the rest of the logic can be
handled by the caller (including the logic of trying up to two lines).
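
For what it's worth, a byte string can be fed to it by wrapping the bytes in 
io.BytesIO, e.g.:

    import io
    import tokenize

    source = b'#!/usr/bin/env python\n# -*- coding: latin-1 -*-\nx = 1\n'
    encoding, lines = tokenize.detect_encoding(io.BytesIO(source).readline)
    print(encoding)   # 'iso-8859-1'
    print(lines)      # the raw line(s) that were consumed while looking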

-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Interested in the GSoC idea 'Roundup - GitHub integration'

2016-03-19 Thread Wasim Thabraze
Hello everyone,

I am Wasim Thabraze, a Computer Science Undergraduate. I have thoroughly
gone through the Core-Python GSoC ideas page and have narrowed down my
choices to the project 'Improving Roundup GitHub integration'.

I have experience building things that are connected to GitHub. Openflock
(http://www.openflock.co) is one such product that I developed.

Can someone please help me learn more about the project? I wanted to
know how and where GitHub should be integrated with
https://bugs.python.org


I hope I can code with Core Python this summer.


Regards,
Wasim
www.thabraze.me
github.com/waseem18
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] bitfields - short - and xlc compiler

2016-03-19 Thread MRAB

On 2016-03-18 00:56, Michael Felt wrote:

Update:
Is this going to be impossible?

From what I've been able to find out, the C89 standard limits bitfields 
to int, signed int and unsigned int, and the C99 standard added _Bool, 
although some compilers allow other integer types too. It looks like 
your compiler doesn't allow those additional types.
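
On the ctypes side, the portable subset would look something like this (a 
sketch assuming only the C89-guaranteed int-width bitfields, not the actual 
test code):

    from ctypes import Structure, c_int, c_uint

    # Only int-width bitfields, which every C89 compiler must accept.
    class BITS(Structure):
        _fields_ = [("A", c_int, 1), ("B", c_int, 2), ("C", c_int, 3),
                    ("M", c_uint, 1), ("N", c_uint, 2), ("O", c_uint, 3)]

    b = BITS()
    b.A = 1
    b.M = 1
    print(b.A, b.M)   # -1 1: the signed 1-bit field sign-extends,
                      # the unsigned one does not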



test_short fails on AIX when using xlC in any case. How terrible is this?

==
FAIL: test_shorts (ctypes.test.test_bitfields.C_Test)
--
Traceback (most recent call last):
File
"/data/prj/aixtools/python/python-2.7.11.2/Lib/ctypes/test/test_bitfields.py",
line 48, in test_shorts
  self.assertEqual((name, i, getattr(b, name)), (name, i,
func(byref(b), name)))
AssertionError: Tuples differ: ('M', 1, -1) != ('M', 1, 1)

First differing element 2:
-1
1

- ('M', 1, -1)
?  -

+ ('M', 1, 1)

--
Ran 440 tests in 1.538s

FAILED (failures=1, skipped=91)
Traceback (most recent call last):
File "./Lib/test/test_ctypes.py", line 15, in 
  test_main()
File "./Lib/test/test_ctypes.py", line 12, in test_main
  run_unittest(unittest.TestSuite(suites))
File
"/data/prj/aixtools/python/python-2.7.11.2/Lib/test/test_support.py",
line 1428, in run_unittest
  _run_suite(suite)
File
"/data/prj/aixtools/python/python-2.7.11.2/Lib/test/test_support.py",
line 1411, in _run_suite
  raise TestFailed(err)
test.test_support.TestFailed: Traceback (most recent call last):
File
"/data/prj/aixtools/python/python-2.7.11.2/Lib/ctypes/test/test_bitfields.py",
line 48, in test_shorts
  self.assertEqual((name, i, getattr(b, name)), (name, i,
func(byref(b), name)))
AssertionError: Tuples differ: ('M', 1, -1) != ('M', 1, 1)

First differing element 2:
-1
1

- ('M', 1, -1)
?  -

+ ('M', 1, 1)




On 17-Mar-16 23:31, Michael Felt wrote:

a) hope this is not something you expect to be on -list, if so - my
apologies!

Getting this message (here using c99 as compiler name, but same issue
with xlc as compiler name)
c99 -qarch=pwr4 -qbitfields=signed -DNDEBUG -O -I. -IInclude
-I./Include -I/data/prj/aixtools/python/python-2.7.11.2/Include
-I/data/prj/aixtools/python/python-2.7.11.2 -c
/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c
-o
build/temp.aix-5.3-2.7/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.o
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
line 387.5: 1506-009 (S) Bit field M must be of type signed int,
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
line 387.5: 1506-009 (S) Bit field N must be of type signed int,
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
line 387.5: 1506-009 (S) Bit field O must be of type signed int,
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
line 387.5: 1506-009 (S) Bit field P must be of type signed int,
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
line 387.5: 1506-009 (S) Bit field Q must be of type signed int,
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
line 387.5: 1506-009 (S) Bit field R must be of type signed int,
unsigned int or int.
"/data/prj/aixtools/python/python-2.7.11.2/Modules/_ctypes/_ctypes_test.c",
line 387.5: 1506-009 (S) Bit field S must be of type signed int,
unsigned int or int.

for:

struct BITS {
int A: 1, B:2, C:3, D:4, E: 5, F: 6, G: 7, H: 8, I: 9;
short M: 1, N: 2, O: 3, P: 4, Q: 5, R: 6, S: 7;
};

In short, xlC v11 does not like short (xlC v7 might have accepted it,
but "32-bit machines were common then"). I am guessing that 16-bit is
not well liked on 64-bit hardware now.

reference for xlC v7, where short was (apparently) still accepted:
http://www.serc.iisc.ernet.in/facilities/ComputingFacilities/systems/cluster/vac-7.0/html/language/ref/clrc03defbitf.htm


I am taking this from the xlC v7 documentation at the URL above, not
because I know it personally.

So - my question: if "short" is unacceptable for POWER, or maybe only
for xlC (not tried with gcc) - how terrible is this, and is it possible to
adjust the test so that the test is accurate?

I am going to modify the test code so it is
struct BITS {
   signed  int A: 1, B:2, C:3, D:4, E: 5, F: 6, G: 7, H: 8, I: 9;
   unsigned int M: 1, N: 2, O: 3, P: 4, Q: 5, R: 6, S: 7;
};

And see what happens - BUT - what impact does this have on Python,
assuming that "short" bitfields are not supported?

p.s. not submitting this as a bug (now), as it may just be that "you"
consider it a bug in xlC to not support (signed) short bit fields.

p.p.s. Note: xlc, by default, considers bitfields to be unsigned. I
was trying to 

Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Brett Cannon
On Thu, 17 Mar 2016 at 07:56 Guido van Rossum  wrote:

> On Thu, Mar 17, 2016 at 5:04 AM, Serhiy Storchaka 
> wrote:
> >> Should we recommend that everyone use tokenize.detect_encoding()?
> >
> > Likely. However the interface of tokenize.detect_encoding() is not very
> > simple.
>
> I just found that out yesterday. You have to give it a readline()
> function, which is cumbersome if all you have is a (byte) string and
> you don't want to split it on lines just yet. And the readline()
> function raises SyntaxError when the encoding isn't right. I wish
> there were a lower-level helper that just took a line and told you
> what the encoding in it was, if any. Then the rest of the logic can be
> handled by the caller (including the logic of trying up to two lines).
>

Since this is for mypy my guess is you only want to know the encoding, but
if you're simply trying to decode bytes of source code then
importlib.util.decode_source() will handle that for you.
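
For reference, a minimal example of that route (illustrative values only):

    import importlib.util

    source_bytes = b'# -*- coding: latin-1 -*-\nname = "caf\xe9"\n'
    text = importlib.util.decode_source(source_bytes)
    print(text.splitlines()[1])   # name = "café" -- decoded using the declared encoding
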
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Serhiy Storchaka

On 17.03.16 21:11, Guido van Rossum wrote:

This will raise SyntaxError if the encoding is unknown. That needs to
be caught in mypy's case and then it needs to get the line number from
the exception.


Good point. "lineno" and "offset" attributes of SyntaxError is set to 
None by tokenize.detect_encoding() and to 0 by CPython interpreter. They 
should be set to useful values.



___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-03-19 Thread Brett Cannon
Where did this PEP leave off? Anything blocking its acceptance?

On Sat, 13 Feb 2016 at 00:49 Georg Brandl  wrote:

> Hi all,
>
> after talking to Guido and Serhiy we present the next revision
> of this PEP.  It is a compromise that we are all happy with,
> and a relatively restricted rule that makes additions to PEP 8
> basically unnecessary.
>
> I think the discussion has shown that supporting underscores in
> the from-string constructors is valuable, therefore this is now
> added to the specification section.
>
> The remaining open question is about the reverse direction: do
> we want a string formatting modifier that adds underscores as
> thousands separators?
>
> cheers,
> Georg
>
> -
>
> PEP: 515
> Title: Underscores in Numeric Literals
> Version: $Revision$
> Last-Modified: $Date$
> Author: Georg Brandl, Serhiy Storchaka
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 10-Feb-2016
> Python-Version: 3.6
> Post-History: 10-Feb-2016, 11-Feb-2016
>
> Abstract and Rationale
> ==
>
> This PEP proposes to extend Python's syntax and number-from-string
> constructors so that underscores can be used as visual separators for
> digit grouping purposes in integral, floating-point and complex number
> literals.
>
> This is a common feature of other modern languages, and can aid
> readability of long literals, or literals whose value should clearly
> separate into parts, such as bytes or words in hexadecimal notation.
>
> Examples::
>
> # grouping decimal numbers by thousands
> amount = 10_000_000.0
>
> # grouping hexadecimal addresses by words
> addr = 0xDEAD_BEEF
>
> # grouping bits into nibbles in a binary literal
> flags = 0b_0011_1111_0100_1110
>
> # same, for string conversions
> flags = int('0b_1111_0000', 2)
>
>
> Specification
> =
>
> The current proposal is to allow one underscore between digits, and
> after base specifiers in numeric literals.  The underscores have no
> semantic meaning, and literals are parsed as if the underscores were
> absent.
>
> Literal Grammar
> ---
>
> The production list for integer literals would therefore look like
> this::
>
>integer: decinteger | bininteger | octinteger | hexinteger
>decinteger: nonzerodigit (["_"] digit)* | "0" (["_"] "0")*
>bininteger: "0" ("b" | "B") (["_"] bindigit)+
>octinteger: "0" ("o" | "O") (["_"] octdigit)+
>hexinteger: "0" ("x" | "X") (["_"] hexdigit)+
>nonzerodigit: "1"..."9"
>digit: "0"..."9"
>bindigit: "0" | "1"
>octdigit: "0"..."7"
>hexdigit: digit | "a"..."f" | "A"..."F"
>
> For floating-point and complex literals::
>
>floatnumber: pointfloat | exponentfloat
>pointfloat: [digitpart] fraction | digitpart "."
>exponentfloat: (digitpart | pointfloat) exponent
>digitpart: digit (["_"] digit)*
>fraction: "." digitpart
>exponent: ("e" | "E") ["+" | "-"] digitpart
>imagnumber: (floatnumber | digitpart) ("j" | "J")
>
> Constructors
> 
>
> Following the same rules for placement, underscores will be allowed in
> the following constructors:
>
> - ``int()`` (with any base)
> - ``float()``
> - ``complex()``
> - ``Decimal()``
>
>
> Prior Art
> =
>
> Those languages that do allow underscore grouping implement a large
> variety of rules for allowed placement of underscores.  In cases where
> the language spec contradicts the actual behavior, the actual behavior
> is listed.  ("single" or "multiple" refer to allowing runs of
> consecutive underscores.)
>
> * Ada: single, only between digits [8]_
> * C# (open proposal for 7.0): multiple, only between digits [6]_
> * C++14: single, between digits (different separator chosen) [1]_
> * D: multiple, anywhere, including trailing [2]_
> * Java: multiple, only between digits [7]_
> * Julia: single, only between digits (but not in float exponent parts)
>   [9]_
> * Perl 5: multiple, basically anywhere, although docs say it's
>   restricted to one underscore between digits [3]_
> * Ruby: single, only between digits (although docs say "anywhere")
>   [10]_
> * Rust: multiple, anywhere, except for between exponent "e" and digits
>   [4]_
> * Swift: multiple, between digits and trailing (although textual
>   description says only "between digits") [5]_
>
>
> Alternative Syntax
> ==
>
> Underscore Placement Rules
> --
>
> Instead of the relatively strict rule specified above, the use of
> underscores could be limited.  As seen in other languages, common
> rules include:
>
> * Only one consecutive underscore allowed, and only between digits.
> * Multiple consecutive underscores allowed, but only between digits.
> * Multiple consecutive underscores allowed, in most positions except
>   for the start of the literal, or special positions like after a
>   decimal point.
>
> The syntax in this PEP has ultimately 

Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Glenn Linderman

On 3/16/2016 5:29 PM, Guido van Rossum wrote:

I've updated the PEP. Please review. I decided not to update the
Unicode howto (the thing is too obscure). Serhiy, you're probably in a
better position to fix the code looking for cookies to pick the first
one if there are two on the same line (or do whatever you think should
be done there).

Should we recommend that everyone use tokenize.detect_encoding()?

On Wed, Mar 16, 2016 at 5:05 PM, Guido van Rossum  wrote:

On Wed, Mar 16, 2016 at 12:59 AM, M.-A. Lemburg  wrote:

The only reason to read up to two lines was to address the use of
the shebang on Unix, not to be able to define two competing
source code encodings :-)

I know. I was just surprised that the PEP was sufficiently vague about
it that when I found that mypy picked the second if there were two, I
couldn't prove to myself that it was violating the PEP. I'd rather
clarify the PEP than rely on the reasoning presented earlier here.


Oh sure.  Updating the PEP is the best way forward. But the reasoning, 
although from somewhat vague specifications, seems sound enough to 
declare that it meant "find the first cookie in the first two lines".


Which is what you've said in the update, although not quite that 
tersely.  It now leaves no room for ambiguous interpretations.




I don't like erroring out when there are two different cookies on two
lines; I feel that the spirit of the PEP is to read up to two lines
until a cookie is found, whichever comes first.


The only reason for an error would be to alert people that had depended 
on the bugs, or misinterpretations.


Personally, I think if they haven't converted to UTF-8 by now, they've 
got bigger problems than this change.


I will update the regex in the PEP too (or change the wording to avoid "match").

I'm not sure what to do if there are two cookies on one line. If
CPython currently picks the latter we may want to preserve that
behavior.

Should we recommend that everyone use tokenize.detect_encoding()?

--
--Guido van Rossum (python.org/~guido)





___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Guido van Rossum
I've updated the PEP. Please review. I decided not to update the
Unicode howto (the thing is too obscure). Serhiy, you're probably in a
better position to fix the code looking for cookies to pick the first
one if there are two on the same line (or do whatever you think should
be done there).

Should we recommend that everyone use tokenize.detect_encoding()?

On Wed, Mar 16, 2016 at 5:05 PM, Guido van Rossum  wrote:
> On Wed, Mar 16, 2016 at 12:59 AM, M.-A. Lemburg  wrote:
>> The only reason to read up to two lines was to address the use of
>> the shebang on Unix, not to be able to define two competing
>> source code encodings :-)
>
> I know. I was just surprised that the PEP was sufficiently vague about
> it that when I found that mypy picked the second if there were two, I
> couldn't prove to myself that it was violating the PEP. I'd rather
> clarify the PEP than rely on the reasoning presented earlier here.
>
> I don't like erroring out when there are two different cookies on two
> lines; I feel that the spirit of the PEP is to read up to two lines
> until a cookie is found, whichever comes first.
>
> I will update the regex in the PEP too (or change the wording to avoid 
> "match").
>
> I'm not sure what to do if there are two cookies on one line. If
> CPython currently picks the latter we may want to preserve that
> behavior.
>
> Should we recommend that everyone use tokenize.detect_encoding()?
>
> --
> --Guido van Rossum (python.org/~guido)



-- 
--Guido van Rossum (python.org/~guido)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Stephen J. Turnbull
Guido van Rossum writes:

 > > Should we recommend that everyone use tokenize.detect_encoding()?

+1

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Serhiy Storchaka

On 17.03.16 02:29, Guido van Rossum wrote:

I've updated the PEP. Please review. I decided not to update the
Unicode howto (the thing is too obscure). Serhiy, you're probably in a
better position to fix the code looking for cookies to pick the first
one if there are two on the same line (or do whatever you think should
be done there).


http://bugs.python.org/issue26581


Should we recommend that everyone use tokenize.detect_encoding()?


Likely. However the interface of tokenize.detect_encoding() is not very 
simple.



___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] What does a double coding cookie mean?

2016-03-19 Thread Serhiy Storchaka

On 17.03.16 19:23, M.-A. Lemburg wrote:

On 17.03.2016 15:02, Serhiy Storchaka wrote:

On 17.03.16 15:14, M.-A. Lemburg wrote:

On 17.03.2016 01:29, Guido van Rossum wrote:

Should we recommend that everyone use tokenize.detect_encoding()?


I'd prefer a separate utility for this somewhere, since
tokenize.detect_encoding() is not available in Python 2.

I've attached an example implementation with tests, which works
in Python 2.7 and 3.


Sorry, but this code doesn't match the behaviour of Python interpreter,
nor other tools. I suggest to backport tokenize.detect_encoding() (but
be aware that the default encoding in Python 2 is ASCII, not UTF-8).


Yes, I got the default for Python 3 wrong. I'll fix that. Thanks
for the note.

What other aspects are different than what Python implements ?


1. If there is a BOM and a coding cookie, the source encoding is "utf-8-sig".

2. If there is a BOM and the coding cookie is not 'utf-8', this is an error.

3. If the first line is not a blank or comment line, the coding cookie is 
not searched for in the second line.


4. The encoding name should be normalized. "UTF8", "utf8", "utf_8" and 
"utf-8" are the same encoding (and all are changed to "utf-8-sig" with a BOM).


5. There isn't a limit of 400 bytes. Actually there is a bug with 
handling long lines in the current code, but even with this bug the limit is 
larger.


6. I made a mistake in the regular expression: I missed the underscore.

tokenize.detect_encoding() is the closest imitation of the behavior of the 
Python interpreter.
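
For example, points 1 and 2 can be observed directly (a quick sketch):

    import io
    import tokenize

    # Point 1: a UTF-8 BOM plus a utf-8 cookie is reported as 'utf-8-sig'.
    enc, _ = tokenize.detect_encoding(
        io.BytesIO(b'\xef\xbb\xbf# coding: utf-8\npass\n').readline)
    print(enc)   # 'utf-8-sig'

    # Point 2: a BOM combined with a non-UTF-8 cookie is an error.
    try:
        tokenize.detect_encoding(
            io.BytesIO(b'\xef\xbb\xbf# coding: latin-1\npass\n').readline)
    except SyntaxError as exc:
        print('rejected:', exc)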


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-03-19 Thread Georg Brandl
I'll update the text so that the format() gets promoted from optional
to specified.

There was one point of discussion in the tracker issue that should be
resolved before acceptance: the Decimal constructor is listed as
getting updated to allow underscores, but its syntax is specified
in the Decimal spec: http://speleotrove.com/decimal/daconvs.html

Accepting underscores would be an extension to the spec, which may
not be what we want to do as otherwise Decimal follows that spec
closely.

On the other hand, assuming decimal literals are introduced at some
point, they would almost definitely need to support underscores.
Of course, the decision whether to modify the Decimal constructor
can be postponed until that time.

cheers,
Georg

On 03/19/2016 01:02 AM, Guido van Rossum wrote:
> I'm happy to accept this PEP as is stands, assuming the authors are
> ready for this news. I recommend also implementing the option from
> footnote [11] (extend the number-to-string formatting language to
> allow ``_`` as a thousands separator).
> 
> On Thu, Mar 17, 2016 at 11:19 AM, Brett Cannon  wrote:
>> Where did this PEP leave off? Anything blocking its acceptance?
>>
>> On Sat, 13 Feb 2016 at 00:49 Georg Brandl  wrote:
>>>
>>> Hi all,
>>>
>>> after talking to Guido and Serhiy we present the next revision
>>> of this PEP.  It is a compromise that we are all happy with,
>>> and a relatively restricted rule that makes additions to PEP 8
>>> basically unnecessary.
>>>
>>> I think the discussion has shown that supporting underscores in
>>> the from-string constructors is valuable, therefore this is now
>>> added to the specification section.
>>>
>>> The remaining open question is about the reverse direction: do
>>> we want a string formatting modifier that adds underscores as
>>> thousands separators?
>>>
>>> cheers,
>>> Georg
>>>
>>> -
>>>
>>> PEP: 515
>>> Title: Underscores in Numeric Literals
>>> Version: $Revision$
>>> Last-Modified: $Date$
>>> Author: Georg Brandl, Serhiy Storchaka
>>> Status: Draft
>>> Type: Standards Track
>>> Content-Type: text/x-rst
>>> Created: 10-Feb-2016
>>> Python-Version: 3.6
>>> Post-History: 10-Feb-2016, 11-Feb-2016
>>>
>>> Abstract and Rationale
>>> ==
>>>
>>> This PEP proposes to extend Python's syntax and number-from-string
>>> constructors so that underscores can be used as visual separators for
>>> digit grouping purposes in integral, floating-point and complex number
>>> literals.
>>>
>>> This is a common feature of other modern languages, and can aid
>>> readability of long literals, or literals whose value should clearly
>>> separate into parts, such as bytes or words in hexadecimal notation.
>>>
>>> Examples::
>>>
>>> # grouping decimal numbers by thousands
>>> amount = 10_000_000.0
>>>
>>> # grouping hexadecimal addresses by words
>>> addr = 0xDEAD_BEEF
>>>
>>> # grouping bits into nibbles in a binary literal
>>> flags = 0b_0011_1111_0100_1110
>>>
>>> # same, for string conversions
>>> flags = int('0b_1111_0000', 2)
>>>
>>>
>>> Specification
>>> =
>>>
>>> The current proposal is to allow one underscore between digits, and
>>> after base specifiers in numeric literals.  The underscores have no
>>> semantic meaning, and literals are parsed as if the underscores were
>>> absent.
>>>
>>> Literal Grammar
>>> ---
>>>
>>> The production list for integer literals would therefore look like
>>> this::
>>>
>>>integer: decinteger | bininteger | octinteger | hexinteger
>>>decinteger: nonzerodigit (["_"] digit)* | "0" (["_"] "0")*
>>>bininteger: "0" ("b" | "B") (["_"] bindigit)+
>>>octinteger: "0" ("o" | "O") (["_"] octdigit)+
>>>hexinteger: "0" ("x" | "X") (["_"] hexdigit)+
>>>nonzerodigit: "1"..."9"
>>>digit: "0"..."9"
>>>bindigit: "0" | "1"
>>>octdigit: "0"..."7"
>>>hexdigit: digit | "a"..."f" | "A"..."F"
>>>
>>> For floating-point and complex literals::
>>>
>>>floatnumber: pointfloat | exponentfloat
>>>pointfloat: [digitpart] fraction | digitpart "."
>>>exponentfloat: (digitpart | pointfloat) exponent
>>>digitpart: digit (["_"] digit)*
>>>fraction: "." digitpart
>>>exponent: ("e" | "E") ["+" | "-"] digitpart
>>>imagnumber: (floatnumber | digitpart) ("j" | "J")
>>>
>>> Constructors
>>> 
>>>
>>> Following the same rules for placement, underscores will be allowed in
>>> the following constructors:
>>>
>>> - ``int()`` (with any base)
>>> - ``float()``
>>> - ``complex()``
>>> - ``Decimal()``
>>>
>>>
>>> Prior Art
>>> =
>>>
>>> Those languages that do allow underscore grouping implement a large
>>> variety of rules for allowed placement of underscores.  In cases where
>>> the language spec contradicts the actual behavior, the actual behavior
>>> is listed.  ("single" or "multiple" refer to allowing runs of
>>> consecutive underscores.)
>>>