Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, 27 Jan 2014 14:02:29 +1000
Nick Coghlan wrote:
>
> If we do go this path, then we should backport the full fix (i.e.
> accepting None to indicate repeating forever), rather than just a
> partial fix.
>
> That is, I'm OK with either not backporting anything at all, or
> backporting the full change. The only idea I object to is the one of
> removing the infinite iteration capability without providing a
> replacement spelling for it.
I would say not backport at all. The security threat is highly
theoretical. If someone blindly accepts user values for repeat(), the
user value can just as well be a very large positive value with similar
effects (e.g. 2**31).
Regards
Antoine.
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Sun, 26 Jan 2014 21:01:08 -0800
Larry Hastings wrote:
> On 01/26/2014 08:40 PM, Alexander Belopolsky wrote:
> >
> > On Sun, Jan 26, 2014 at 11:26 PM, Vajrasky Kok wrote:
> >
> >     In case we are taking "not backporting anything at all" road, what is
> >     the best fix for the document?
> >
> > I would say no fix is needed for this doc because the signature
> > suggests (correctly) that passing times by keyword is not supported.
>
> Where does it do that?
In the "[,times]" spelling, which is the spelling customarily used for
positional-only arguments.
Regards
Antoine.
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 27/01/2014 01:52, Nick Coghlan wrote:
> In 3.5, that will be passing None, rather than -1. For those proposing
> to change the maintenance releases, how should a user relying on this
> misbehaviour update their code to handle it?
I'm -1 on using None. The code currently rejects anything except an
int. The docs don't say anything about using None, except in the
"equivalent to" section, which is also the only place where it looks
as if times can be a keyword argument.
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence
[Python-Dev] News from asyncio
Hi,
I'm working for eNovance on the asyncio module. The goal is to use it in
the huge OpenStack project (2.5 million lines of code), which currently
uses eventlet. I'm trying to fix the remaining issues in the asyncio
module before Python 3.4 final.
The asyncio project is very active, but discussions are split between
its own dedicated mailing list (the python-tulip Google group), the Tulip
bug tracker and the Python bug tracker. Please join the Tulip mailing
list if you are interested in contributing.
http://code.google.com/p/tulip/
I would like to share with you the status of the module. Many bugs have
been fixed recently. I suppose that new bugs are being found because new
developers have started to play with asyncio since Python 3.4 beta 1.
asyncio issues fixed in Python 3.4 beta 3, in random order:
- I wrote most of the asyncio documentation, please help me to improve
  it! I tried to add many short examples, each one explaining one
  feature or concept (e.g. callbacks, signals, futures):
  http://docs.python.org/dev/library/asyncio.html
- Character devices (TTY/PTY) are now supported, which is useful to
  control a real terminal (not pipes) for subprocesses. On Mac OS X
  older than Mavericks (10.9), the SelectSelector should be used instead
  of KqueueSelector (kqueue didn't support character devices).
- Tulip #111: StreamReader.readexactly() now raises an
  IncompleteReadError if the end of stream is reached before we received
  enough bytes, instead of returning fewer bytes than requested.
- Python #20311: asyncio had a performance issue related to the
  resolution of selectors and clocks. For example,
  selectors.EpollSelector has a resolution of 1 millisecond (10^-3
  second), whereas asyncio uses arbitrary timestamps. The issue was
  fixed by adding a resolution attribute to selectors and a private
  granularity attribute to asyncio.BaseEventLoop, and by using the
  granularity in the asyncio event loop to round times.
- New Task.current_task() class method
- Guido wrote a web crawler, see examples/crawl.py in Tulip
- More symbols are exported in the main asyncio module (e.g. Queue,
  iscoroutine())
- Charles-François improved the signal handlers: the SA_RESTART flag is
  now set to limit EINTR errors in syscalls
- Some optimizations (e.g. don't call logger.log() when it's not needed)
- Many bugfixes
- (sorry if I forgot other changes, see also the Tulip history and the
  history of the asyncio module in Python)
I would also like to change asyncio to support a "stream-like" API for
subprocesses, see Tulip issue #115 (and Python issue #20400):
http://code.google.com/p/tulip/issues/detail?id=115
I ported asyncio to Python 2.6 and 2.7, because today OpenStack only
uses these Python versions. I created a new project called "Trollius"
(instead of "Tulip") because the syntax is a little bit different.
"yield from" becomes "yield", and "return x" becomes "raise Return(x)":
https://bitbucket.org/enovance/trollius
https://pypi.python.org/pypi/trollius
If you are interested in the OpenStack part, see my blueprint (something
similar to PEPs but for smaller changes) for Oslo Messaging:
https://wiki.openstack.org/wiki/Oslo/blueprints/asyncio
There is an ongoing effort to port OpenStack to Python 3; eNovance is
also working on the port:
https://wiki.openstack.org/wiki/Python3
Victor
Re: [Python-Dev] News from asyncio
On Mon, 27 Jan 2014 10:45:37 +0100
Victor Stinner wrote:
>
> - Tulip #111: StreamReader.readexactly() now raises an
> IncompleteReadError if the end of stream is reached before we received
> enough bytes, instead of returning less bytes than requested.
Why not simply EOFError?
Regards
Antoine.
Re: [Python-Dev] News from asyncio
2014-01-27 Antoine Pitrou:
> On Mon, 27 Jan 2014 10:45:37 +0100
> Victor Stinner wrote:
>>
>> - Tulip #111: StreamReader.readexactly() now raises an
>> IncompleteReadError if the end of stream is reached before we received
>> enough bytes, instead of returning less bytes than requested.
>
> Why not simply EOFError?
IncompleteReadError has two additional attributes:
- partial: "incomplete" received bytes
- expected: total number of expected bytes (n parameter of readexactly)
I prefer to use a different exception to ensure that these attributes
are present. I don't like having to check "hasattr(exc, ...)".
Before this change, readexactly(n) returned the partial received bytes
if the end of the stream was reached.
Victor
Re: [Python-Dev] News from asyncio
On 27 January 2014 10:55, Victor Stinner wrote:
> 2014-01-27 Antoine Pitrou:
> > On Mon, 27 Jan 2014 10:45:37 +0100
> > Victor Stinner wrote:
> >>
> >> - Tulip #111: StreamReader.readexactly() now raises an
> >> IncompleteReadError if the end of stream is reached before we
> >> received enough bytes, instead of returning less bytes than requested.
> >
> > Why not simply EOFError?
>
> IncompleteReadError has two additional attributes:
>
> - partial: "incomplete" received bytes
> - expected: total number of expected bytes (n parameter of readexactly)
>
> I prefer to use a different exception to ensure that these attributes
> are present. I don't like having to check "hasattr(exc, ...)".
>
> Before this change, readexactly(n) returned the partial received bytes
> if the end of the stream was reached.
>
I had the same doubt. Note also that IncompleteReadError is a subclass of
EOFError, so you can catch EOFError if you like.
--
Gustavo J. A. M. Carneiro
Gambit Research LLC
"The universe is always one step beyond logic." -- Frank Herbert
Re: [Python-Dev] News from asyncio
2014-01-27 Gustavo Carneiro:
>> > Why not simply EOFError?
>>
>> IncompleteReadError has two additional attributes:
>>
>> - partial: "incomplete" received bytes
>> - expected: total number of expected bytes (n parameter of readexactly)
>>
>> I prefer to use a different exception to ensure that these attributes
>> are present. I don't like having to check "hasattr(exc, ...)".
>>
>> Before this change, readexactly(n) returned the partial received bytes
>> if the end of the stream was reached.
>
> I had the same doubt. Note also that IncompleteReadError is a subclass of
> EOFError, so you can catch EOFError if you like.
Oops, I forgot to mention that :-)
I just documented the new IncompleteReadError in the asyncio doc.
Victor
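To make the behaviour discussed in this sub-thread concrete, here is a minimal sketch. It assumes a modern Python in which `async def` and `asyncio.run()` exist (neither was available when this thread was written), and `read_fixed` is a hypothetical helper name used only for illustration. It feeds a short payload followed by end-of-stream into a `StreamReader` and shows that `readexactly()` raises `IncompleteReadError` carrying the `partial` and `expected` attributes.

```python
import asyncio


async def read_fixed(payload: bytes, n: int):
    """Feed `payload` then EOF into a StreamReader and try to read n bytes."""
    reader = asyncio.StreamReader()
    reader.feed_data(payload)
    reader.feed_eof()
    try:
        return await reader.readexactly(n)
    except asyncio.IncompleteReadError as exc:
        # No hasattr() check needed: both attributes are always set.
        return exc.partial, exc.expected


if __name__ == "__main__":
    # Only 3 of the 10 requested bytes arrive before end-of-stream.
    print(asyncio.run(read_fixed(b"abc", 10)))
    # Enough data is available, so the requested bytes are returned.
    print(asyncio.run(read_fixed(b"abcdef", 3)))
```

Because `IncompleteReadError` is a subclass of `EOFError`, the `except` clause could equally be written as `except EOFError`, at the cost of no longer guaranteeing that the two attributes are present.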
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 01/27/2014 01:39 AM, Antoine Pitrou wrote:
> On Sun, 26 Jan 2014 21:01:08 -0800
> Larry Hastings wrote:
>> On 01/26/2014 08:40 PM, Alexander Belopolsky wrote:
>>> On Sun, Jan 26, 2014 at 11:26 PM, Vajrasky Kok wrote:
>>>
>>>     In case we are taking "not backporting anything at all" road, what is
>>>     the best fix for the document?
>>>
>>> I would say no fix is needed for this doc because the signature
>>> suggests (correctly) that passing times by keyword is not supported.
>> Where does it do that?
> In the "[,times]" spelling, which is the spelling customarily used for
> positional-only arguments.
That's not my experience. It's very common--in fact I believe more
common--for functions that only accept positional arguments to *not* use
the square-brackets-for-optional-parameters convention. The
square-brackets-for-optional-parameters convention is not legal Python
syntax, so I observe that documentation authors avoid it when they can,
preferring to express their function's signature in real Python.
As an example, consider "heapq.nlargest(n, iterable, key=None)". The
implementation uses PyArg_ParseTuple to parse its parameters, and
therefore does not accept keyword arguments. But--no square brackets.
My experience is that the doc convention of
square-brackets-for-optional-parameters is primarily used in two
circumstances: one, when doing something really crazy like optional
groups, and two, when the default value of one of the function's
parameters is inconvenient to specify as a Python value. Of these two
the second is far more common.
An example of this latter case is zlib.compressobj(). The documentation
shows its last parameter as "[, zdict]". However, the implementation
uses PyArg_ParseTupleAndKeywords(), and therefore accepts keyword
arguments.
Furthermore, this notation simply cannot be used for functions that have
only required parameters. You can't look at the constructor for
"memoryview(object)" and determine whether or not it accepts keyword
arguments. (It does.)
There seems to be no strong correlation between functions that only
accept positional-only parameters and functions whose documentation uses
square-brackets-for-optional-parameters. Indeed, this is one of the
things that can be frustrating about Python, which is why I hope we can
make Python 3.5 more predictable in this area.
//arry/
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, 27 Jan 2014 04:01:02 -0800
Larry Hastings wrote:
>
> On 01/27/2014 01:39 AM, Antoine Pitrou wrote:
> > On Sun, 26 Jan 2014 21:01:08 -0800
> > Larry Hastings wrote:
> >> On 01/26/2014 08:40 PM, Alexander Belopolsky wrote:
> >>> On Sun, Jan 26, 2014 at 11:26 PM, Vajrasky Kok wrote:
> >>>
> >>>     In case we are taking "not backporting anything at all" road, what is
> >>>     the best fix for the document?
> >>>
> >>> I would say no fix is needed for this doc because the signature
> >>> suggests (correctly) that passing times by keyword is not supported.
> >> Where does it do that?
> > In the "[,times]" spelling, which is the spelling customarily used for
> > positional-only arguments.
>
> That's not my experience.
But it's mine :-)
(try "help(str)" or "help(list)")
That said, it's fair to say that whatever convention there is isn't very
strictly followed on this particular point.
Regards
Antoine.
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, Jan 27, 2014 at 5:38 PM, Antoine Pitrou wrote:
>
> I would say not backport at all. The security threat is highly
> theoretical. If someone blindly accepts user values for repeat(), the
> user value can just as well be a very large positive with similar
> effects (e.g. 2**31).
>
I cannot comment on whether this is a security issue or not. But the
effect of a large positive number is not similar to the effect of
unlimited repetitions.
>>> from itertools import repeat
>>> list(repeat('a', 2**31))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError
>>> list(repeat('a', 2**99))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
>>> list(repeat('a', times=-1))
...this freezes my computer...
That is why I prefer we backport the fix (either partial or full). If
not, giving a big warning in the documentation should suffice.
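For code that really does pass user-supplied counts to repeat(), a defensive wrapper is one way to sidestep both failure modes shown above. The following is only a sketch: `bounded_repeat` and `MAX_TIMES` are hypothetical names, and the limit of 10,000 is an arbitrary assumption rather than anything proposed in this thread.

```python
from itertools import islice, repeat

MAX_TIMES = 10_000  # hypothetical application-specific limit


def bounded_repeat(obj, times, limit=MAX_TIMES):
    """Return repeat(obj, times) only if times is within [0, limit]."""
    if not 0 <= times <= limit:
        raise ValueError("times must be between 0 and %d, got %r"
                         % (limit, times))
    return repeat(obj, times)


if __name__ == "__main__":
    print(list(bounded_repeat('a', 3)))
    # An intentionally infinite repeat() is safe when the consumer is bounded.
    print(list(islice(repeat('a'), 3)))
```

Rejecting negative and very large values up front means the caller never depends on how a particular Python release interprets `times=-1`.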
Re: [Python-Dev] News from asyncio
27.01.14 12:55, Victor Stinner wrote:
> IncompleteReadError has two additional attributes:
>
> - partial: "incomplete" received bytes
> - expected: total number of expected bytes (n parameter of readexactly)
This looks similar to http.client.IncompleteRead.
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 01/27/2014 01:47 AM, Mark Lawrence wrote:
> On 27/01/2014 01:52, Nick Coghlan wrote:
>> In 3.5, that will be passing None, rather than -1. For those proposing
>> to change the maintenance releases, how should a user relying on this
>> misbehaviour update their code to handle it?
> I'm -1 on using None. The code currently rejects anything except an
> int. The docs don't say anything about using None, except in the
> "equivalent to" section, which is also the only place where it looks
> as if times can be a keyword argument.
The docs describe the signature of itertools.repeat twice: the first
time as its heading ("itertools.repeat(object[, times])"), the second
time as an example implementation asserted to be equivalent to Python's
implementation. These two signatures are not identical, but they are
compatible. You state that we should pay attention to the first and
ignore the second. How did you arrive at that conclusion?
Also, you say something strange like "which is also the only place where
it looks as if times can be a keyword argument.". I don't see the point
of debating whether or not "times" *looks* like it can be a keyword
argument. itertools.repeat() has accepted keyword arguments since 2.7.
The code currently has semantics that cannot be accurately represented
in a Python signature. We could do one of three things:
1) Do nothing, and don't allow inspect.Signature to produce a signature
for the function. This is the status quo.
2) Change the semantics of the function in a non-backwards-compatible
way so that we can represent its signature accurately with an
inspect.Signature object. For example, "change the function so that
providing times=-1 as a keyword argument behaves the same as providing
times=-1 as a positional-only argument" is such an incompatible change.
Another is "change the function to not accept keyword arguments at all".
3) Change the semantics of the function in a backwards-compatible way so
that we can represent its supported signature accurately with an
inspect.Signature object. Allow continued use of the old semantics for
a full deprecation cycle (two major versions) if not longer. For
example, "change the times argument to have a default of None, and
change the logic so that times=None means it repeats forever" would be
such an approach.
For 3.3 and 3.4, I suggest that only 1) makes sense. For 3.5 I prefer
3), specifically the "times=None" approach, as that's how the function
has been documented as working since the itertools module was first
introduced in 2.3. And I see functions having accurate signatures as
a good thing.
I'm against 2), as I'm against removing functionality simply for
purity's sake. Removing functionality breaks code. So it's best
reserved for critical problems like security issues. I cite the thread
we just had in python-dev, subject line "Deprecation policy", as an
excellent discussion and summary of this topic.
//arry/
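The "times=None" semantics of option 3) can be sketched in pure Python along the lines of the "equivalent to" code in the itertools documentation. This is an illustrative sketch only, not the C implementation; the name `repeat_sketch` is hypothetical.

```python
import inspect
from itertools import islice


def repeat_sketch(obj, times=None):
    """Yield obj forever when times is None, otherwise `times` times."""
    if times is None:
        while True:
            yield obj
    else:
        # range() of a negative number is empty, so times=-1 yields nothing,
        # whether it is passed by position or by keyword.
        for _ in range(times):
            yield obj


if __name__ == "__main__":
    print(list(islice(repeat_sketch('a'), 4)))
    print(list(repeat_sketch('a', times=-1)))
    # A plain Python signature now describes the function accurately.
    print(inspect.signature(repeat_sketch))
```

With this shape, positional and keyword calls cannot diverge, and inspect.signature() can report the signature without any special casing.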
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, 27 Jan 2014 20:22:53 +0800
Vajrasky Kok wrote:
>
> >>> from itertools import repeat
> >>> list(repeat('a', 2**31))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> MemoryError
Sure, just adjust the number to fit the available memory (here, 2**29
does the trick).
Regards
Antoine.
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 27 January 2014 22:29, Antoine Pitrou wrote:
> On Mon, 27 Jan 2014 20:22:53 +0800
> Vajrasky Kok wrote:
>>
>> >>> from itertools import repeat
>> >>> list(repeat('a', 2**31))
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> MemoryError
>
> Sure, just adjust the number to fit the available memory (here, 2**29
> does the trick).
And for anyone interested in why a sufficiently large positive value
that won't fit in available RAM fails gracefully with MemoryError:
>>> repeat('a', 2**31).__length_hint__()
2147483648
>>> repeat('a', -1).__length_hint__()
0
list() uses __length_hint__() for preallocation, so a sufficiently
large length hint means the preallocation attempt fails with
MemoryError. As Antoine showed though, you still can't feed it
untrusted data, because a large enough value that just fits into RAM
can still cause you a lot of grief.
Everything points to "times=-1" behaving as it does being a bug, but
not a sufficiently critical one to risk breaking working code in a
maintenance release. That makes deprecating the current behaviour of
"times=-1" and accepting "times=None" in 3.5 the least controversial
course of action.
Cheers,
Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
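The length-hint mechanism described above can be inspected without allocating anything, using operator.length_hint() (available since Python 3.4). This is a small sketch; the count of 1000 is an arbitrary stand-in for the 2**31 used in the session above.

```python
from itertools import repeat
from operator import length_hint

# A bounded repeat() advertises its remaining length, which is what lets
# list() try to preallocate (and fail fast with MemoryError if too large).
print(length_hint(repeat('a', 1000)))

# A negative count passed by position is treated as zero repetitions.
print(length_hint(repeat('a', -1)))
print(list(repeat('a', -1)))
```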
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, Jan 27, 2014 at 04:29:04AM -0800, Larry Hastings wrote:
> The code currently has semantics that cannot be accurately represented
> in a Python signature. We could do one of three things:
>
> 1) Do nothing, and don't allow inspect.Signature to produce a signature
> for the function. This is the status quo.
>
> 2) Change the semantics of the function in a non-backwards-compatible
> way so that we can represent its signature accurately with an
> inspect.Signature object. For example, "change the function so that
> providing times=-1 as a keyword argument behaves the same as providing
> times=-1 as a positional-only argument" is such an incompatible change.
> Another is "change the function to not accept keyword arguments at all".
>
> 3) Change the semantics of the function in a backwards-compatible way so
> that we can represent its supported signature accurately with an
> inspect.Signature object. Allow continued use of the old semantics for
> a full deprecation cycle (two major versions) if not longer. For
> example, "change the times argument to have a default of None, and
> change the logic so that times=None means it repeats forever" would be
> such an approach.
>
> For 3.3 and 3.4, I suggest that only 1) makes sense.
Are you rejecting the idea that the current behaviour is out-and-out
buggy, and therefore fixing these things can and should occur in a
bug-fix release?
As far as I can see, the only piece of evidence that the given behaviour
isn't a bug is that the signature says "object [, times]" rather than
"object, times=None". That's not conclusive: I've often written
signatures using [ ] to indicate optional arguments without specifying
the default value in the signature.
As it stands now, the documentation is internally contradictory. In one
part of the documentation, it gives a clear indication that "times is
None" should select the repeat forever behaviour. In another part of the
documentation, it fails to mention that None is an acceptable value to
select the repeat forever behaviour.
> For 3.5 I prefer
> 3), specifically the "times=None" approach, as that's how the function
> has been documented as working since the itertools module was first
> introduced in 2.3. And I see functions having accurate signatures as
> a good thing.
>
> I'm against 2), as I'm against removing functionality simply for
> purity's sake. Removing functionality breaks code. So it's best
> reserved for critical problems like security issues. I cite the thread
> we just had in python-dev, subject line "Deprecation policy", as an
> excellent discussion and summary of this topic.
I'm confused... you seem to be saying that you are *against* changing
the behaviour of repeat so that:
    repeat(x, -1)
and
    repeat(x, times=-1)
behave the same. Is that actually what you mean, or have I
misunderstood?
Are there any other functions in the standard library where the
behaviour differs depending on whether an argument is given positionally
or by keyword?
--
Steven
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 01/27/2014 04:56 AM, Steven D'Aprano wrote (rearranged slightly so I
could make my points in order):
> I'm confused... you seem to be saying that you are *against* changing
> the behaviour of repeat so that:
>
>     repeat(x, -1)
>
> and
>
>     repeat(x, times=-1)
>
> behave the same. Is that actually what you mean, or have I
> misunderstood?
I apologize for not making myself clear. But that's part of what I
meant, yes: we should preserve the existing behavior of times=-1 when
passed in by position or by keyword. However, we should *also* add a
deprecation warning when passing times=-1 by keyword, suggesting that
they use times=None instead. The idea is that we could eventually remove
the PyTuple_Size check and make times=-1 always behave like times=0. In
practice it'd be okay with me if we never did, or at least not until
Python 4.
> Are you rejecting the idea that the current behaviour is out-and-out
> buggy, and therefore fixing these things can and should occur in a
> bug-fix release?
While it's a bug, it's a very minor bug. As Python 3.4 release manager,
my position is: Python 3.4 is in beta, so let's not change semantics for
purity's sake now. I'm -0.5 on adding times=None right now, and until
we do we can't deprecate the old behavior.
> Are there any other functions in the standard library where the
> behaviour differs depending on whether an argument is given
> positionally or by keyword?
Not that I know of. This instance seems to be purely unintentional; see
my latest message on the relevant issue, where I went back and figured
out why itertools.repeat behaves like this in the first place:
http://bugs.python.org/issue19145#msg209440
Cheers,
//arry/
Re: [Python-Dev] News from asyncio
2014-01-27 Serhiy Storchaka:
> 27.01.14 12:55, Victor Stinner wrote:
>
>> IncompleteReadError has two additional attributes:
>>
>> - partial: "incomplete" received bytes
>> - expected: total number of expected bytes (n parameter of readexactly)
>
> This looks similar to http.client.IncompleteRead.
Please read the original issue for more information:
http://code.google.com/p/tulip/issues/detail?id=111
I mentioned the http.client.IncompleteRead exception there. The HTTP
exception is similar but also different:
- asyncio.IncompleteReadError inherits from EOFError, not from
  HTTPException (which inherits from Exception)
- asyncio.IncompleteReadError.expected is the total expected size, not
  the remaining size
Victor
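The two differences listed above can be checked directly by constructing both exceptions by hand. This is a minimal sketch with made-up byte counts (3 bytes received out of 10 requested), not output from a real connection.

```python
import asyncio
import http.client

# asyncio: `expected` is the total number of bytes that was requested.
aio_exc = asyncio.IncompleteReadError(b"abc", 10)
# http.client: `expected` is the number of bytes still missing.
http_exc = http.client.IncompleteRead(b"abc", 7)

print(isinstance(aio_exc, EOFError))
print(isinstance(http_exc, http.client.HTTPException))
print(isinstance(http_exc, EOFError))

# Converting asyncio's total into a "remaining" count, as http.client reports it.
print(aio_exc.expected - len(aio_exc.partial))
```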
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 27/01/2014 12:56, Steven D'Aprano wrote:
> As it stands now, the documentation is internally contradictory. In one
> part of the documentation, it gives a clear indication that "times is
> None" should select the repeat forever behaviour. In another part of the
> documentation, it fails to mention that None is an acceptable value to
> select the repeat forever behaviour.
None is not currently an acceptable value: ValueError is raised if you
provide anything other than an int in both Python 2.7 and 3.3. That's
why I'm against using it to say "run forever" in Python 3.5.
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence
Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)
Hi,
I'm surprised: marshal.dumps() doesn't raise an error if you pass an
invalid version. In fact, Python 3.3 only supports versions 0, 1 and
2. If you pass 3, it will use version 2. (The same applies to version
99.)
Python 3.4 has two new versions: 3 and 4. Version 3 "shares common
object references"; version 4 adds short tuples and short strings
(producing smaller files).
It would be nice to document the differences between marshal versions.
And what do you think of raising an error if the version is unknown in
marshal.dumps()?
I modified your benchmark to also test loads() and to run the benchmark
10 times. Results:
---
Python 3.3.3+ (3.3:50aa9e3ab9a4, Jan 27 2014, 16:11:26)
[GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux
dumps v0: 391.9 ms
data size v0: 45582.9 kB
loads v0: 616.2 ms
dumps v1: 384.3 ms
data size v1: 45582.9 kB
loads v1: 594.0 ms
dumps v2: 153.1 ms
data size v2: 41395.4 kB
loads v2: 549.6 ms
dumps v3: 152.1 ms
data size v3: 41395.4 kB
loads v3: 535.9 ms
dumps v4: 152.3 ms
data size v4: 41395.4 kB
loads v4: 549.7 ms
---
And:
---
Python 3.4.0b3+ (default:dbad4564cd12, Jan 27 2014, 16:09:40)
[GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux
dumps v0: 389.4 ms
data size v0: 45582.9 kB
loads v0: 564.8 ms
dumps v1: 390.2 ms
data size v1: 45582.9 kB
loads v1: 545.6 ms
dumps v2: 165.5 ms
data size v2: 41395.4 kB
loads v2: 470.9 ms
dumps v3: 425.6 ms
data size v3: 41395.4 kB
loads v3: 528.2 ms
dumps v4: 369.2 ms
data size v4: 37000.9 kB
loads v4: 550.2 ms
---
Version 2 is the fastest in Python 3.3 and 3.4, but version 4 with
Python 3.4 produces the smallest file.
Victor
2014-01-27 Wolfgang:
> Hi,
>
> I tested the latest beta from 3.4 (b3) and noticed there is a new marshal
> protocol version 3.
> The documentation is a little silent about the new features, not going into
> detail.
>
> I've run a performance test with the new protocol version and noticed the
> new version is two times slower in serialization than version 2. I tested it
> with a simple value tuple in a list (50 elements).
> Nothing special. (happens only if the tuple contains also a tuple)
>
> Copy of the test code:
>
>
> from time import time
> from marshal import dumps
>
> def genData(amount=50):
>     for i in range(amount):
>         yield (i, i+2, i*2, (i+1,i+4,i,4), "my string template %s" % i, 1.01*i,
>                True)
>
> data = list(genData())
> print(len(data))
> t0 = time()
> result = dumps(data, 2)
> t1 = time()
> print("duration p2: %f" % (t1-t0))
> t0 = time()
> result = dumps(data, 3)
> t1 = time()
> print("duration p3: %f" % (t1-t0))
>
>
>
> Is the overhead for the recursion detection so high ?
>
> Note this happens only if there is a tuple in the tuple of the datalist.
>
>
> Regards,
>
> Wolfgang
>
>
import time
import marshal
def genData(amount=50):
    for i in range(amount):
        yield (i, i+2, i*2, (i+1,i+4,i,4), "my string template %s" % i, 1.01*i, True)
data = list(genData())
def bench(text, func, *args):
    # Call func(*args) several times and report the best (minimum) timing.
    times = []
    for run in range(5):
        t0 = time.perf_counter()
        result = func(*args)
        dt = time.perf_counter() - t0
        times.append(dt)
    print("%s: %.1f ms" % (text, min(times) * 1e3))
    return result
def bench_marshal(version, obj):
    # Time dumps() and loads() for one marshal version and print the data size.
    data = bench("dumps v%s" % version, marshal.dumps, obj, version)
    print("data size v%s: %.1f kB" % (version, len(data) / 1024))
    bench("loads v%s" % version, marshal.loads, data)
    print("")
for version in range(5):
    bench_marshal(version, data)
Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)
On 27 January 2014 15:35, Victor Stinner wrote:
> Version 2 is the fastest in Python 3.3 and 3.4, but version 4 with
> Python 3.4 produces the smallest file.

Which version is used when creating pyc files? This benchmark might
suggest that version 2 is the best...

Paul
Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)
On Mon, Jan 27, 2014 at 10:42 AM, Paul Moore wrote:
> On 27 January 2014 15:35, Victor Stinner wrote:
> > Version 2 is the fastest in Python 3.3 and 3.4, but version 4 with
> > Python 3.4 produces the smallest file.
>
> Which version is used when creating pyc files? This benchmark might
> suggest that version 2 is the best...

Importlib just uses the default:
http://hg.python.org/cpython/file/dbad4564cd12/Lib/importlib/_bootstrap.py#l671
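For readers who want to see what "the default" means on their own interpreter, here is a minimal sketch. It relies only on the documented marshal.version constant; the sample payload is made up for illustration and is not from the thread.

```python
import marshal

# marshal.version is the format version dumps() uses when no explicit
# version argument is given.
default_version = marshal.version
print("default marshal version:", default_version)

# Hypothetical sample payload (an assumption for this sketch).
payload = [(1, 2, (3, 4), "text", 1.5, True)]

# Serialize with the default version and check that the data round-trips.
blob = marshal.dumps(payload)
restored = marshal.loads(blob)
print("round-trip ok:", restored == payload)
```

The value printed depends on the interpreter; the thread only establishes that importlib does not pick a version explicitly.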
[Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)
Hi,
I tested the latest beta from 3.4 (b3) and noticed there is a new marshal
protocol version 3.
The documentation is a little silent about the new features, not going into
detail.
I've run a performance test with the new protocol version and noticed the
new version is two times slower in serialization than version 2. I tested
it with a simple value tuple in a list (50 elements).
Nothing special. (happens only if the tuple contains also a tuple)
Copy of the test code:
from time import time
from marshal import dumps
def genData(amount=50):
    for i in range(amount):
        yield (i, i+2, i*2, (i+1,i+4,i,4), "my string template %s" % i, 1.01*i,
               True)
data = list(genData())
print(len(data))
t0 = time()
result = dumps(data, 2)
t1 = time()
print("duration p2: %f" % (t1-t0))
t0 = time()
result = dumps(data, 3)
t1 = time()
print("duration p3: %f" % (t1-t0))
Is the overhead for the recursion detection so high ?
Note this happens only if there is a tuple in the tuple of the datalist.
Regards,
Wolfgang
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 27.01.2014 13:12, Antoine Pitrou wrote:
> On Mon, 27 Jan 2014 04:01:02 -0800
> Larry Hastings wrote:
>>
>> On 01/27/2014 01:39 AM, Antoine Pitrou wrote:
>> > On Sun, 26 Jan 2014 21:01:08 -0800
>> > Larry Hastings wrote:
>> >> On 01/26/2014 08:40 PM, Alexander Belopolsky wrote:
>> >>> On Sun, Jan 26, 2014 at 11:26 PM, Vajrasky Kok wrote:
>> >>>
>> >>>     In case we are taking "not backporting anything at all" road,
>> >>>     what is the best fix for the document?
>> >>>
>> >>> I would say no fix is needed for this doc because the signature
>> >>> suggests (correctly) that passing times by keyword is not supported.
>> >> Where does it do that?
>> > In the "[,times]" spelling, which is the spelling customarily used for
>> > positional-only arguments.
>>
>> That's not my experience.
>
> But it's mine :-) (try "help(str)" or "help(list)")

It's also the convention we've been using for the docs.

Georg
Re: [Python-Dev] News from asyncio
On Mon, Jan 27, 2014 at 5:21 AM, Victor Stinner wrote:
> - asyncio.IncompleteReadError.expected is the total expected size, not
>   the remaining size

Why not be consistent with the meaning of
http.client.IncompleteRead.expected? The current meaning can be recovered
via len(e.partial) + e.expected.

-- Devin
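To make the two conventions concrete, a small sketch follows. It assumes, as stated in the message above, that asyncio's "expected" is the total size while http.client's "expected" is the number of bytes still missing; the byte counts are invented for illustration.

```python
import asyncio
import http.client

# Hypothetical situation: 3 bytes arrived out of 10 wanted in total.
partial = b"abc"

# asyncio convention: "expected" holds the total number of bytes wanted.
aio_err = asyncio.IncompleteReadError(partial, 10)
remaining = aio_err.expected - len(aio_err.partial)

# http.client convention: "expected" holds the number of bytes still missing.
http_err = http.client.IncompleteRead(partial, 7)
total = len(http_err.partial) + http_err.expected

print("remaining (from asyncio error):", remaining)
print("total (from http.client error):", total)
```

Either value can be derived from the other as long as the partial data is available on the exception, which is the point made in the reply.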
Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)
Thanks Victor for improving this.

I also have to note that version 3 is slower only in the tuple-in-tuple
case. If you use a flat tuple, it is faster than version 2. So I asked
about this corner case, thinking that the recursion detection or something
else has a huge cost.

For pyc files, I think the highest available version is the default.
I didn't know about version 4; it is not mentioned anywhere in the docs.
I also figured out that every integer is accepted as the protocol version.
But that was useful for tests against 3.3 and 2.7. :-)

On Mon, Jan 27, 2014 at 5:02 PM, Brett Cannon wrote:
> On Mon, Jan 27, 2014 at 10:42 AM, Paul Moore wrote:
>> On 27 January 2014 15:35, Victor Stinner wrote:
>> > Version 2 is the fastest in Python 3.3 and 3.4, but version 4 with
>> > Python 3.4 produces the smallest file.
>>
>> Which version is used when creating pyc files? This benchmark might
>> suggest that version 2 is the best...
>
> Importlib just uses the default:
> http://hg.python.org/cpython/file/dbad4564cd12/Lib/importlib/_bootstrap.py#l671

--
bye by Wolfgang
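The flat-versus-nested observation can be checked with a sketch like the one below. The helper names make_rows and time_dumps are hypothetical, the row layout is borrowed from the test code earlier in the thread, and the timings will vary by machine and Python version, so no particular outcome is claimed.

```python
import marshal
import timeit

def make_rows(nested, amount=50):
    """Build rows like the thread's test data, with or without an inner tuple."""
    rows = []
    for i in range(amount):
        inner = (i + 1, i + 4, i, 4)
        tail = ("my string template %s" % i, 1.01 * i, True)
        if nested:
            rows.append((i, i + 2, i * 2, inner) + tail)
        else:
            # Same values, but spliced into one flat tuple.
            rows.append((i, i + 2, i * 2) + inner + tail)
    return rows

def time_dumps(data, version, number=200):
    """Best of 5 timings, in seconds, for `number` calls to marshal.dumps."""
    timer = timeit.repeat(lambda: marshal.dumps(data, version),
                          number=number, repeat=5)
    return min(timer)

results = {}
for label, nested in (("flat", False), ("nested", True)):
    data = make_rows(nested)
    for version in (2, 3):
        results[(label, version)] = time_dumps(data, version)
        print("%-6s v%d: %.3f ms" % (label, version,
                                     results[(label, version)] * 1e3))
```

Taking the minimum over several repeats follows the approach used in the benchmark posted earlier in this thread.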
Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)
On 27.01.14 17:35, Victor Stinner wrote:
> Python 3.4 has two new versions: 3 and 4. The version 3 "shares common
> object references", the version 4 adds short tuples and short strings
> (produce smaller files).

Why do we need two new versions added in one Python release?
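A rough way to observe the "shares common object references" behaviour is to serialize a list that refers to the same object many times. This is only a sketch on CPython 3.4 or later; the data is made up, and exact byte counts depend on the interpreter.

```python
import marshal

# One tuple object, referenced 100 times from the same list.
shared = (1.5, 2.5, 3.5)
data = [shared] * 100

# Compare serialized sizes for the older format and the two new ones.
sizes = {}
for version in (2, 3, 4):
    sizes[version] = len(marshal.dumps(data, version))
    print("version %d: %d bytes" % (version, sizes[version]))

# The data should round-trip regardless of the version used to write it.
restored = marshal.loads(marshal.dumps(data, 3))
```

With version 2 the tuple is written out once per occurrence, whereas version 3 can emit a back-reference for the repeats, so the version 3 output should be noticeably smaller for this kind of data.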
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, Jan 27, 2014 at 9:13 PM, Larry Hastings wrote:
>
> I apologize for not making myself clear. But that's part of what I meant,
> yes: we should preserve the existing behavior of times=-1 when passed in by
> position or by keyword. However, we should *also* add a deprecation warning
> when passing times=-1 by keyword, suggesting that they use times=None
> instead. The idea is that we could eventually remove the PyTuple_Size check
> and make times=-1 always behave like times=0. In practice it'd be okay with
> me if we never did, or at least not until Python 4.
>
So we only add deprecation warning to only times=-1 via keyword or for
all negative numbers to times via keyword?
I mean, what about:
>>> from itertools import repeat
>>> list(repeat('a', times=-2))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
Deprecation warning or not?
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, Jan 27, 2014 at 8:29 PM, Antoine Pitrou wrote:
>
> Sure, just adjust the number to fit the available memory (here, 2**29
> does the trick).
>
I get your point. But strangely enough, I can still recover from
list(repeat('a', 2**29)). It only slows down my computer. I can ^Z the
application then kill it later. But with list(repeat('a', times=-1)),
rebooting the machine is compulsory.
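When experimenting with these cases, capping the iterator avoids the runaway allocation altogether. The helper below is a hypothetical sketch (bounded_repeat and its limit parameter are not part of itertools); it passes times positionally, where negative values mean zero repetitions.

```python
from itertools import islice, repeat

def bounded_repeat(obj, times=None, limit=1000):
    """Hypothetical helper: repeat obj, but never yield more than `limit` items."""
    # Pass times positionally so that a negative value means "zero times".
    source = repeat(obj) if times is None else repeat(obj, times)
    return islice(source, limit)

print(list(bounded_repeat("a", 3)))           # three items
print(len(list(bounded_repeat("a"))))         # capped at the default limit
print(len(list(bounded_repeat("a", 2**29, limit=5))))
```

Because islice stops pulling items once the limit is reached, materializing the result with list() stays cheap even if the underlying repeat is endless.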
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 01/27/2014 06:00 PM, Vajrasky Kok wrote:
> On Mon, Jan 27, 2014 at 9:13 PM, Larry Hastings wrote:
>> I apologize for not making myself clear. But that's part of what I meant,
>> yes: we should preserve the existing behavior of times=-1 when passed in by
>> position or by keyword. However, we should *also* add a deprecation warning
>> when passing times=-1 by keyword, suggesting that they use times=None
>> instead. The idea is that we could eventually remove the PyTuple_Size check
>> and make times=-1 always behave like times=0. In practice it'd be okay with
>> me if we never did, or at least not until Python 4.
> So we only add deprecation warning to only times=-1 via keyword or for
> all negative numbers to times via keyword?
> I mean, what about:
> from itertools import repeat
> list(repeat('a', times=-2))
I should have been even *more* precise! When I said "times=-1" I really
meant all negative numbers. (I was trying to abbreviate it as -1, as my
text was already too long and unwieldy.)
I propose the logic be equivalent to this, handwaving away boilerplate
error handling for clarity (the real implementation would handle
PyArg_ParseTupleAndKeywords or PyLong_AsSsize_t failing):
    PyObject *element, *times = Py_None;
    Py_ssize_t cnt;
    PyArg_ParseTupleAndKeywords(args, kwargs, "O|O:repeat", kwlist,
                                &element, &times);
    if times == Py_None
        cnt = -1
    else
        cnt = PyLong_AsSsize_t(times)
        if cnt < 0
            if "times" was passed by keyword
                issue DeprecationWarning, "use repeat(o, times=None) to
                    repeat indefinitely"
            else
                cnt = 0
(For those of you who aren't familiar with the source: "cnt" is the
internal variable used to set the repeat count of the iterator. If
"cnt" is < 0, the iterator repeats forever.)
If in the future we actually removed the deprecated behavior, the last
"if" block would change simply to
    if cnt < 0
        cnt = 0
//arry/
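For readers more comfortable with Python than with the C API, the proposed logic could be sketched roughly as follows. This is an illustration only: proposed_repeat is a hypothetical name, the keyword detection via **kwargs stands in for the C-level check, and nothing here reflects what was eventually committed.

```python
import warnings
from itertools import repeat as _repeat

def proposed_repeat(element, *args, **kwargs):
    """Pure-Python sketch of the argument handling proposed above."""
    by_keyword = "times" in kwargs
    if len(args) > 1 or (args and by_keyword):
        raise TypeError("repeat() takes an element and an optional times")
    times = args[0] if args else kwargs.pop("times", None)
    if kwargs:
        raise TypeError("unexpected keyword arguments: %r" % sorted(kwargs))

    if times is None:
        return _repeat(element)          # repeat forever
    if times < 0:
        if by_keyword:
            # Legacy behaviour is kept for now, but flagged as deprecated.
            warnings.warn("use repeat(o, times=None) to repeat indefinitely",
                          DeprecationWarning, stacklevel=2)
            return _repeat(element)
        times = 0                        # negative positional value: no items
    return _repeat(element, times)
```

Under this sketch, proposed_repeat('a', -2) yields nothing, while proposed_repeat('a', times=-2) still repeats forever but emits a DeprecationWarning. Removing the deprecated behaviour later would amount to deleting the by_keyword branch.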
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, Jan 27, 2014 at 9:13 PM, Larry Hastings wrote:
>
>
> While it's a bug, it's a very minor bug. As Python 3.4 release manager, my
> position is: Python 3.4 is in beta, so let's not change semantics for
> purity's sakes now. I'm -0.5 on adding times=None right now, and until we
> do we can't deprecate the old behavior.
>
>
I bow to your decision, Larry. So I believe the doc fix is required
then. I propose these for doc fix:
1. Keeps the status quo
=======================
>>> repeat.__doc__
'repeat(object [,times]) -> create an iterator which returns the
object\nfor the specified number of times. If not specified, returns
the object\nendlessly.'
We don't explain the meaning of negative `times`. Well, people
shouldn't repeat with negative times, because a statement such as "Kids,
repeat the push-up negative two times more" does not make sense.
2. Explains the negative times, ignores the keyword
===================================================
>>> repeat.__doc__
'repeat(object [,times]) -> create an iterator which returns the
object\nfor the specified number of times. If not specified, returns
the object\nendlessly.
Negative times means zero repetitions.'
The signature repeat(object [,times]) suggests this function does not
accept keywords, as some core developers have stated. So if the user
uses a keyword with this function, well, it's too bad for them.
3. Explains the negative times, warns about keyword
===================================================
>>> repeat.__doc__
'repeat(object [,times]) -> create an iterator which returns the
object\nfor the specified number of times. If not specified, returns
the object\nendlessly.
Negative times means zero repetitions. This function accepts keyword
argument but the behaviour is buggy and should be avoided.'
4. Explains everything
======================
>>> repeat.__doc__
'repeat(object [,times]) -> create an iterator which returns the
object\nfor the specified number of times. If not specified, returns
the object\nendlessly.
Negative times means zero repetitions via positional-only arguments.
-1 value for times via keyword means endless repetitions and is same
as omitting times argument and other negative number for times means
endless repetitions as well but with different implementation.'
If you are wondering about the last statement:
>>> from itertools import repeat
>>> list(repeat('a', times=-4))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
>>> a = repeat('a', times=-4)
>>> next(a)
'a'
>>> next(a)
'a'
>>> a = repeat('a', times=-1)
>>> next(a)
'a'
>>> next(a)
'a'
>>> list(repeat('a', times=-1))
... freezes your computer ...
Which one is better? Once we settle this, I can think about the doc
fix for Doc/library/itertools.rst.
Vajrasky
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 01/27/2014 06:26 PM, Vajrasky Kok wrote:
> So I believe the doc fix is required then.

I propose the docstring should describe only supported behavior, and the
docs in the manual should mention the unsupported behavior. However,
I'm interested in Raymond's take, as he's the original author of
itertools.repeat.

If I were writing it, it might well come out like this:

docstring:

    repeat(object [,times]) -> iterator

    Return an iterator which yields the object for the specified number
    of times. If times is unspecified, yields the object forever. If
    times is negative, behave as if times is 0.

documentation:

    repeat(object [,times]) -> iterator

    Return an iterator which yields the object for the specified number
    of times. If times is unspecified, yields the object forever. If
    times is negative, behave as if times is 0. Equivalent to:

        def repeat(object, times=None):
            # repeat(10, 3) --> 10 10 10
            if times is None:
                while True:
                    yield object
            else:
                for i in range(times):
                    yield object

    A common use for repeat is to supply a stream of constant values to
    map or zip:

        >>> list(map(pow, range(10), repeat(2)))
        [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

    .. note: if "times" is specified using a keyword argument, and
       provided with a negative value, repeat yields the object forever.
       This is a bug, its use is unsupported, and this behavior may be
       removed in a future version of Python.

//arry/
Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)
Hi there.
I think you should modify your program to marshal (and load) a compiled
module. This is where the optimizations in versions 3 and 4 become
important.
K

> -----Original Message-----
> From: Python-Dev On Behalf Of Victor Stinner
> Sent: Monday, January 27, 2014 23:35
> To: Wolfgang
> Cc: Python-Dev
> Subject: Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3
> protocol)
>
> Hi,
>
> I'm surprised: marshal.dumps() doesn't raise an error if you pass an
> invalid version. In fact, Python 3.3 only supports versions 0, 1 and 2.
> If you pass 3, it will use the version 2. (Same apply for version 99.)
>
> Python 3.4 has two new versions: 3 and 4. The version 3 "shares common
> object references", the version 4 adds short tuples and short strings
> (produce smaller files).
>
> It would be nice to document the differences between marshal versions.
>
> And what do you think of raising an error if the version is unknown in
> marshal.dumps()?
>
> I modified your benchmark to test also loads() and run the benchmark
> 10 times.
>
> Results:
> ---
> Python 3.3.3+ (3.3:50aa9e3ab9a4, Jan 27 2014, 16:11:26)
> [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux
>
> dumps v0: 391.9 ms
> data size v0: 45582.9 kB
> loads v0: 616.2 ms
>
> dumps v1: 384.3 ms
> data size v1: 45582.9 kB
> loads v1: 594.0 ms
>
> dumps v2: 153.1 ms
> data size v2: 41395.4 kB
> loads v2: 549.6 ms
>
> dumps v3: 152.1 ms
> data size v3: 41395.4 kB
> loads v3: 535.9 ms
>
> dumps v4: 152.3 ms
> data size v4: 41395.4 kB
> loads v4: 549.7 ms
> ---
>
> And:
> ---
> Python 3.4.0b3+ (default:dbad4564cd12, Jan 27 2014, 16:09:40)
> [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux
>
> dumps v0: 389.4 ms
> data size v0: 45582.9 kB
> loads v0: 564.8 ms
>
> dumps v1: 390.2 ms
> data size v1: 45582.9 kB
> loads v1: 545.6 ms
>
> dumps v2: 165.5 ms
> data size v2: 41395.4 kB
> loads v2: 470.9 ms
>
> dumps v3: 425.6 ms
> data size v3: 41395.4 kB
> loads v3: 528.2 ms
>
> dumps v4: 369.2 ms
> data size v4: 37000.9 kB
> loads v4: 550.2 ms
> ---
>
> Version 2 is the fastest in Python 3.3 and 3.4, but version 4 with
> Python 3.4 produces the smallest file.
>
> Victor
>
> 2014-01-27 Wolfgang :
> > Hi,
> >
> > I tested the latest beta from 3.4 (b3) and noticed there is a new
> > marshal protocol version 3.
> > The documentation is a little silent about the new features, not going
> > into detail.
> >
> > I've run a performance test with the new protocol version and noticed
> > the new version is two times slower in serialization than version 2. I
> > tested it with a simple value tuple in a list (50 elements).
> > Nothing special. (happens only if the tuple contains also a tuple)
> >
> > Copy of the test code:
> >
> > from time import time
> > from marshal import dumps
> >
> > def genData(amount=50):
> >     for i in range(amount):
> >         yield (i, i+2, i*2, (i+1,i+4,i,4), "my string template %s" % i,
> >                1.01*i, True)
> >
> > data = list(genData())
> > print(len(data))
> > t0 = time()
> > result = dumps(data, 2)
> > t1 = time()
> > print("duration p2: %f" % (t1-t0))
> > t0 = time()
> > result = dumps(data, 3)
> > t1 = time()
> > print("duration p3: %f" % (t1-t0))
> >
> > Is the overhead for the recursion detection so high ?
> >
> > Note this happens only if there is a tuple in the tuple of the datalist.
> >
> > Regards,
> >
> > Wolfgang
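The suggestion to marshal a compiled module could be tried along these lines. This is a minimal sketch: the module source is invented for illustration, and the byte counts it prints depend on the interpreter version, so none are claimed here.

```python
import marshal

# Hypothetical module source, compiled the way an import would compile it.
source = '''
def greet(name):
    return "hello, %s" % name

GREETING = greet("world")
'''
code = compile(source, "<example>", "exec")

# Serialize the code object with each version and check that it still runs.
for version in (2, 3, 4):
    blob = marshal.dumps(code, version)
    print("version %d: %d bytes" % (version, len(blob)))
    namespace = {}
    exec(marshal.loads(blob), namespace)
```

Code objects contain many repeated names and small constants, which is where reference sharing and the short string and tuple encodings are expected to pay off, unlike the list-of-tuples data used in the original benchmark.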
