Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, 27 Jan 2014 14:02:29 +1000 Nick Coghlan ncogh...@gmail.com wrote:
> If we do go this path, then we should backport the full fix (i.e. accepting None to indicate repeating forever), rather than just a partial fix. That is, I'm OK with either not backporting anything at all, or backporting the full change. The only idea I object to is the one of removing the infinite iteration capability without providing a replacement spelling for it.

I would say not backport at all. The security threat is highly theoretical. If someone blindly accepts user values for repeat(), the user value can just as well be a very large positive number with similar effects (e.g. 2**31).

Regards

Antoine.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Sun, 26 Jan 2014 21:01:08 -0800 Larry Hastings la...@hastings.org wrote:
> On 01/26/2014 08:40 PM, Alexander Belopolsky wrote:
>> On Sun, Jan 26, 2014 at 11:26 PM, Vajrasky Kok sky@speaklikeaking.com wrote:
>>> If we take the "not backporting anything at all" road, what is the best fix for the documentation?
>> I would say no fix is needed for this doc because the signature suggests (correctly) that passing times by keyword is not supported.
> Where does it do that?

In the [,times] spelling, which is the spelling customarily used for positional-only arguments.

Regards

Antoine.
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 27/01/2014 01:52, Nick Coghlan wrote:
> In 3.5, that will be passing None, rather than -1. For those proposing to change the maintenance releases, how should a user relying on this misbehaviour update their code to handle it?

I'm -1 on using None. The code currently rejects anything except an int. The docs don't say anything about using None, except in the "equivalent to" section, which is also the only place where it looks as if times can be a keyword argument.

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
[Python-Dev] News from asyncio
Hi,

I'm working for eNovance on the asyncio module. The goal is to use it in the huge OpenStack project (2.5 million lines of code), which currently uses eventlet. I'm trying to fix the remaining issues in the asyncio module before Python 3.4 final.

The asyncio project is very active, but discussions are split between its own dedicated mailing list (the python-tulip Google group), the Tulip bug tracker and the Python bug tracker. Please join the Tulip mailing list if you are interested in contributing.

http://code.google.com/p/tulip/

I would like to share the status of the module with you. Many bugs have been fixed recently. I suppose that new bugs are being found because new developers started to play with asyncio after Python 3.4 beta 1.

asyncio issues fixed in Python 3.4 beta 3, in no particular order:

- I wrote most of the asyncio documentation, please help me to improve it! I tried to add many short examples, each one explaining a single feature or concept (e.g. callbacks, signals, futures, etc.): http://docs.python.org/dev/library/asyncio.html
- Character devices (TTY/PTY) are now supported, which is useful to control a real terminal (not pipes) for subprocesses. On Mac OS X older than Mavericks (10.9), the SelectSelector should be used instead of KqueueSelector (kqueue didn't support character devices).
- Tulip #111: StreamReader.readexactly() now raises an IncompleteReadError if the end of the stream is reached before we have received enough bytes, instead of returning fewer bytes than requested.
- Python #20311: asyncio had a performance issue related to the resolution of selectors and clocks. For example, selectors.EpollSelector has a resolution of 1 millisecond (10^-3), whereas asyncio uses arbitrary timestamps. The issue was fixed by adding a resolution attribute to selectors and a private granularity attribute to asyncio.BaseEventLoop, and by using the granularity in the asyncio event loop to round times.
- New Task.current_task() class method.
- Guido wrote a web crawler, see examples/crawl.py in Tulip.
- More symbols are exported in the main asyncio module (e.g. Queue, iscoroutine(), etc.).
- Charles-François improved the signal handlers: the SA_RESTART flag is now set to limit EINTR errors in syscalls.
- Some optimizations (e.g. don't call logger.log() when it's not needed).
- Many bugfixes.
- (Sorry if I forgot other changes, see also the Tulip history and the history of the asyncio module in Python.)

I would also like to change asyncio to support a stream-like API for subprocesses, see Tulip issue #115 (and Python issue #20400):

http://code.google.com/p/tulip/issues/detail?id=115

I ported asyncio to Python 2.6 and 2.7, because today OpenStack only uses these Python versions. I created a new project called Trollius (instead of Tulip) because the syntax is a little bit different: "yield from" becomes "yield", and "return x" becomes "raise Return(x)".

https://bitbucket.org/enovance/trollius
https://pypi.python.org/pypi/trollius

If you are interested in the OpenStack part, see my blueprint (something similar to PEPs, but for smaller changes) for Oslo Messaging:

https://wiki.openstack.org/wiki/Oslo/blueprints/asyncio

There is an ongoing effort to port OpenStack to Python 3, and eNovance is also working on the port:

https://wiki.openstack.org/wiki/Python3

Victor
Re: [Python-Dev] News from asyncio
On Mon, 27 Jan 2014 10:45:37 +0100 Victor Stinner victor.stin...@gmail.com wrote:
> - Tulip #111: StreamReader.readexactly() now raises an IncompleteReadError if the end of the stream is reached before we have received enough bytes, instead of returning fewer bytes than requested.

Why not simply EOFError?

Regards

Antoine.
Re: [Python-Dev] News from asyncio
2014-01-27 Antoine Pitrou solip...@pitrou.net:
> On Mon, 27 Jan 2014 10:45:37 +0100 Victor Stinner victor.stin...@gmail.com wrote:
>> - Tulip #111: StreamReader.readexactly() now raises an IncompleteReadError if the end of the stream is reached before we have received enough bytes, instead of returning fewer bytes than requested.
> Why not simply EOFError?

IncompleteReadError has two additional attributes:

- partial: the incomplete received bytes
- expected: the total number of expected bytes (the n parameter of readexactly)

I prefer to use a different exception to ensure that these attributes are present. I don't like having to check hasattr(exc, ...).

Before this change, readexactly(n) returned the partial received bytes if the end of the stream was reached.

Victor
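The behaviour described above can be tried out with a short sketch. It assumes a current Python where asyncio supports async/await and asyncio.run() (the thread itself predates that syntax and used "yield from"); the helper name read_exactly_or_report is made up for this illustration.

```python
import asyncio


async def read_exactly_or_report(reader, n):
    """Hypothetical helper: return ('ok', data) or ('short', partial, expected)."""
    try:
        data = await reader.readexactly(n)
    except asyncio.IncompleteReadError as exc:
        # Both attributes are guaranteed on this exception type,
        # so no hasattr() check is needed.
        return ("short", exc.partial, exc.expected)
    return ("ok", data)


async def demo():
    reader = asyncio.StreamReader()
    reader.feed_data(b"abc")  # only 3 bytes arrive...
    reader.feed_eof()         # ...and then the stream ends
    return await read_exactly_or_report(reader, 10)


result = asyncio.run(demo())
print(result)
```

Because the exception carries the partial data, the caller can still decide what to do with the bytes that did arrive.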
Re: [Python-Dev] News from asyncio
On 27 January 2014 10:55, Victor Stinner victor.stin...@gmail.com wrote:
> 2014-01-27 Antoine Pitrou solip...@pitrou.net:
>> On Mon, 27 Jan 2014 10:45:37 +0100 Victor Stinner victor.stin...@gmail.com wrote:
>>> - Tulip #111: StreamReader.readexactly() now raises an IncompleteReadError if the end of the stream is reached before we have received enough bytes, instead of returning fewer bytes than requested.
>> Why not simply EOFError?
> IncompleteReadError has two additional attributes:
> - partial: the incomplete received bytes
> - expected: the total number of expected bytes (the n parameter of readexactly)
> I prefer to use a different exception to ensure that these attributes are present. I don't like having to check hasattr(exc, ...).
> Before this change, readexactly(n) returned the partial received bytes if the end of the stream was reached.

I had the same doubt. Note also that IncompleteReadError is a subclass of EOFError, so you can catch EOFError if you like.

--
Gustavo J. A. M. Carneiro
Gambit Research LLC
"The universe is always one step beyond logic." -- Frank Herbert
Re: [Python-Dev] News from asyncio
2014-01-27 Gustavo Carneiro gjcarne...@gmail.com:
>>> Why not simply EOFError?
>> IncompleteReadError has two additional attributes:
>> - partial: the incomplete received bytes
>> - expected: the total number of expected bytes (the n parameter of readexactly)
>> I prefer to use a different exception to ensure that these attributes are present. I don't like having to check hasattr(exc, ...).
>> Before this change, readexactly(n) returned the partial received bytes if the end of the stream was reached.
> I had the same doubt. Note also that IncompleteReadError is a subclass of EOFError, so you can catch EOFError if you like.

Oops, I forgot to mention that :-) I just documented the new IncompleteReadError in the asyncio doc.

Victor
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 01/27/2014 01:39 AM, Antoine Pitrou wrote:
> On Sun, 26 Jan 2014 21:01:08 -0800 Larry Hastings la...@hastings.org wrote:
>> On 01/26/2014 08:40 PM, Alexander Belopolsky wrote:
>>> On Sun, Jan 26, 2014 at 11:26 PM, Vajrasky Kok sky@speaklikeaking.com wrote:
>>>> If we take the "not backporting anything at all" road, what is the best fix for the documentation?
>>> I would say no fix is needed for this doc because the signature suggests (correctly) that passing times by keyword is not supported.
>> Where does it do that?
> In the [,times] spelling, which is the spelling customarily used for positional-only arguments.

That's not my experience. It's very common--in fact I believe more common--for functions that only accept positional arguments to *not* use the square-brackets-for-optional-parameters convention. The square-brackets-for-optional-parameters convention is not legal Python syntax, so I observe that documentation authors avoid it when they can, preferring to express their function's signature in real Python.

As an example, consider heapq.nlargest(n, iterable, key=None). The implementation uses PyArg_ParseTuple to parse its parameters, and therefore does not accept keyword arguments. But--no square brackets.

My experience is that the doc convention of square-brackets-for-optional-parameters is primarily used in two circumstances: one, when doing something really crazy like optional groups, and two, when the default value of one of the function's parameters is inconvenient to specify as a Python value. Of these two the second is far more common.

An example of this latter case is zlib.compressobj(). The documentation shows its last parameter as [, zdict]. However, the implementation uses PyArg_ParseTupleAndKeywords(), and therefore accepts keyword arguments.

Furthermore, this notation simply cannot be used for functions that have only required parameters. You can't look at the constructor for memoryview(object) and determine whether or not it accepts keyword arguments. (It does.)

There seems to be no strong correlation between functions that only accept positional-only parameters and functions whose documentation uses square-brackets-for-optional-parameters. Indeed, this is one of the things that can be frustrating about Python, which is why I hope we can make Python 3.5 more predictable in this area.

//arry/
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, 27 Jan 2014 04:01:02 -0800 Larry Hastings la...@hastings.org wrote:
> On 01/27/2014 01:39 AM, Antoine Pitrou wrote:
>> On Sun, 26 Jan 2014 21:01:08 -0800 Larry Hastings la...@hastings.org wrote:
>>> On 01/26/2014 08:40 PM, Alexander Belopolsky wrote:
>>>> On Sun, Jan 26, 2014 at 11:26 PM, Vajrasky Kok sky@speaklikeaking.com wrote:
>>>>> If we take the "not backporting anything at all" road, what is the best fix for the documentation?
>>>> I would say no fix is needed for this doc because the signature suggests (correctly) that passing times by keyword is not supported.
>>> Where does it do that?
>> In the [,times] spelling, which is the spelling customarily used for positional-only arguments.
> That's not my experience.

But it's mine :-) (try help(str) or help(list))

That said, it's fair to say that whatever convention there is isn't very strictly followed on this particular point.

Regards

Antoine.
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, Jan 27, 2014 at 5:38 PM, Antoine Pitrou solip...@pitrou.net wrote:
> I would say not backport at all. The security threat is highly theoretical. If someone blindly accepts user values for repeat(), the user value can just as well be a very large positive number with similar effects (e.g. 2**31).

I cannot comment on whether this is a security issue or not. But the effect of a large positive number is not similar to the effect of unlimited repetitions.

    >>> from itertools import repeat
    >>> list(repeat('a', 2**31))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    MemoryError
    >>> list(repeat('a', 2**99))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    OverflowError: Python int too large to convert to C ssize_t
    >>> list(repeat('a', times=-1))
    ...this freezes my computer...

That is why I prefer that we backport the fix (either partial or full). If not, giving a big warning in the documentation should suffice.
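For anyone who wants to poke at these cases without risking a frozen interpreter, one defensive option is to cap consumption with itertools.islice. This is only a sketch; the helper name take is an assumption made for this example, not something from itertools.

```python
from itertools import islice, repeat


def take(iterable, limit):
    """Return at most `limit` items, so an endless iterator cannot hang the caller."""
    return list(islice(iterable, limit))


forever = take(repeat('a'), 3)       # no times argument: endless, capped at 3 items
twice = take(repeat('a', 2), 3)      # a finite repeat shorter than the cap
negative = take(repeat('a', -1), 3)  # negative count passed positionally

print(forever, twice, negative)
```

The cap bounds both time and memory, regardless of how the times argument ends up being interpreted.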
Re: [Python-Dev] News from asyncio
27.01.14 12:55, Victor Stinner wrote:
> IncompleteReadError has two additional attributes:
> - partial: the incomplete received bytes
> - expected: the total number of expected bytes (the n parameter of readexactly)

This looks similar to http.client.IncompleteRead.
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 01/27/2014 01:47 AM, Mark Lawrence wrote:
> On 27/01/2014 01:52, Nick Coghlan wrote:
>> In 3.5, that will be passing None, rather than -1. For those proposing to change the maintenance releases, how should a user relying on this misbehaviour update their code to handle it?
> I'm -1 on using None. The code currently rejects anything except an int. The docs don't say anything about using None, except in the "equivalent to" section, which is also the only place where it looks as if times can be a keyword argument.

The docs describe the signature of itertools.repeat twice: the first time as its heading ("itertools.repeat(object[, times])"), the second time as an example implementation asserted to be equivalent to Python's implementation. These two signatures are not identical, but they are compatible.

You state that we should pay attention to the first and ignore the second. How did you arrive at that conclusion?

Also, you say something strange like "which is also the only place where it looks as if times can be a keyword argument". I don't see the point in debating whether or not times *looks* like it can be a keyword argument. itertools.repeat() has accepted keyword arguments since 2.7.

The code currently has semantics that cannot be accurately represented in a Python signature. We could do one of three things:

1) Do nothing, and don't allow inspect.Signature to produce a signature for the function. This is the status quo.

2) Change the semantics of the function in a non-backwards-compatible way so that we can represent its signature accurately with an inspect.Signature object. For example, changing the function so that providing times=-1 as a keyword argument behaves the same as providing times=-1 as a positional-only argument would be such an incompatible change. Another would be changing the function to not accept keyword arguments at all.

3) Change the semantics of the function in a backwards-compatible way so that we can represent its supported signature accurately with an inspect.Signature object. Allow continued use of the old semantics for a full deprecation cycle (two major versions) if not longer. For example, changing the times argument to have a default of None, and changing the logic so that times=None means it repeats forever, would be such an approach.

For 3.3 and 3.4, I suggest that only 1) makes sense.

For 3.5 I prefer 3), specifically the times=None approach, as that's how the function has been documented as working since the itertools module was first introduced in 2.3. And I see functions having accurate signatures as a good thing.

I'm against 2), as I'm against removing functionality simply for purity's sake. Removing functionality breaks code. So it's best reserved for critical problems like security issues. I cite the thread we just had in python-dev, subject line "Deprecation policy", as an excellent discussion and summary of this topic.

//arry/
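The times=None semantics of option 3) can be sketched in pure Python, along the lines of the "equivalent to" example in the itertools docs that this thread keeps referring to. The name repeat_sketch is chosen here only to avoid shadowing the real function; this is an illustration, not the C implementation.

```python
from itertools import islice


def repeat_sketch(obj, times=None):
    """Pure-Python illustration: times=None repeats forever, otherwise `times` items."""
    if times is None:
        while True:
            yield obj
    else:
        # range() of a negative number is empty, so negative counts yield nothing,
        # whether passed positionally or by keyword.
        for _ in range(times):
            yield obj


print(list(islice(repeat_sketch('a'), 3)))
print(list(repeat_sketch('a', times=2)))
print(list(repeat_sketch('a', times=-1)))
```

With this shape the signature (obj, times=None) describes the behaviour completely, which is the property the inspect.Signature discussion is after.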
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, 27 Jan 2014 20:22:53 +0800 Vajrasky Kok sky@speaklikeaking.com wrote:
>     >>> from itertools import repeat
>     >>> list(repeat('a', 2**31))
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     MemoryError

Sure, just adjust the number to fit the available memory (here, 2**29 does the trick).

Regards

Antoine.
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 27 January 2014 22:29, Antoine Pitrou solip...@pitrou.net wrote:
> On Mon, 27 Jan 2014 20:22:53 +0800 Vajrasky Kok sky@speaklikeaking.com wrote:
>>     >>> from itertools import repeat
>>     >>> list(repeat('a', 2**31))
>>     Traceback (most recent call last):
>>       File "<stdin>", line 1, in <module>
>>     MemoryError
> Sure, just adjust the number to fit the available memory (here, 2**29 does the trick).

And for anyone interested in why a sufficiently large positive value that won't fit in available RAM fails gracefully with MemoryError:

    >>> repeat('a', 2**31).__length_hint__()
    2147483648
    >>> repeat('a', -1).__length_hint__()
    0

list() uses __length_hint__() for preallocation, so a sufficiently large length hint means the preallocation attempt fails with MemoryError. As Antoine showed though, you still can't feed it untrusted data, because a large enough value that just fits into RAM can still cause you a lot of grief.

Everything points to times=-1 behaving as it does being a bug, but not a sufficiently critical one to risk breaking working code in a maintenance release. That makes deprecating the current behaviour of times=-1 and accepting times=None in 3.5 the least controversial course of action.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
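The same hints can be read through the public operator.length_hint() function (available since Python 3.4) rather than by calling the dunder method directly. A small sketch, using a modest count so that it runs on any platform:

```python
from itertools import repeat
from operator import length_hint

# A finite repeat advertises how many items it will produce.
finite_hint = length_hint(repeat('a', 1000))

# A negative count passed positionally is treated as zero items.
negative_hint = length_hint(repeat('a', -1))

print(finite_hint, negative_hint)
```

The hint is only advisory: list() uses it to size its initial allocation, so a small or zero hint says nothing about how long iteration will actually run.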
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, Jan 27, 2014 at 04:29:04AM -0800, Larry Hastings wrote:
> The code currently has semantics that cannot be accurately represented in a Python signature. We could do one of three things:
>
> 1) Do nothing, and don't allow inspect.Signature to produce a signature for the function. This is the status quo.
>
> 2) Change the semantics of the function in a non-backwards-compatible way so that we can represent its signature accurately with an inspect.Signature object. For example, changing the function so that providing times=-1 as a keyword argument behaves the same as providing times=-1 as a positional-only argument would be such an incompatible change. Another would be changing the function to not accept keyword arguments at all.
>
> 3) Change the semantics of the function in a backwards-compatible way so that we can represent its supported signature accurately with an inspect.Signature object. Allow continued use of the old semantics for a full deprecation cycle (two major versions) if not longer. For example, changing the times argument to have a default of None, and changing the logic so that times=None means it repeats forever, would be such an approach.
>
> For 3.3 and 3.4, I suggest that only 1) makes sense.

Are you rejecting the idea that the current behaviour is an out-and-out bug, and therefore that fixing these things can and should occur in a bug-fix release?

As far as I can see, the only piece of evidence that the given behaviour isn't a bug is that the signature says "object [, times]" rather than "object, times=None". That's not conclusive: I've often written signatures using [ ] to indicate optional arguments without specifying the default value in the signature.

As it stands now, the documentation is internally contradictory. In one part of the documentation, it gives a clear indication that "times is None" should select the repeat forever behaviour. In another part of the documentation, it fails to mention that None is an acceptable value to select the repeat forever behaviour.

> For 3.5 I prefer 3), specifically the times=None approach, as that's how the function has been documented as working since the itertools module was first introduced in 2.3. And I see functions having accurate signatures as a good thing.
>
> I'm against 2), as I'm against removing functionality simply for purity's sake. Removing functionality breaks code. So it's best reserved for critical problems like security issues. I cite the thread we just had in python-dev, subject line "Deprecation policy", as an excellent discussion and summary of this topic.

I'm confused... you seem to be saying that you are *against* changing the behaviour of repeat so that:

    repeat(x, -1)

and

    repeat(x, times=-1)

behave the same. Is that actually what you mean, or have I misunderstood?

Are there any other functions in the standard library where the behaviour differs depending on whether an argument is given positionally or by keyword?

--
Steven
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 01/27/2014 04:56 AM, Steven D'Aprano wrote (rearranged slightly so I could make my points in order):
> I'm confused... you seem to be saying that you are *against* changing the behaviour of repeat so that: repeat(x, -1) and repeat(x, times=-1) behave the same. Is that actually what you mean, or have I misunderstood?

I apologize for not making myself clear. But that's part of what I meant, yes: we should preserve the existing behavior of times=-1 when passed in by position or by keyword. However, we should *also* add a deprecation warning when passing times=-1 by keyword, suggesting that they use times=None instead. The idea is that we could eventually remove the PyTuple_Size check and make times=-1 always behave like times=0. In practice it'd be okay with me if we never did, or at least not until Python 4.

> Are you rejecting the idea that the current behaviour is an out-and-out bug, and therefore that fixing these things can and should occur in a bug-fix release?

While it's a bug, it's a very minor bug. As Python 3.4 release manager, my position is: Python 3.4 is in beta, so let's not change semantics for purity's sake now. I'm -0.5 on adding times=None right now, and until we do we can't deprecate the old behavior.

> Are there any other functions in the standard library where the behaviour differs depending on whether an argument is given positionally or by keyword?

Not that I know of. This instance seems to be purely unintentional; see my latest message on the relevant issue, where I went back and figured out why itertools.repeat behaves like this in the first place:

http://bugs.python.org/issue19145#msg209440

Cheers,

//arry/
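A rough Python-level sketch of that deprecation path follows. Everything in it is hypothetical: repeat_compat is a made-up wrapper, the warning text is invented, and it assumes (as the thread describes for times=-1) that a negative count passed by keyword historically meant "repeat forever" while a negative positional count meant "no items". The real change would live in the C implementation.

```python
import warnings
from itertools import islice, repeat

_MISSING = object()  # sentinel: distinguishes "not passed" from times=None


def repeat_compat(obj, *args, times=_MISSING):
    """Hypothetical wrapper sketching the proposed deprecation cycle."""
    if args:
        if len(args) > 1 or times is not _MISSING:
            raise TypeError("repeat_compat() takes obj and at most one times value")
        # Positional form: a negative count means "no items".
        return repeat(obj, max(args[0], 0))
    if times is _MISSING or times is None:
        return repeat(obj)  # proposed spelling for "repeat forever"
    if times < 0:
        warnings.warn(
            "negative times passed by keyword is deprecated; "
            "use times=None to repeat forever",
            DeprecationWarning,
            stacklevel=2,
        )
        return repeat(obj)  # legacy behaviour kept during the deprecation period
    return repeat(obj, times)


print(list(islice(repeat_compat('a', times=None), 2)))
print(list(repeat_compat('a', -1)))
```

Old callers keep working but are told about the new spelling, which is what makes the eventual switch to "times=-1 behaves like times=0" possible.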
Re: [Python-Dev] News from asyncio
2014-01-27 Serhiy Storchaka storch...@gmail.com:
> 27.01.14 12:55, Victor Stinner wrote:
>> IncompleteReadError has two additional attributes:
>> - partial: the incomplete received bytes
>> - expected: the total number of expected bytes (the n parameter of readexactly)
> This looks similar to http.client.IncompleteRead.

Please read the original issue for more information:

http://code.google.com/p/tulip/issues/detail?id=111

I mentioned the http.client.IncompleteRead exception there. The HTTP exception is similar but also different:

- asyncio.IncompleteReadError inherits from EOFError, not from HTTPException (which inherits from Exception)
- asyncio.IncompleteReadError.expected is the total expected size, not the remaining size

Victor
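The differences listed above can be checked directly against the two classes. A brief sketch; the byte strings and sizes are arbitrary example values:

```python
import asyncio
import http.client

# Class hierarchy: the asyncio exception is an EOFError, the HTTP one is not.
asyncio_is_eof = issubclass(asyncio.IncompleteReadError, EOFError)
http_is_http_exc = issubclass(http.client.IncompleteRead, http.client.HTTPException)
http_is_eof = issubclass(http.client.IncompleteRead, EOFError)

# Both carry `partial` and `expected`, but with different meanings for `expected`:
# total bytes requested (asyncio) versus bytes still missing (http.client).
a_exc = asyncio.IncompleteReadError(b"abc", 10)  # 3 of 10 bytes received
h_exc = http.client.IncompleteRead(b"abc", 7)    # 3 received, 7 more expected

print(asyncio_is_eof, http_is_http_exc, http_is_eof)
print(a_exc.partial, a_exc.expected, h_exc.partial, h_exc.expected)
```

So code that catches EOFError handles the asyncio exception, while the HTTP one needs its own except clause.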
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 27/01/2014 12:56, Steven D'Aprano wrote:
> As it stands now, the documentation is internally contradictory. In one part of the documentation, it gives a clear indication that "times is None" should select the repeat forever behaviour. In another part of the documentation, it fails to mention that None is an acceptable value to select the repeat forever behaviour.

None is not currently an acceptable value, ValueError is raised if you provide anything other than an int in both Python 2.7 and 3.3. That's why I'm against using it to say run forever in Python 3.5.

--
My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language.

Mark Lawrence
Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)
Hi,

I'm surprised: marshal.dumps() doesn't raise an error if you pass an invalid version. In fact, Python 3.3 only supports versions 0, 1 and 2. If you pass 3, it will use version 2. (The same applies to version 99.)

Python 3.4 has two new versions: 3 and 4. Version 3 shares common object references, and version 4 adds short tuples and short strings (producing smaller files).

It would be nice to document the differences between marshal versions.

And what do you think of raising an error if the version is unknown in marshal.dumps()?

I modified your benchmark to also test loads() and to run the benchmark 10 times. Results:

---
Python 3.3.3+ (3.3:50aa9e3ab9a4, Jan 27 2014, 16:11:26) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux
dumps v0: 391.9 ms    data size v0: 45582.9 kB    loads v0: 616.2 ms
dumps v1: 384.3 ms    data size v1: 45582.9 kB    loads v1: 594.0 ms
dumps v2: 153.1 ms    data size v2: 41395.4 kB    loads v2: 549.6 ms
dumps v3: 152.1 ms    data size v3: 41395.4 kB    loads v3: 535.9 ms
dumps v4: 152.3 ms    data size v4: 41395.4 kB    loads v4: 549.7 ms
---

And:

---
Python 3.4.0b3+ (default:dbad4564cd12, Jan 27 2014, 16:09:40) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux
dumps v0: 389.4 ms    data size v0: 45582.9 kB    loads v0: 564.8 ms
dumps v1: 390.2 ms    data size v1: 45582.9 kB    loads v1: 545.6 ms
dumps v2: 165.5 ms    data size v2: 41395.4 kB    loads v2: 470.9 ms
dumps v3: 425.6 ms    data size v3: 41395.4 kB    loads v3: 528.2 ms
dumps v4: 369.2 ms    data size v4: 37000.9 kB    loads v4: 550.2 ms
---

Version 2 is the fastest in Python 3.3 and 3.4, but version 4 with Python 3.4 produces the smallest file.

Victor

2014-01-27 Wolfgang tds...@gmail.com:
> Hi,
>
> I tested the latest beta of 3.4 (b3) and noticed there is a new marshal protocol version 3. The documentation is a little silent about the new features, not going into detail.
>
> I've run a performance test with the new protocol version and noticed the new version is two times slower in serialization than version 2. I tested it with a simple value tuple in a list (50 elements). Nothing special. (It happens only if the tuple also contains a tuple.)
>
> Copy of the test code:
>
>     from time import time
>     from marshal import dumps
>
>     def genData(amount=50):
>         for i in range(amount):
>             yield (i, i+2, i*2, (i+1,i+4,i,4), "my string template %s" % i, 1.01*i, True)
>
>     data = list(genData())
>     print(len(data))
>     t0 = time()
>     result = dumps(data, 2)
>     t1 = time()
>     print("duration p2: %f" % (t1-t0))
>     t0 = time()
>     result = dumps(data, 3)
>     t1 = time()
>     print("duration p3: %f" % (t1-t0))
>
> Is the overhead for the recursion detection so high?
>
> Note this happens only if there is a tuple in the tuple of the datalist.
>
> Regards,
>
> Wolfgang

    import time
    import marshal

    def genData(amount=50):
        for i in range(amount):
            yield (i, i+2, i*2, (i+1,i+4,i,4), "my string template %s" % i, 1.01*i, True)

    data = list(genData())

    def bench(text, func, *args):
        # Run func several times and report the best (minimum) wall-clock time.
        times = []
        for run in range(5):
            t0 = time.perf_counter()
            result = func(*args)
            dt = time.perf_counter() - t0
            times.append(dt)
        print("%s: %.1f ms" % (text, min(times) * 1e3))
        return result

    def bench_marshal(version, obj):
        # Time dumps() and loads() for one marshal version and print the payload size.
        data = bench("dumps v%s" % version, marshal.dumps, obj, version)
        print("data size v%s: %.1f kB" % (version, len(data) / 1024))
        bench("loads v%s" % version, marshal.loads, data)
        print()

    for version in range(5):
        bench_marshal(version, data)
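Separately from timing, it can be useful to confirm that every marshal version round-trips the same data and to compare payload sizes. A minimal sketch, assuming only the public marshal.dumps(), marshal.loads() and marshal.version; the sample data is a made-up variation on the benchmark tuples, and no particular size ordering is assumed.

```python
import marshal

# Sample data shaped like the benchmark's: tuples that contain a nested tuple.
sample = [
    (i, i + 2, (i + 1, i + 4), "item %s" % i, 1.01 * i, True)
    for i in range(100)
]

sizes = {}
for version in range(marshal.version + 1):
    payload = marshal.dumps(sample, version)
    # Every supported version must reproduce the original data exactly.
    assert marshal.loads(payload) == sample
    sizes[version] = len(payload)

print("current marshal.version:", marshal.version)
print("payload sizes by version:", sizes)
```

Printing the sizes side by side makes it easy to see which versions actually change the encoding on a given interpreter.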
Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)
On 27 January 2014 15:35, Victor Stinner victor.stin...@gmail.com wrote:
> Version 2 is the fastest in Python 3.3 and 3.4, but version 4 with Python 3.4 produces the smallest file.

Which version is used when creating pyc files? This benchmark might suggest that version 2 is the best...

Paul
Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)
On Mon, Jan 27, 2014 at 10:42 AM, Paul Moore p.f.mo...@gmail.com wrote:
> On 27 January 2014 15:35, Victor Stinner victor.stin...@gmail.com wrote:
>> Version 2 is the fastest in Python 3.3 and 3.4, but version 4 with Python 3.4 produces the smallest file.
>
> Which version is used when creating pyc files? This benchmark might suggest that version 2 is the best...

Importlib just uses the default:
http://hg.python.org/cpython/file/dbad4564cd12/Lib/importlib/_bootstrap.py#l671
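To check which format "the default" corresponds to on a given interpreter, one can compare dumps() without a version argument against dumps() called with marshal.version. This is a small sketch with an arbitrary payload; it says nothing about what importlib does beyond calling dumps() with the default.

```python
import marshal

payload = [1, 2.5, "text", (3, 4)]

# marshal.version is the format the module uses when no version is given.
default_blob = marshal.dumps(payload)
explicit_blob = marshal.dumps(payload, marshal.version)

print("marshal.version =", marshal.version)
print("default matches explicit:", default_blob == explicit_blob)
```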
[Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)
Hi,

I tested the latest beta from 3.4 (b3) and noticed there is a new marshal protocol version 3. The documentation is a little silent about the new features, not going into detail.

I've run a performance test with the new protocol version and noticed the new version is two times slower in serialization than version 2. I tested it with a simple value tuple in a list (50 elements). Nothing special. (This happens only if the tuple also contains a tuple.)

Copy of the test code:

from time import time
from marshal import dumps

def genData(amount=50):
    for i in range(amount):
        yield (i, i+2, i*2, (i+1,i+4,i,4), "my string template %s" % i, 1.01*i, True)

data = list(genData())
print(len(data))
t0 = time()
result = dumps(data, 2)
t1 = time()
print("duration p2: %f" % (t1-t0))
t0 = time()
result = dumps(data, 3)
t1 = time()
print("duration p3: %f" % (t1-t0))

Is the overhead for the recursion detection so high?

Note this happens only if there is a tuple in the tuple of the data list.

Regards,
Wolfgang
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 27.01.2014 13:12, Antoine Pitrou wrote:
> On Mon, 27 Jan 2014 04:01:02 -0800 Larry Hastings la...@hastings.org wrote:
>> On 01/27/2014 01:39 AM, Antoine Pitrou wrote:
>>> On Sun, 26 Jan 2014 21:01:08 -0800 Larry Hastings la...@hastings.org wrote:
>>>> On 01/26/2014 08:40 PM, Alexander Belopolsky wrote:
>>>>> On Sun, Jan 26, 2014 at 11:26 PM, Vajrasky Kok sky@speaklikeaking.com wrote:
>>>>>> In case we are taking not backporting anything at all road, what is the best fix for the document?
>>>>>
>>>>> I would say no fix is needed for this doc because the signature suggests (correctly) that passing times by keyword is not supported.
>>>>
>>>> Where does it do that?
>>>
>>> In the [,times] spelling, which is the spelling customarily used for positional-only arguments.
>>
>> That's not my experience.
>
> But it's mine :-) (try help(str) or help(list))

It's also the convention we've been using for the docs.

Georg
Re: [Python-Dev] News from asyncio
On Mon, Jan 27, 2014 at 5:21 AM, Victor Stinner victor.stin...@gmail.com wrote:
> - asyncio.IncompleteReadError.expected is the total expected size, not the remaining size

Why not be consistent with the meaning of http.client.IncompleteRead.expected? The current meaning can be recovered via len(e.partial) + e.expected.

-- Devin
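The two meanings of `expected` discussed here can be compared side by side. This is a minimal sketch that constructs both exceptions by hand; the byte counts are made up for illustration, and it assumes the asyncio semantics described in the quoted announcement (total size) and the http.client semantics Devin refers to (remaining size).

```python
import asyncio
import http.client

partial = b"abc"

# asyncio: `expected` is the total number of bytes that were expected.
aio_err = asyncio.IncompleteReadError(partial, 10)
aio_remaining = aio_err.expected - len(aio_err.partial)

# http.client: `expected` is the number of bytes still missing.
http_err = http.client.IncompleteRead(partial, 7)
http_total = len(http_err.partial) + http_err.expected

print("asyncio remaining:", aio_remaining)
print("http.client total:", http_total)
```

Either value can be derived from the other as long as `partial` is available, which is the conversion Devin describes.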
Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)
Thanks, Victor, for improving this.

I should also note that version 3 is slower only in the tuple-in-tuple case. If you use a flat tuple, it is faster than version 2. That is why I asked about this corner case; I thought the recursion detection or something else had a huge cost.

For pyc files, I think the highest available version is the default.

I didn't know about version 4; it is not mentioned anywhere in the docs. I also found out that every integer is accepted as the protocol version. But that was useful for tests against 3.3 and 2.7. :-)

On Mon, Jan 27, 2014 at 5:02 PM, Brett Cannon br...@python.org wrote:
> On Mon, Jan 27, 2014 at 10:42 AM, Paul Moore p.f.mo...@gmail.com wrote:
>> On 27 January 2014 15:35, Victor Stinner victor.stin...@gmail.com wrote:
>>> Version 2 is the fastest in Python 3.3 and 3.4, but version 4 with Python 3.4 produces the smallest file.
>>
>> Which version is used when creating pyc files? This benchmark might suggest that version 2 is the best...
>
> Importlib just uses the default:
> http://hg.python.org/cpython/file/dbad4564cd12/Lib/importlib/_bootstrap.py#l671

--
bye by Wolfgang
Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)
On 27.01.14 17:35, Victor Stinner wrote:
> Python 3.4 has two new versions: 3 and 4. The version 3 shares common object references, the version 4 adds short tuples and short strings (produce smaller files).

Why do we need two new versions added in one Python release?
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, Jan 27, 2014 at 9:13 PM, Larry Hastings la...@hastings.org wrote:
> I apologize for not making myself clear. But that's part of what I meant, yes: we should preserve the existing behavior of times=-1 when passed in by position or by keyword. However, we should *also* add a deprecation warning when passing times=-1 by keyword, suggesting that they use times=None instead. The idea is that we could eventually remove the PyTuple_Size check and make times=-1 always behave like times=0. In practice it'd be okay with me if we never did, or at least not until Python 4.

So do we add the deprecation warning only for times=-1 via keyword, or for all negative numbers passed to times via keyword? I mean, what about:

>>> from itertools import repeat
>>> list(repeat('a', times=-2))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t

Deprecation warning or not?
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, Jan 27, 2014 at 8:29 PM, Antoine Pitrou solip...@pitrou.net wrote:
> Sure, just adjust the number to fit the available memory (here, 2**29 does the trick).

I get your point. But strangely enough, I can still recover from list(repeat('a', 2**29)). It only slows down my computer. I can ^Z the application and then kill it later. But with list(repeat('a', times=-1)), rebooting the machine is compulsory.
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 01/27/2014 06:00 PM, Vajrasky Kok wrote:
> On Mon, Jan 27, 2014 at 9:13 PM, Larry Hastings la...@hastings.org wrote:
>> I apologize for not making myself clear. But that's part of what I meant, yes: we should preserve the existing behavior of times=-1 when passed in by position or by keyword. However, we should *also* add a deprecation warning when passing times=-1 by keyword, suggesting that they use times=None instead. The idea is that we could eventually remove the PyTuple_Size check and make times=-1 always behave like times=0. In practice it'd be okay with me if we never did, or at least not until Python 4.
>
> So we only add deprecation warning to only times=-1 via keyword or for all negative numbers to times via keyword? I mean, what about:
>
> >>> from itertools import repeat
> >>> list(repeat('a', times=-2))

I should have been even *more* precise! When I said "times=-1" I really meant all negative numbers. (I was trying to abbreviate it as -1, as my text was already too long and unwieldy.)

I propose the logic be equivalent to this, handwaving away boilerplate error handling for clarity (the real implementation would handle PyArg_ParseTupleAndKeywords or PyLong_AsSsize_t failing):

    PyObject *element, *times = Py_None;
    Py_ssize_t cnt;
    PyArg_ParseTupleAndKeywords(args, kwargs, "O|O:repeat", kwargs, &element, &times);
    if times == Py_None
        cnt = -1
    else
        cnt = PyLong_AsSsize_t(times)
        if cnt < 0
            if "times" was passed by keyword
                issue DeprecationWarning, "use repeat(o, times=None) to repeat indefinitely"
            else
                cnt = 0

(For those of you who aren't familiar with the source: "cnt" is the internal variable used to set the repeat count of the iterator. If cnt is < 0, the iterator repeats forever.)

If in the future we actually removed the deprecated behavior, the last "if" block would change simply to

        if cnt < 0
            cnt = 0

//arry/
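The proposed decision logic can also be sketched in Python, which makes the three cases easier to test in isolation. This is only an illustration of the logic described above, not the itertools implementation: resolve_repeat_count and its passed_by_keyword flag are hypothetical names, and a negative return value stands for "repeat forever".

```python
import warnings


def resolve_repeat_count(times=None, passed_by_keyword=False):
    """Hypothetical Python rendering of the proposed C logic.

    Returns the internal count; a negative value means "repeat forever".
    """
    if times is None:
        return -1
    cnt = int(times)
    if cnt < 0:
        if passed_by_keyword:
            # Legacy behaviour kept for now: a negative keyword value
            # still repeats forever, but the caller is warned.
            warnings.warn(
                "use repeat(o, times=None) to repeat indefinitely",
                DeprecationWarning,
                stacklevel=2,
            )
        else:
            # Negative positional value: behave like times=0.
            cnt = 0
    return cnt
```

With this sketch, resolve_repeat_count(-5) gives 0, while resolve_repeat_count(-5, passed_by_keyword=True) keeps the negative count and emits the DeprecationWarning.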
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On Mon, Jan 27, 2014 at 9:13 PM, Larry Hastings la...@hastings.org wrote:
> While it's a bug, it's a very minor bug. As Python 3.4 release manager, my position is: Python 3.4 is in beta, so let's not change semantics for purity's sakes now. I'm -0.5 on adding times=None right now, and until we do we can't deprecate the old behavior.

I bow to your decision, Larry. So I believe the doc fix is required then. I propose these options for the doc fix:

1. Keeps the status quo

>>> repeat.__doc__
'repeat(object [,times]) -> create an iterator which returns the object\nfor the specified number of times. If not specified, returns the object\nendlessly.'

We don't explain the meaning of negative `times`. Well, people shouldn't repeat with negative times, because a statement such as "Kids, repeat the push-up negative two more times" does not make sense.

2. Explains the negative times, ignores the keyword

>>> repeat.__doc__
'repeat(object [,times]) -> create an iterator which returns the object\nfor the specified number of times. If not specified, returns the object\nendlessly. Negative times means zero repetitions.'

The signature "repeat(object [,times])" suggests that this function does not accept keywords, as some core developers have stated. So if the user passes a keyword to this function, well, it's too bad for them.

3. Explains the negative times, warns about keyword

>>> repeat.__doc__
'repeat(object [,times]) -> create an iterator which returns the object\nfor the specified number of times. If not specified, returns the object\nendlessly. Negative times means zero repetitions. This function accepts keyword argument but the behaviour is buggy and should be avoided.'

4. Explains everything

>>> repeat.__doc__
'repeat(object [,times]) -> create an iterator which returns the object\nfor the specified number of times. If not specified, returns the object\nendlessly. Negative times means zero repetitions via positional-only arguments. -1 value for times via keyword means endless repetitions and is the same as omitting the times argument, and any other negative number for times means endless repetitions as well but with a different implementation.'

If you are wondering about the last statement:

>>> from itertools import repeat
>>> list(repeat('a', times=-4))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t
>>> a = repeat('a', times=-4)
>>> next(a)
'a'
>>> next(a)
'a'
>>> a = repeat('a', times=-1)
>>> next(a)
'a'
>>> next(a)
'a'
>>> list(repeat('a', times=-1))
... freezes your computer ...

Which one is better? Once we settle this, I can think about the doc fix for Doc/library/itertools.rst.

Vajrasky
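Since list(repeat('a', times=-1)) can hang a session on an affected interpreter, a safer way to check how a given build treats a negative keyword count is to bound the iterator with itertools.islice. This is a minimal sketch; the cap of 3 is arbitrary, and the keyword result depends on the Python version being probed.

```python
from itertools import islice, repeat

# Positional negative count: no items are produced.
positional = list(repeat("a", -1))

# Keyword negative count: cap the iterator with islice() so that the
# probe finishes even on an interpreter where it would repeat forever.
keyword_probe = list(islice(repeat("a", times=-1), 3))

print("positional:", positional)
print("keyword (at most 3 items):", keyword_probe)
```

An empty keyword result means the negative count was treated as zero; three items mean the iterator would have run indefinitely.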
Re: [Python-Dev] Negative times behaviour in itertools.repeat for Python maintenance releases (2.7, 3.3 and maybe 3.4)
On 01/27/2014 06:26 PM, Vajrasky Kok wrote:
> So I believe the doc fix is required then.

I propose the docstring should describe only supported behavior, and the docs in the manual should mention the unsupported behavior. However, I'm interested in Raymond's take, as he's the original author of itertools.repeat.

If I were writing it, it might well come out like this:

docstring:

    repeat(object [,times]) -> iterator

    Return an iterator which yields the object for the specified number of times. If times is unspecified, yields the object forever. If times is negative, behave as if times is 0.

documentation:

    repeat(object [,times]) -> iterator

    Return an iterator which yields the object for the specified number of times. If times is unspecified, yields the object forever. If times is negative, behave as if times is 0.

    Equivalent to:

        def repeat(object, times=None):
            # repeat(10, 3) --> 10 10 10
            if times is None:
                while True:
                    yield object
            else:
                for i in range(times):
                    yield object

    A common use for repeat is to supply a stream of constant values to map or zip:

        >>> list(map(pow, range(10), repeat(2)))
        [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

    .. note:: if "times" is specified using a keyword argument, and provided with a negative value, repeat yields the object forever. This is a bug, its use is unsupported, and this behavior may be removed in a future version of Python.

//arry/
Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)
Hi there.

I think you should modify your program to marshal (and load) a compiled module. This is where the optimizations in versions 3 and 4 become important.

K

-----Original Message-----
From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org] On Behalf Of Victor Stinner
Sent: Monday, January 27, 2014 23:35
To: Wolfgang
Cc: Python-Dev
Subject: Re: [Python-Dev] Python 3.4, marshal dumps slower (version 3 protocol)

Hi,

I'm surprised: marshal.dumps() doesn't raise an error if you pass an invalid version. In fact, Python 3.3 only supports versions 0, 1 and 2. If you pass 3, it will use version 2. (The same applies to version 99.)

Python 3.4 has two new versions: 3 and 4. Version 3 shares common object references; version 4 adds short tuples and short strings (producing smaller files).

It would be nice to document the differences between marshal versions. And what do you think of raising an error if the version is unknown in marshal.dumps()?

I modified your benchmark to also test loads() and to run the benchmark 10 times.
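Kristjan's suggestion can be tried on a small scale by marshalling a compiled code object instead of plain data. This is a rough sketch, not the benchmark from the thread: the SOURCE text is made up for illustration, and only versions from 2 up to marshal.version are compared.

```python
import marshal

# Illustrative source; any module source text would do.
SOURCE = '''
def greet(name):
    return "hello " + name

NAMES = ("alice", "bob", "alice", "bob")
'''

code = compile(SOURCE, "<example>", "exec")

sizes = {}
for version in range(2, marshal.version + 1):
    blob = marshal.dumps(code, version)
    sizes[version] = len(blob)
    # Round-trip to confirm the blob still loads as a code object.
    restored = marshal.loads(blob)
    assert restored.co_name == code.co_name

for version, size in sorted(sizes.items()):
    print("v%d: %d bytes" % (version, size))
```

For timing, the dumps and loads calls could be wrapped in the bench() helper from Victor's script; a real module compiled from a larger source file would give more representative numbers than this toy input.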