[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-10-05 Thread STINNER Victor

STINNER Victor added the comment:

 Please commit your change to the tulip repo too.

Oh sorry. In fact, I made the commit but I forgot to push my change :-p

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-30 Thread STINNER Victor

STINNER Victor added the comment:

Sorry, I'm coming later after the review and the commit, but I worry about 
performances of _run_once() since it's the core of asyncio. Yury proposed to 
only iterate once on self._scheduled when removing delayed calls, and I have 
the same concern.

Here is a patch which change _run_once() to only iterate once.

IMO the change is obvious, the current iterates twice and makes the same check 
twice (check the _cancelled attribute of handles).

--
keywords: +patch
resolution: fixed - 
status: closed - open
Added file: http://bugs.python.org/file36758/asyncio_remove_delayed.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-30 Thread STINNER Victor

STINNER Victor added the comment:

 Turned out that single loop approach is not faster than loop+comprehension

IMO it makes the code simpler and easier to understand.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-30 Thread Yury Selivanov

Yury Selivanov added the comment:

 IMO it makes the code simpler and easier to understand.

But it's a tad slower, like 2-3% ;) You can test it yourself, we only tested it 
on huge tasks list of 1M items.

FWIW, I'm not opposed to your patch.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-30 Thread Yury Selivanov

Yury Selivanov added the comment:

Victor,

During the code review we tried the single loop approach. At the end Joshua 
wrote a small benchmark to test if it's really faster to do it in one loop or 
not. Turned out that single loop approach is not faster than loop+comprehension 
(but it's not much slower either, I'd say they are about the same in terms of 
speed).

One loop approach might work faster on PyPy, but unfortunately, they still 
don't support 3.3 to test.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-30 Thread Yury Selivanov

Yury Selivanov added the comment:

Victor,

I've done some additional testing. Here's a test that Joshua wrote for the code 
review: https://gist.github.com/1st1/b38ac6785cb01a679722

It appears that single loop approach works a bit faster for smaller collections 
of tasks. On a list of 1 tasks it's on average faster 2-3%, on a list of 
100 tasks it's slower for 2-3%.

I'm not sure what's the average number of tasks for an average asyncio 
application, but something tells me it's not going to be in range of millions. 
I think you can fix the code to have a single loop.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-30 Thread STINNER Victor

STINNER Victor added the comment:

... on average faster 2-3% ... slower for 2-3%

3% is meaningless. On a microbenchmark, you can say that it's faster or slower 
if the difference is at least 10%.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-30 Thread Roundup Robot

Roundup Robot added the comment:

New changeset b85ed8bb7523 by Victor Stinner in branch '3.4':
Issue #22448: asyncio, cleanup _run_once(), only iterate once to remove delayed
https://hg.python.org/cpython/rev/b85ed8bb7523

New changeset 8e9df3414185 by Victor Stinner in branch 'default':
(Merge 3.4) Issue #22448: asyncio, cleanup _run_once(), only iterate once to
https://hg.python.org/cpython/rev/8e9df3414185

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-30 Thread Yury Selivanov

Yury Selivanov added the comment:

Victor,

Here's an updated benchmark results:

NUMBER_OF_TASKS 1
ITERATIONS - 2000 out of 2000
2 loops:  0.004267875499863294
1 loop:   0.007916624497738667
TOTAL_BENCH_TIME 15.975227117538452

NUMBER_OF_TASKS 10
ITERATIONS - 2000 out of 2000
2 loops:  0.04882345050100412
1 loop:   0.09141454550081107
TOTAL_BENCH_TIME 180.64003109931946

2 loops is always about 30-40% slower.  I've updated the benchmark I used: 
https://gist.github.com/1st1/b38ac6785cb01a679722 

Now it incorporates a call to heapify, and should yield more stable results. 
Please check it out, as I'm maybe doing something wrong there, but if it's 
alright, I think that you need to revert your commits.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-30 Thread Yury Selivanov

Yury Selivanov added the comment:

typo:

 2 loops is always about 30-40% slower.

2 loops is always about 30-40% faster.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-30 Thread STINNER Victor

STINNER Victor added the comment:

Hum you don't reset start between the two benchmarks.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-30 Thread Yury Selivanov

Yury Selivanov added the comment:

Eh, I knew something was wrong. Thanks.

   NUMBER_OF_TASKS 10
   ITERATIONS - 2000 out of 2000
   2 loops:  0.045037232999675325
   1 loop:  0.045182990999819594
   TOTAL_BENCH_TIME 91.36706805229187

Please commit your change to the tulip repo too.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-30 Thread Yury Selivanov

Changes by Yury Selivanov yseliva...@gmail.com:


--
resolution:  - fixed
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-25 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 2a868c9f8f15 by Yury Selivanov in branch '3.4':
asyncio: Improve canceled timer handles cleanup. Closes issue #22448.
https://hg.python.org/cpython/rev/2a868c9f8f15

New changeset a6aaacb2b807 by Yury Selivanov in branch 'default':
asyncio: Improve canceled timer handles cleanup. Closes issue #22448.
https://hg.python.org/cpython/rev/a6aaacb2b807

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-25 Thread Joshua Moore-Oliva

Joshua Moore-Oliva added the comment:

 Hm. That sounds like you won't actually be interoperable with other 
 asyncio-using code.

asyncio code can be interoperated with by spinning off an asyncio coroutine 
that on completion calls a callback that reschedules a non-asyncio coroutine.

I assume we shouldn't be spamming an issue with unrelated chatter, I'd be happy 
to discuss more via email if you would like.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-25 Thread Joshua Moore-Oliva

Joshua Moore-Oliva added the comment:

Also - should I close this issue now that a patch has been committed?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-25 Thread Yury Selivanov

Yury Selivanov added the comment:

Hm, strange, usually roundup robot closes issues. Anyways, closed now. Thanks 
again, Joshua.

--
resolution:  - fixed
status: open - closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-20 Thread Joshua Moore-Oliva

Joshua Moore-Oliva added the comment:

 You can contribute upstream to the Tulip project first.

Will I be writing a patch and tests for tulip, and then separate a patch and 
tests for python 3.4?  Or will I submit to tulip, and then the changes will get 
merged from tulip into python by some other process? 

If possible, I would like to get this into python 3.4.2 (assuming all goes 
well).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-20 Thread Guido van Rossum

Guido van Rossum added the comment:

We can merge the changes into 3.4 and 3.5 for you, it's just a simple copy
(the codebases are identical). However, the 3.4.2 release candidate is
apparently in 2 days, so I think you've missed that train already.

On Sat, Sep 20, 2014 at 9:02 AM, Joshua Moore-Oliva rep...@bugs.python.org
wrote:


 Joshua Moore-Oliva added the comment:

  You can contribute upstream to the Tulip project first.

 Will I be writing a patch and tests for tulip, and then separate a patch
 and tests for python 3.4?  Or will I submit to tulip, and then the changes
 will get merged from tulip into python by some other process?

 If possible, I would like to get this into python 3.4.2 (assuming all goes
 well).

 --

 ___
 Python tracker rep...@bugs.python.org
 http://bugs.python.org/issue22448
 ___


--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-20 Thread Guido van Rossum

Guido van Rossum added the comment:

By the way I just looked at wait_for.py; it has a bug where do_work() isn't 
using yield-from with the sleep() call. But that may well be the issue you were 
trying to debug, and this does not change my opinion about the issue -- I am 
still looking forward to your patch.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-20 Thread Joshua Moore-Oliva

Joshua Moore-Oliva added the comment:

My patch is ready for review, if I followed the process correctly I think you 
should have received an email

https://codereview.appspot.com/145220043

 By the way I just looked at wait_for.py; it has a bug where do_work() isn't 
 using yield-from with the sleep() call. But that may well be the issue you 
 were trying to debug, and this does not change my opinion about the issue

That was not intended, it was just a mistake.

(A quick aside on yield from, feel free to ignore, I don't expect to change 
anyone's opinion on this)
I don't use yield from much - my first use of asyncio was porting an 
application from gevent (I made a small custom wrapper with fibers 
(https://pypi.python.org/pypi/fibers) that can internally yield on coroutines). 
 I have read https://glyph.twistedmatrix.com/2014/02/unyielding.html but in my 
cases, I tend to write my code with the thought that any non standard library 
function can yield (I initially tried porting to vanilla asyncio, but I ended 
up having yield from almost everywhere). In the rare cases I want to ensure no 
yielding takes place across function calls, I like the way gruvi 
(https://github.com/geertj/gruvi) handles it with a construct to assert no 
yielding takes place.

with assert_no_switchpoints():
do_something()
do_something_else()

I also find that it is less error prone (missing a yield from), but that is a 
minor point as I could write a static analyzer (on top of test cases ofc) to 
check for that.

But that's just my opinion and opinion's evolve :)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-20 Thread Guido van Rossum

Guido van Rossum added the comment:

I will try to review later tonight. One thing though:

  I tend to write my code with the thought that any non standard library
function can yield

That makes sense when using geven, but not when using asyncio or Trollius.
Nothing will make events run if you don't use yield [from].

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-20 Thread Joshua Moore-Oliva

Joshua Moore-Oliva added the comment:

 I will try to review later tonight.

Thanks!

 That makes sense when using gevent, but not when using asyncio or Trollius. 
 Nothing will make events run if you don't use yield [from].

Yes, I am aware of that. I have written a small custom library using fibers (a 
greenlet-like library) on top of asyncio so that I don't need to use yield from 
in my application(s).

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-20 Thread Guido van Rossum

Guido van Rossum added the comment:

On Sat, Sep 20, 2014 at 3:38 PM, Joshua Moore-Oliva rep...@bugs.python.org
wrote:


 Joshua Moore-Oliva added the comment:

  I will try to review later tonight.

 Thanks!

  That makes sense when using gevent, but not when using asyncio or
 Trollius. Nothing will make events run if you don't use yield [from].

 Yes, I am aware of that. I have written a small custom library using
 fibers (a greenlet-like library) on top of asyncio so that I don't need to
 use yield from in my application(s).


Hm. That sounds like you won't actually be interoperable with other
asyncio-using code.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-19 Thread Joshua Moore-Oliva

New submission from Joshua Moore-Oliva:

The core issue stems from the implementation of Timer cancellation. (which 
features like asyncio.wait_for build upon). BaseEventLoop stores scheduled 
events in an array backed heapq named _scheduled.  Once an event has been 
scheduled with call_at, cancelling the event only marks the event as cancelled, 
it does not remove it from the array backed heap.  It is only removed once the 
cancelled event is at the next scheduled event for the loop.

In a system where many events are run (and then cancelled) that may have long 
timeout periods, and there always exists at least one event that is scheduled 
for an earlier time, memory use is practically unbounded.  The attached program 
wait_for.py demonstrates a trivial example where memory use is practically 
unbounded for an hour of time. This is the case even though the program only 
ever has two uncancelled events and two coroutines at any given time in its 
execution.

This could be fixed in a variety of ways:

a) Timer cancellation could result in the object being removed from the heap 
like in the sched module.  This would be at least O(N) where N is the number of 
scheduled events.
b) Timer cancellation could trigger a callback that tracks the number of 
cancelled events in the _scheduled list. Once this number exceeds a threshold ( 
50% ? ), the list could be cleared of all cancelled events and then be 
re-heapified. 
c) A balanced tree structure could be used to implement the scheduled events 
O(log N) time complexity (current module is O(log N) for heappop anyways).

Given python's lack of a balanced tree structure in the standard library, I 
assume option c) is a non-starter.

I would prefer option b) over option a) as when there are a lot of scheduled 
events in the system (upwards of 50,000 - 100,000 in some of my use cases) the 
amortized complexity for cancelling an event trends towards O(1) (N/2 
cancellations are handled by a single O(N) event) at the cost of slightly more, 
but bounded relative to the amount of events, memory.

I would be willing to take a shot at implementing this patch with the most 
agreeable option. Please let me know if that would be appreciated, or if 
someone else would rather tackle this issue. (First time bug report for python, 
not sure of the politics/protocols involved). 

Disclaimer that I by no means an asyncio expert, my understanding of the code 
base is based on my reading of it debugging this memory leak.

--
components: asyncio
files: wait_for.py
messages: 227136
nosy: chatgris, gvanrossum, haypo, yselivanov
priority: normal
severity: normal
status: open
title: call_at/call_later with Timer cancellation can result in (practically) 
unbounded memory usage.
type: resource usage
versions: Python 3.4
Added file: http://bugs.python.org/file3/wait_for.py

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22448] call_at/call_later with Timer cancellation can result in (practically) unbounded memory usage.

2014-09-19 Thread Guido van Rossum

Guido van Rossum added the comment:

Hi Joshua,

This is indeed a problem -- I just never expected that you would be having that 
many events and and canceling the majority.  I am sorry you had to debug this. 
:-(

This was anticipated by the author of pyftpdlib (Giampaolo Roloda'), who 
proposed an elegant solution: keep track of the number of cancelled events, and 
when the number gets too high (according to some measure) the heapq is simply 
rebuilt by filtering out cancelled events.  I think this is similar to your 
proposal (b).

I would love it if you could implement this!  Just make sure to add some tests 
and follow the PEP 8 style guide.  You can contribute upstream to the Tulip 
project first.  https://code.google.com/p/tulip/

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22448
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com