[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-11-04 Thread STINNER Victor


Change by STINNER Victor :


--
nosy:  -vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-11-03 Thread Emmanuel Arias


Change by Emmanuel Arias :


--
keywords: +patch
pull_requests: +16551
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/17038

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-11-02 Thread Emmanuel Arias


Change by Emmanuel Arias :


--
nosy: +eamanu

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-11-02 Thread Yury Selivanov


Yury Selivanov  added the comment:

> We should either remove the API (not realistic dream at least for many years) 
> or fix it.  There is no choice actually.

I don't understand. What happens if we don't await the future that 
run_in_executor returns? Does it get GCed eventually? Why is memory leaking?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-11-02 Thread Andrew Svetlov


Andrew Svetlov  added the comment:

The API exists, people use it and get the memory leak.
We should either remove the API (not realistic dream at least for many years) 
or fix it.  There is no choice actually.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-11-01 Thread Yury Selivanov


Yury Selivanov  added the comment:

> It is a more complex solution but definitely 100% backward compatible; plus 
> the solution we can prepare people for removing the deprecated code 
> eventually.

Yeah.  Do you think it's worth it bothering with this old low-level API instead 
of making a new high-level one?  I don't, but if you do the feel free to change 
it.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-11-01 Thread Andrew Svetlov


Andrew Svetlov  added the comment:

I thought about returning a special subclass.
Future has too rich API: get_loop(), done(), result() etc.

What's about returning the proxy object with future instance embedded; the 
object raises DeprecationWarning for everythin except __repr__, __del__ and 
__await__, __getattr__ redirects to getattr(self._fut, name) for all other 
attributes access. 

It is a more complex solution but definitely 100% backward compatible; plus the 
solution we can prepare people for removing the deprecated code eventually.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-11-01 Thread Yury Selivanov


Yury Selivanov  added the comment:

> The change is slightly not backward compatible but

Yeah, that's my main problem with converting `loop.run_in_executor()` to a 
coroutine. When I attempted doing that I discovered that there's code that 
expects the method to return a Future, and so expects it have the `cancel()` 
method.

If we convert it to a coroutine a lot of code will break, which might be OK if 
it's really necessary. Is it though? Can we return a special Future subclass 
that complains if it's not awaited?  Would that fix the problem?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-10-10 Thread Evgeny Nizhibitsky


Evgeny Nizhibitsky  added the comment:

Oh, I see that in the initial code with leakage (it was heavy 
ThreadPoolExecutor + xgboost thing) there was an await but I must have lost it 
somewhere while reducing it to the minimal example and finished in the wrong 
direction.

Glad too see that it raised a discussion to prevent others from getting into 
this silent trap.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-10-10 Thread STINNER Victor


STINNER Victor  added the comment:

> If we convert run_in_executor() into async function we'll get a warning about 
> unawaited coroutine even without asyncio debug mode.

That sounds like a good solution :-)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-10-10 Thread Andrew Svetlov


Andrew Svetlov  added the comment:

The change is slightly not backward compatible but
1. It's pretty visible. In the worst-case instead of the memory leak people see 
a RuntimeWarning
2. We did it already for a part of API, e.g. loop.sock_read() and family

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-10-10 Thread Andrew Svetlov


Andrew Svetlov  added the comment:

> Can a Task be used instead of a Future in run_in_executor()?

I don't think that the task is required here. The problem is that 
run_in_executor is a function that returns asyncio future; that is in turn a 
wrapper around concurrent future object.

If we convert run_in_executor() into async function we'll get a warning about 
unawaited coroutine even without asyncio debug mode.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-10-10 Thread Andrew Svetlov


Andrew Svetlov  added the comment:

Victor answered the first :)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-10-10 Thread STINNER Victor


STINNER Victor  added the comment:

> Yuri, what should we do with the issue?

A Task emits a warning when it's not awaited. Can a Task be used instead of a 
Future in run_in_executor()?

--
nosy: +vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-10-10 Thread Andrew Svetlov


Andrew Svetlov  added the comment:

You MUST await a future returned from `loop.run_in_executor()` to avoid the 
leak.

Yuri, what should we do with the issue? I see the second similar report in the 
last half of the year.
Sure, we can add weakrefs somewhere in futures._chain_future() but I pretty 
sure that the proper fix is just enforcing `await` here, e.g.

1. rename existing run_in_executor() into a private _run_in_executor() function.

2. write async trampoline:
async def run_in_executor(self, executor, func, *args):
return await self._run_in_executor(executor, func, *args)

--
nosy:  -vstinner
resolution: not a bug -> 
stage: resolved -> 
status: closed -> open

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-10-10 Thread STINNER Victor


STINNER Victor  added the comment:

Well, that's a common issue when using asyncio: you forgot await.

async def main(_loop):
while True:
with futures.ThreadPoolExecutor() as pool:
_loop.run_in_executor(pool, nop)
sys.stdout.write(f'\r{get_mem():0.3f}MB')

It should be: "await _loop.run_in_executor(pool, nop)" ;-)

Sadly, PYTHONASYNCIODEBUG=1 env var doesn't complain on this bug.

See: https://docs.python.org/dev/library/asyncio-dev.html#debug-mode

--
nosy: +vstinner
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38430] Memory leak in ThreadPoolExecutor + run_in_executor

2019-10-10 Thread Evgeny Nizhibitsky


New submission from Evgeny Nizhibitsky :

I have run into a memory leak caused by using run_in_executor + 
ThreadPoolExecutor while running some stability tests with custom web services.

It was 1 MB leaked for 1k requests made for my case and I've extracted the root 
cause and converted it into minimal script with both mentioned parts + just NOP 
function to "run".

The script can easily eat up to 1 GB of memory in less then 1 minute now. It 
uses external psutil library to report the memory allocated but it can be 
easily commented out and the leak will stay anyway.

One can found that script attached + Dockerfile/Makefile for reproducibility. 
I've also reproduced it in my own conda-based 3.7 environment as well as the 
master branch of cpython.

--
components: asyncio
files: python-leak.zip
messages: 354351
nosy: Evgeny Nizhibitsky, asvetlov, yselivanov
priority: normal
severity: normal
status: open
title: Memory leak in ThreadPoolExecutor + run_in_executor
type: resource usage
versions: Python 3.6, Python 3.7, Python 3.9
Added file: https://bugs.python.org/file48654/python-leak.zip

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com