[issue35238] Alleviate memory reservation of fork_exec in subprocess.Popen via forkserver

2018-12-17 Thread STINNER Victor


STINNER Victor  added the comment:

See bpo-34663 for posix_spawn() & vfork.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue35238] Alleviate memory reservation of fork_exec in subprocess.Popen via forkserver

2018-12-17 Thread Oscar Esteban


Oscar Esteban  added the comment:

Hi Victor,

That would be great. However, we played a bit with an alternative 
implementation of posix_spawn (one I got from a related bpo issue), and it 
didn't seem to make any difference in terms of memory allocation.

Then we found out that posix_spawn uses fork by default (in the Linux 
implementation), so the large memory allocations still happen. One can set the 
vfork option, but that is apparently a very bad idea, as far as we have read.

Is that correct?
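
For reference, a minimal sketch of calling ``os.posix_spawn`` directly, 
assuming Python 3.8+ on a POSIX system. The helper name ``spawn_and_wait`` is 
made up for illustration. Whether this avoids the fork-time memory reservation 
depends on how the C library implements posix_spawn, which is exactly the 
question above.

```python
import os
import sys


def spawn_and_wait(argv):
    """Run argv through os.posix_spawn and return the child's exit code.

    Hypothetical helper for illustration; argv[0] must be a full path
    because os.posix_spawn does not search PATH.
    """
    # os.posix_spawn(path, argv, env) returns the PID of the new child.
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    # Decode the raw wait status; a death by signal is reported as a
    # negative number, mirroring subprocess.Popen.returncode.
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)
    return os.WEXITSTATUS(status)
```

For example, ``spawn_and_wait([sys.executable, "-c", "pass"])`` starts a child 
interpreter and waits for it to exit.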

--




[issue35238] Alleviate memory reservation of fork_exec in subprocess.Popen via forkserver

2018-12-17 Thread STINNER Victor


STINNER Victor  added the comment:

> By the way you could open an issue so that subprocess uses posix_spawn() 
> where possible.

FYI I'm working on an implementation of this ;-)

--
nosy: +vstinner




[issue35238] Alleviate memory reservation of fork_exec in subprocess.Popen via forkserver

2018-12-17 Thread Antoine Pitrou


Antoine Pitrou  added the comment:

By the way you could open an issue so that subprocess uses posix_spawn() where 
possible.

(or you could ask to reopen issue31814, which is basically that request but for 
a different reason than yours)

--




[issue35238] Alleviate memory reservation of fork_exec in subprocess.Popen via forkserver

2018-12-17 Thread Oscar Esteban


Oscar Esteban  added the comment:

Thanks for your response.

The idea would be to enable ``subprocess.Popen`` to use an existing fork server 
in its fork_exec.

The rationale: I can start a pool of n workers very early in the execution 
flow. They will have a ~350MB memory footprint in the beginning and they will 
be reset to that every ``maxtasksperchild`` tasks. So this is basically the 
amount of VM allocated (doubled) when forking. Pretty small.

Currently, because the fork is done from a process with the whole Python stack 
of the app loaded in memory (1.7GB in our case), an additional 1.7GB of VM is 
allocated on each fork. This could be avoided if the fork were done from the 
forkserver pool.

As you mention, we have been considering such a "shell" server on top of 
asyncio, so your response just confirms our intuition.

I'll close this idea for now since I agree that any investment in this problem 
should be directed to the asyncio solution.

Please note that the idea proposed would work for Python < 3 (as opposed to 
anything based on asyncio).
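
The rationale above (launch commands from a small, early-started process 
rather than from the large parent) can be sketched roughly as follows. This is 
not the forkserver-based design itself: it assumes Python 3.7+, uses a plain 
helper interpreter talking JSON over pipes, and handles one command at a time. 
``HELPER_SOURCE`` and ``CommandServer`` are hypothetical names.

```python
import json
import subprocess
import sys

# Source of the hypothetical helper process. It reads one JSON-encoded
# argv list per line on stdin, runs it, and writes one JSON reply per line.
HELPER_SOURCE = r"""
import json, subprocess, sys
for line in iter(sys.stdin.readline, ""):
    argv = json.loads(line)
    done = subprocess.run(argv, stdin=subprocess.DEVNULL,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    reply = {"returncode": done.returncode,
             "stdout": done.stdout.decode(errors="replace"),
             "stderr": done.stderr.decode(errors="replace")}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


class CommandServer:
    """Hypothetical client for a small helper that launches commands."""

    def __init__(self):
        # Start the helper early, before the parent process grows large;
        # later commands are then forked from the small helper instead.
        self._helper = subprocess.Popen(
            [sys.executable, "-c", HELPER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def run(self, argv):
        """Ask the helper to run argv and block until it replies."""
        self._helper.stdin.write(json.dumps(argv) + "\n")
        self._helper.stdin.flush()
        return json.loads(self._helper.stdout.readline())

    def close(self):
        """Close the command pipe so the helper's read loop ends."""
        self._helper.stdin.close()
        self._helper.wait()
```

Creating ``CommandServer()`` at program start and calling ``run(argv)`` later 
returns a dict with ``returncode``, ``stdout`` and ``stderr``. The helper 
itself is still started with ``subprocess.Popen``, so the point is only to pay 
that cost once, while the parent is small.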

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed




[issue35238] Alleviate memory reservation of fork_exec in subprocess.Popen via forkserver

2018-12-17 Thread Antoine Pitrou


Antoine Pitrou  added the comment:

At any rate, given the constraints you're working with (thousands of child 
processes, memory conservation issues), I suggest you abandon the idea of using 
multiprocessing and write your own subprocess-server instead.

I would suggest doing so using asyncio, which should allow you to control as 
many subprocesses as you want without spawning countless threads or 
intermediate processes:

https://docs.python.org/3/library/asyncio-subprocess.html
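
A minimal sketch of that pattern, assuming Python 3.7+; the function names and 
the ``max_concurrent`` parameter are made up for illustration. It only shows 
how one event loop can supervise many subprocesses with a cap on how many are 
alive at once; it does not by itself change how each child is created.

```python
import asyncio
import sys


async def run_command(argv, limit):
    """Run one command while holding a slot in the semaphore."""
    async with limit:
        proc = await asyncio.create_subprocess_exec(
            *argv,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        # communicate() waits for exit and collects both output streams.
        stdout, stderr = await proc.communicate()
        return proc.returncode, stdout, stderr


async def run_all(commands, max_concurrent=4):
    """Run every command, with at most max_concurrent children alive."""
    limit = asyncio.Semaphore(max_concurrent)
    # gather() returns results in the same order as the input commands.
    return await asyncio.gather(*(run_command(c, limit) for c in commands))
```

It would be driven with something like ``asyncio.run(run_all(commands))``, 
where ``commands`` is a list of argv lists.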

--




[issue35238] Alleviate memory reservation of fork_exec in subprocess.Popen via forkserver

2018-12-17 Thread Antoine Pitrou


Antoine Pitrou  added the comment:

I'm not sure I understand the proposed solution. Do you mean you would replace 
this:

  Parent -> forkserver -> fork child then exec

with:

  Parent -> forkserver -> posix_spawn child?

--
nosy: +pitrou




[issue35238] Alleviate memory reservation of fork_exec in subprocess.Popen via forkserver

2018-12-15 Thread 4-launchpad-kalvdans-no-ip-org


Change by 4-launchpad-kalvdans-no-ip-org :


--
nosy: +4-launchpad-kalvdans-no-ip-org




[issue35238] Alleviate memory reservation of fork_exec in subprocess.Popen via forkserver

2018-11-13 Thread Oscar Esteban


New submission from Oscar Esteban :

## Context

We are developers of nipype (https://github.com/nipype), which is a workflow 
engine for neuroimaging software. We are experiencing the problems that gave 
rise to the addition of ``os.posix_spawn`` to Python 3.8, in particular the one 
described in https://bugs.python.org/issue20104#msg222570

Our software runs command line subprocesses that can be quite memory-hungry 
and, in some cases, number on the order of tens of thousands of processes. As a 
result, we frequently see the OOM killer terminating some of the processes.

## Status

We have successfully leveraged the ``forkserver`` context (in addition to a low 
value of `maxtasksperchild`) of multiprocessing to ease the load. However, the 
fork_exec memory allocation is still problematic on systems that do not allow 
overcommitting virtual memory. Waiting for os.posix_spawn to be rolled out 
might not be an option for us, as the problem is hitting us badly right now.

## Proposed solution

I'd like to page experts on Lib/multiprocessing and Lib/subprocess to give 
their opinions about the following: is it possible to write an extension to 
`multiprocessing.util.Popen` such that it has the API of `subprocess.Popen` but 
the fork happens via the forkserver?

My naive intuition is that we would need to create a new type of Process, make 
sure that it then calls os.exec*e() (possibly around here: 
https://github.com/python/cpython/blob/f966e5397ed8f5c42c185223fc9b4d750a678d02/Lib/multiprocessing/popen_forkserver.py#L51), 
and finally handle communication with the subprocess.

Please let me know if that is even possible.

--
components: Library (Lib)
messages: 329868
nosy: oesteban
priority: normal
severity: normal
status: open
title: Alleviate memory reservation of fork_exec in subprocess.Popen via 
forkserver
type: enhancement
versions: Python 2.7, Python 3.4, Python 3.5, Python 3.6, Python 3.7, Python 3.8
