[issue33945] concurrent.futures ProcessPoolExecutor submit() blocks on results being written

2018-06-28 Thread Daniel Barcay


Daniel Barcay added the comment:

Just got the Python 3.7 release. I can confirm that this is fixed in Python
3.7 for my workload.

Nice job! Thanks for changing the thread-synchronization mechanism. I'm
grateful.

--
resolution:  -> fixed
stage:  -> resolved
status: open -> closed


[issue33945] concurrent.futures ProcessPoolExecutor submit() blocks on results being written

2018-06-24 Thread Thomas Moreau


Thomas Moreau added the comment:

This behavior results from the fact that in 3.6 the result_queue is used to
pass messages to the queue_manager_thread. This has been changed in 3.7, where
we rely on a _ThreadWakeup object instead.

In 3.6, when the result_queue is filled with many large objects, the call to
result_queue.put(None) will hang while the previous objects are being handled
by the queue_manager_thread, causing latency in submit().
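
For readers not familiar with the new code, here is a minimal sketch of the
idea behind such a wakeup object; the class and method names are illustrative,
not the actual CPython implementation:

import multiprocessing as mp

class WakeupSketch:
    """Wake the management thread without going through the result queue."""

    def __init__(self):
        # One-way pipe: the reader end is watched by the management thread,
        # the writer end is poked by submit().
        self._reader, self._writer = mp.Pipe(duplex=False)

    def wakeup(self):
        # Sending one empty message is cheap and does not have to wait for
        # large worker results to drain, unlike result_queue.put(None).
        self._writer.send_bytes(b"")

    def clear(self):
        # Drain any pending wakeup tokens once the management thread is awake.
        while self._reader.poll():
            self._reader.recv_bytes()

The management thread can then watch the wakeup reader alongside the result
queue's own reader with multiprocessing.connection.wait(), which is roughly
how 3.7 avoids blocking submit() on the result pipe.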

--


[issue33945] concurrent.futures ProcessPoolExecutor submit() blocks on results being written

2018-06-24 Thread Antoine Pitrou


Antoine Pitrou added the comment:

I'm not sure exactly what happens in your workload, but waiting 20 seconds
when posting some data to an unbounded queue sounds enormous.

--
nosy: +tomMoral


[issue33945] concurrent.futures ProcessPoolExecutor submit() blocks on results being written

2018-06-22 Thread Daniel Barcay


Daniel Barcay added the comment:

Adding the concurrent.futures experts bquinlan and pitrou to the nosy list,
as per the bug tracker directions.

--
nosy: +bquinlan, pitrou


[issue33945] concurrent.futures ProcessPoolExecutor submit() blocks on results being written

2018-06-22 Thread Daniel Barcay


Daniel Barcay added the comment:

The line number was incorrect due to local edits.

The correct line is process.py:464: "self._result_queue.put(None)"

--


[issue33945] concurrent.futures ProcessPoolExecutor submit() blocks on results being written

2018-06-22 Thread Daniel Barcay


New submission from Daniel Barcay:

I have tracked down the exact cause of a sizable performance issue when using
concurrent.futures.ProcessPoolExecutor, especially visible when large amounts
of data are returned as results.

The line causing the bad behavior and several remediation paths are included
below. Since this affects core behavior of the module, I'm reluctant to try a
patch myself unless someone chimes in on the approach.

---Bug Symptoms:
  ProcessPoolExecutor.submit() hangs non-deterministically for long periods of
time (over 20 seconds in my job). See the Bug Cause section below for the
exact cause.
  This hanging makes multiprocess job submission impossible from a
real-time-constrained main thread when the results are large objects.

---Ideal behavior:
   submit() should not block on the results of other jobs; a non-blocking wake
signal should be used instead of a blocking put() call.

---Bug Cause:
In ProcessPoolExecutor.submit(), at line 473, a wake signal is sent to the
management thread by posting a message to the result queue, waking that thread
if it is blocked in recv().

I'm not even sure that this wake-up is necessary, as removing it seems to work
just fine for my use case on OS X. However, let's presume that it is for the
time being.

The fact that submit() blocks on the result_queue being serviced is
unnecessary, and it interferes with large results being sent back through the
Future.result() channel.
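
To make this concrete, here is a rough, self-contained demo of the mechanism
as I understand it; the payload size and sleep durations are arbitrary, and
the threads are only stand-ins for the executor's management thread and a
worker:

import multiprocessing as mp
import threading
import time

q = mp.SimpleQueue()

def slow_reader():
    # Stand-in for a queue management thread that is busy for a while
    # before it starts draining results.
    time.sleep(2.0)
    while q.get() is not None:
        pass

def big_writer():
    # Stand-in for a worker returning a ~100 MB result.
    q.put("x" * (100 * 1024 * 1024))

if __name__ == "__main__":
    threading.Thread(target=slow_reader).start()
    threading.Thread(target=big_writer).start()
    time.sleep(0.5)              # let the big write start and fill the pipe
    start = time.perf_counter()
    q.put(None)                  # the same kind of "wake-up" put submit() does
    print("wake-up put took %.2fs" % (time.perf_counter() - start))

The small put(None) cannot complete until the slow reader has drained the
large payload, which mirrors what submit() runs into.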

---Possible remediations:

If a more fully-fledged Queue implementation were used, this signal could be
replaced by a non-blocking version. Alternatively, the multiprocessing.Queue
implementation could be extended to implement a non-blocking put().
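
For what it's worth, multiprocessing.Queue (unlike the SimpleQueue the
executor uses for results) already offers put_nowait(), and its put() hands
pickling and the pipe write to a background feeder thread. A rough sketch of
that flavor of remediation, with made-up names:

import multiprocessing as mp
import queue

# Hypothetical dedicated wake-up channel, separate from the result queue.
wakeup_queue = mp.Queue()

def wake_management_thread():
    try:
        # Returns immediately: the feeder thread does the serialization and
        # the pipe write, so this never waits behind a large in-flight result.
        wakeup_queue.put_nowait(None)
    except queue.Full:
        # Only possible for a bounded queue; a pending wake-up is enough.
        pass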


--- Reproduction Details
  I'm using concurrent.futures.ProcessPoolExecutor for a complicated
data-processing use case where the result is a large object sent back across
the result() channel. Create any such setup where the results are on the order
of 50 MB strings, submit 5-10 jobs at a time, and watch the time it takes to
call submit().
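
A sketch of such a setup, based on the description above (the worker count,
result size, and number of jobs are illustrative):

import time
from concurrent.futures import ProcessPoolExecutor

def big_result(i):
    # Each job returns a ~50 MB string.
    return "x" * (50 * 1024 * 1024)

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as executor:
        futures = []
        for i in range(10):
            start = time.perf_counter()
            futures.append(executor.submit(big_result, i))
            print("submit #%d took %.3fs" % (i, time.perf_counter() - start))
        for f in futures:
            f.result()

The per-call timings printed for submit() are where the stall shows up.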

--
components: Extension Modules
messages: 320257
nosy: dbarcay
priority: normal
severity: normal
status: open
title: concurrent.futures ProcessPoolExecutor submit() blocks on results being 
written
type: performance
versions: Python 3.6
