[issue27422] Deadlock when mixing threading and multiprocessing

2016-07-05 Thread R. David Murray

R. David Murray added the comment:

Heh, yeah.  What I was really trying to do by that comment was clarify for any 
*other* readers that stumble on this issue later that it is just the Python 
code that *has* to be constrained by the GIL.  I have no idea how much of the 
scipy stack drops the GIL at strategic spots.  I do seem to remember that 
Jupyter uses multiple processes for its parallelism, though.  Anyway, this is 
pretty off topic now :)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue27422] Deadlock when mixing threading and multiprocessing

2016-07-04 Thread Davin Potts

Changes by Davin Potts :


--
stage:  -> resolved
status: open -> closed




[issue27422] Deadlock when mixing threading and multiprocessing

2016-07-04 Thread Davin Potts

Davin Potts added the comment:

@r.david.murray:  Oh man, I was not going to go as far as advocating dropping 
the GIL.  :)

At least not in situations like this where the exploitable parallelism is meant 
to be at the Python level and not inside the Fortran code (or that was my 
understanding of the setup).  Martin had already mentioned the motivation to 
fork to avoid side effects possibly arising somewhere in that code.

In practice, after dropping the GIL the threads will likely use several of the 
cores.  That is up to the OS kernel scheduler, but it is what I've observed 
happening after temporarily dropping the GIL on both Windows and Linux systems.

As to the benefit of CPU affinity, it depends -- it depends upon what my code 
was and what the OS and other system processes were busily doing at the time my 
code ran -- but I've never seen it hurt performance (even if the help was 
diminishingly small at times).  For certain situations, it has been worth doing.


Correction:  I have seen cpu affinity hurt performance when I make a 
bone-headed mistake and constrain too many things onto too few cores.  But 
that's a PEBCAK root cause.




[issue27422] Deadlock when mixing threading and multiprocessing

2016-07-04 Thread R. David Murray

R. David Murray added the comment:

To clarify the GIL issue (for Davin, I guess? :): if the library you are using 
to interface with the FORTRAN code drops the GIL before calling the FORTRAN, 
then you *can* take advantage of multiple cores.  It is only the Python code 
(and some of the code interacting with the Python objects) that is limited to 
executing on one core at a time.  (As far as I know it isn't restricted to be 
the *same* core unless you set CPU affinity somehow, and I have no idea whether 
using CPU affinity improves performance.)
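
For readers who want to try the CPU affinity idea mentioned above, here is a minimal sketch. It assumes a platform where Python exposes `os.sched_getaffinity` and `os.sched_setaffinity` (Linux, for example); the helper name `pin_to_one_cpu` is hypothetical, and nothing here says whether pinning helps performance.

```python
import os


def pin_to_one_cpu():
    """Temporarily restrict this process to one CPU, then restore the old mask.

    Returns the affinity set that was in effect while pinned, or None when
    the platform does not support (or refuses) changing the affinity.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # assumption: no affinity API here (e.g. macOS, Windows)
    try:
        original = os.sched_getaffinity(0)  # 0 means "the calling process"
        os.sched_setaffinity(0, {min(original)})
    except OSError:
        return None  # the OS refused the request
    try:
        return os.sched_getaffinity(0)
    finally:
        # Restore the original mask so later code is unaffected.
        os.sched_setaffinity(0, original)
```

While pinned, all threads of the process compete for that one CPU, which is the "too many things onto too few cores" situation if overdone.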




[issue27422] Deadlock when mixing threading and multiprocessing

2016-07-04 Thread Martin Ritter

Martin Ritter added the comment:

Dear Davin, 

Thanks for the input, I was perfectly aware that the "solution" I proposed is 
not realistic. But the feedback that multiprocessing is using threads 
internally is useful as I can quickly abandon the idea to do something like the 
check I proposed in our code base without spending time on it. 

I was aware of the GIL, I just did not anticipate that big a problem when 
mixing threads and processes with rather simple Python code. My bad, sorry for 
the noise. 

Cheers, 

Martin




[issue27422] Deadlock when mixing threading and multiprocessing

2016-07-04 Thread Davin Potts

Davin Potts added the comment:

While I believe I understand the motivation behind the suggestion to detect 
when the code is doing something potentially dangerous, I'll point out a few 
things:
* any time you ask for a layer of convenience, you must choose something to 
sacrifice to get it (usually performance is sacrificed) and this sacrifice will 
affect all code (including non-problematic code)
* behind the scenes multiprocessing itself is employing multiple threads in the 
creation and coordination between processes, so "checking to see if there are 
multiple threads active on process creation" is a more complicated request 
than it may first appear
* Regarding "python makes it very easy to mix these two", I'd say it's nearly 
as easy to mix the two in C code; the common pattern across different 
languages is to learn the pros+cons+gotchas of working with processes and 
threads

I too come from the world of scientific software and the mixing of Fortran, 
C/C++, and Python (yay science and yay Fortran) so I'll make another point 
(apologies if you already knew this):
There's a lot of computationally intensive code in scientific code/applications 
and being able to perform those computations in parallel is a wonderful thing.  
I am unsure if the tests you're trying to speed up exercise compute-intensive 
functions but let's assume they do.  For reasons not described here, using the 
CPython implementation, there is a constraint on the use of threads that 
restricts them to all run on a single core of your multi-core cpu (and on only 
one cpu if you have an SMP system).  Hence spinning up threads to perform 
compute intensive tasks will likely result in no better throughput (no speedup) 
because they're all fighting over the same maxed-out core.  To spread out onto 
and take advantage of multiple cores (and multiple cpus on an SMP system) you 
will want to switch to creating processes (as you say you now have).  I'd make 
the distinction that you are likely much more interested in 'parallel 
computing' than 'concurrent execution'.  Since you're already using 
multiprocessing you might also simply use `multiprocessing.Pool`.
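
A minimal sketch of the `multiprocessing.Pool` suggestion, under some assumptions: the "fork" start method is requested explicitly (POSIX only), and `sum_of_squares` and `parallel_map` are hypothetical names standing in for real compute-intensive work.

```python
import multiprocessing as mp


def sum_of_squares(n):
    """Stand-in for a compute-intensive function (hypothetical workload)."""
    return sum(i * i for i in range(n))


def parallel_map(func, inputs, workers=2):
    """Apply func to every input in separate worker processes.

    Results come back in input order. func and its arguments must be
    picklable, because they are sent to the workers over a pipe.
    """
    ctx = mp.get_context("fork")  # assumption: POSIX; not available on Windows
    with ctx.Pool(processes=workers) as pool:
        return pool.map(func, inputs)
```

Calling something like `parallel_map(sum_of_squares, [10**6, 2 * 10**6])` from under an `if __name__ == "__main__":` guard lets each input be computed in its own process, so the work is not serialized by the parent's GIL.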




[issue27422] Deadlock when mixing threading and multiprocessing

2016-07-04 Thread Martin Ritter

Martin Ritter added the comment:

I agree that this is error prone and cannot be fixed reliably on the Python 
side. However, Python makes it very easy to mix these two: a user might not 
even notice that a function they call uses fork, and thus just use a 
ThreadPoolExecutor() because it's the simplest thing to do.

A nice solution, in my opinion, would be for the multiprocessing module to 
check whether there are already multiple threads active on process creation 
and issue a warning if so. This warning could of course be optional but would 
make this issue more obvious.
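
The kind of check proposed here could be sketched in user code roughly as follows. This is not an existing multiprocessing feature; `warn_if_threads_active` is a hypothetical helper, and it is deliberately naive because it counts every thread started through the `threading` module, including any that libraries start behind the scenes.

```python
import threading
import warnings


def warn_if_threads_active():
    """Warn when more than one thread is alive in the current process.

    Intended to be called just before creating a child process.
    Returns True if a warning was issued, False otherwise.
    """
    count = threading.active_count()  # includes the main thread
    if count > 1:
        warnings.warn(
            f"creating a process while {count} threads are active; "
            "locks held by other threads may be copied in a locked state",
            RuntimeWarning,
            stacklevel=2,
        )
        return True
    return False
```

A caller would invoke it right before `multiprocessing.Process(...).start()`. Threads created outside the `threading` module (by C extensions, for instance) are not counted, so a silent result is not proof of safety.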

In my case we have a large C++ code base which still includes a lot of Fortran 
77 code with common blocks all over the place (yay science). Everything is 
interfaced in Python, so to make sure that I do not have any side effects I run 
some of the functions in a fork using multiprocessing.Process(). And in 
this case I just wanted to run some testing in parallel. I have now switched to 
a ProcessPoolExecutor, which works fine for me.
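
The "run it in a fork to avoid side effects" pattern described above might look roughly like this. It is a sketch only: `run_in_fork` and `_call_and_send` are hypothetical names, the "fork" start method is assumed (POSIX only), and the return value must be picklable to travel back through the pipe.

```python
import multiprocessing as mp


def _call_and_send(conn, func, args):
    """Child side: run func and send back either the result or the error."""
    try:
        conn.send(("ok", func(*args)))
    except Exception as exc:
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def run_in_fork(func, *args):
    """Run func(*args) in a forked child so its side effects stay there."""
    ctx = mp.get_context("fork")  # assumption: POSIX; not available on Windows
    parent_conn, child_conn = ctx.Pipe(duplex=False)  # (receive end, send end)
    proc = ctx.Process(target=_call_and_send, args=(child_conn, func, args))
    proc.start()
    child_conn.close()  # parent keeps only the receiving end
    try:
        # Raises EOFError if the child dies before sending anything.
        status, value = parent_conn.recv()
    finally:
        proc.join()
        parent_conn.close()
    if status == "error":
        raise RuntimeError(value)
    return value
```

Any module-level state the function mutates (the Python analogue of a Fortran common block) changes only in the child's copy of memory. As the rest of this thread explains, this is only safe when no other thread in the parent holds a lock at the moment of the fork.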




[issue27422] Deadlock when mixing threading and multiprocessing

2016-07-02 Thread Raymond Hettinger

Raymond Hettinger added the comment:

FWIW, this isn't even a Python specific behavior.  It is just how threads, 
locks, and processes work (or in this case don't work).  The code is doing what 
it is told to do which happens to not be what you want (i.e. a user bug rather 
than a Python bug).

I think a FAQ entry would be a reasonable place to mention this (it comes up 
more often than one would hope).

--
assignee:  -> docs@python
components: +Documentation
nosy: +docs@python
resolution:  -> not a bug




[issue27422] Deadlock when mixing threading and multiprocessing

2016-07-02 Thread Davin Potts

Davin Potts added the comment:

It would be nice to find an appropriate place to document the solid general 
guidance Raymond provided; though merely mentioning it somewhere in the docs 
will not translate into it being noticed.  Not sure where to put it just yet...

Martin:  Is there a specific situation that prompted your discovering this 
behavior?  Mixing the spinning up of threads with the forking of processes 
requires appropriate planning to avoid problems and achieve desired 
performance.  If you have a thoughtful design to your code but are still 
triggering problems, can you share more of the motivation?


As a side note, this is more appropriately labeled as a 'behavior' rather than 
a 'crash' -- the Python executable does not crash in any way but merely hangs 
in an apparent lock contention.

--
nosy: +davin
type: crash -> behavior




[issue27422] Deadlock when mixing threading and multiprocessing

2016-06-30 Thread Raymond Hettinger

Raymond Hettinger added the comment:

It is in fact problem prone (and not just in Python).  The rule is "thread 
after you fork, not before".  Otherwise, the locks used by the thread executor 
will get duplicated across processes.  If one of those processes dies while it 
has the lock, all of the other processes using that lock will deadlock.
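
The lock duplication can be observed with a small experiment. The sketch below assumes a POSIX system (it calls `os.fork` directly) and uses a hypothetical helper name; the child tries to acquire with a timeout so that it reports the problem instead of hanging.

```python
import os
import threading


def child_sees_copied_lock_state():
    """Fork while a second thread holds a lock; report what the child sees.

    Returns the child's exit code: 0 if the child could acquire the lock,
    1 if its copy of the lock was still held.
    """
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():
        with lock:
            held.set()      # tell the main thread the lock is now taken
            release.wait()  # keep holding it until told to stop

    thread = threading.Thread(target=holder)
    thread.start()
    held.wait()

    pid = os.fork()  # only the calling thread is copied into the child
    if pid == 0:
        # Child: the holder thread does not exist here, so nothing will ever
        # release this copy of the lock. Use a timeout instead of blocking.
        acquired = lock.acquire(timeout=0.2)
        os._exit(0 if acquired else 1)

    release.set()
    thread.join()
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

An acquire without a timeout in the child would block forever, which is the hang reported in this issue. Forking before any threads are started avoids ever copying a held lock.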

--
nosy: +rhettinger




[issue27422] Deadlock when mixing threading and multiprocessing

2016-06-30 Thread R. David Murray

R. David Murray added the comment:

Mixing multiprocessing and threading is problem prone in general.  Hopefully 
one of the multiprocessing experts can say if this is a known problem or not...

--
nosy: +devin, r.david.murray, sbt




[issue27422] Deadlock when mixing threading and multiprocessing

2016-06-30 Thread Martin Ritter

Martin Ritter added the comment:

I attached a gdb backtrace of one of the child processes

--
Added file: http://bugs.python.org/file43589/test_threadfork_backtrace.txt




[issue27422] Deadlock when mixing threading and multiprocessing

2016-06-30 Thread Martin Ritter

New submission from Martin Ritter:

When creating a multiprocessing.Process in a threaded environment I get 
deadlocks, I guess while waiting for the lock to flush the output.

I attached a minimal example of the problem which hangs for me starting with 4 
threads.

--
files: test_threadfork.py
messages: 269593
nosy: Martin Ritter
priority: normal
severity: normal
status: open
title: Deadlock when mixing threading and multiprocessing
type: crash
versions: Python 3.5
Added file: http://bugs.python.org/file43588/test_threadfork.py
