Sebastian Kreft added the comment:
Disregard the last messages. It seems to be a deadlock due to subprocess.
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20319
___
Changes by Glenn Langford glenn.langf...@gmail.com:
--
nosy: -glangford
Sebastian Kreft added the comment:
@glangford: Is that really your recommendation, to switch to celery? Python
3.4.1 should be production quality and issues like this should be addressed.
Note that I've successfully run millions of tasks using the same method, the
only difference being that
Sebastian Kreft added the comment:
Any ideas how to debug this further?
In order to overcome this issue I have an awful workaround that tracks the
maximum running time of a successful task, and if any task has been running
more than x times that maximum I consider it defunct, and increase the
Changes by STINNER Victor victor.stin...@gmail.com:
--
nosy: -haypo
Glenn Langford added the comment:
Any ideas how to debug this further?
Wherever the cause of the problem might live, and to either work around it or
gain additional information, here is one idea to consider.
Do you need to submit your Futures just two at a time, and tightly loop every
15s?
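For readers following along, the pattern under discussion (a small, fixed number of tasks in flight, polled with a timeout) could look roughly like the sketch below. It is an illustration under assumptions, not the reporter's code: `run_bounded`, `max_in_flight`, and the task callables are hypothetical names.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_bounded(tasks, max_in_flight=2, timeout=15.0):
    """Run callables from `tasks`, keeping at most `max_in_flight` pending."""
    results = []
    pending = set()
    task_iter = iter(tasks)
    exhausted = False
    with ThreadPoolExecutor(max_workers=max_in_flight) as executor:
        while True:
            # Top up the pending set until the limit is reached.
            while not exhausted and len(pending) < max_in_flight:
                try:
                    task = next(task_iter)
                except StopIteration:
                    exhausted = True
                else:
                    pending.add(executor.submit(task))
            if not pending:
                break
            # wait() returns when a future finishes or the timeout expires.
            done, pending = wait(pending, timeout=timeout,
                                 return_when=FIRST_COMPLETED)
            for future in done:
                results.append(future.result())
    return results
```

Called as `run_bounded([lambda i=i: i * i for i in range(10)], timeout=1.0)`, it collects the ten results in completion order. If wait() keeps timing out with nothing in `done`, the outer loop spins without progress, which matches the symptom reported in this thread.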
Sebastian Kreft added the comment:
I'm actually running millions of tasks, so sending them all at once will
consume much more resources than needed.
The issue happens not only with 2 tasks in parallel but with higher numbers
as well.
Also, your proposed solution has the problem that when you
Glenn Langford added the comment:
Under the hood, the behaviour of as_completed is quite different. So there is
no guarantee it would behave the same.
In any event, with millions of tasks you might consider Celery (I haven't used
it myself):
http://www.celeryproject.org
--
Sebastian Kreft added the comment:
I was able to recreate the issue again, and now I have some info about the
offending futures:
State: RUNNING, Result: None, Exception: None, Waiters: 0, Cancelled: False,
Running: True, Done: False
The information does not seem very relevant. However, I can
Sebastian Kreft added the comment:
The Executor is still working (but I'm using a ThreadPoolExecutor). I can
dynamically change the number of max tasks allowed, which successfully fires
the new tasks.
After 2 days running, five tasks are in this weird state.
I will change the code as
Sebastian Kreft added the comment:
@haypo: I've reproduced the issue with both 2 and 3 processes in parallel.
@glangford: the wait is actually returning after the 15 seconds, although
nothing is reported as finished. So, it's getting stuck in the while loop.
However, I imagine that without
Glenn Langford added the comment:
the wait is actually returning after the 15 seconds, although nothing is
reported as finished... What kind of debug information from the futures would
be useful?
What is the state of the pending Futures that wait() is stuck on? (e.g. display
f.running()
Sebastian Kreft added the comment:
I'm using Python 3.4.1 compiled from source and I may be hitting this
issue.
My workload is launching two subprocesses in parallel, and whenever one is ready,
launches another one. In one of the runs, the whole process got stuck after
launching about 3K
STINNER Victor added the comment:
the whole process got stuck after launching about 3K subprocess
How many processes are running at the same time when the whole process is stuck?
Glenn Langford added the comment:
My workload is launching two subprocess in parallel, and whenever one is
ready, launches another one.
Since you have timeout=15.0, wait() should return at least every 15s. Can you
determine if the wait is being repeatedly called in the while loop, and if so
Brian Quinlan added the comment:
Thanks very much for the patch Glenn!
--
status: open -> closed
Roundup Robot added the comment:
New changeset 0bcf23a52d55 by Brian Quinlan in branch 'default':
Issue #20319: concurrent.futures.wait() can block forever even if Futures have
completed
http://hg.python.org/cpython/rev/0bcf23a52d55
--
nosy: +python-dev
STINNER Victor added the comment:
New changeset 0bcf23a52d55 by Brian Quinlan in branch 'default':
Issue #20319: concurrent.futures.wait() can block forever even if Futures
have completed
http://hg.python.org/cpython/rev/0bcf23a52d55
Hum, the change also contains:
+Fix warning message
Brian Quinlan added the comment:
Oops, no. That was junk due to my sloppiness. I’ll fix it in a minute.
On Jan 31, 2014, at 5:03 PM, STINNER Victor rep...@bugs.python.org wrote:
STINNER Victor added the comment:
New changeset 0bcf23a52d55 by Brian Quinlan in branch 'default':
Issue
Antoine Pitrou added the comment:
Shouldn't it be fixed in 3.3 too?
--
nosy: +pitrou
Glenn Langford added the comment:
An idea for a different possible fix - rather than cleaning up waiters in
wait() and as_completed(), could they be removed in Future.set_result() and
Future.set_exception()?
I'm not certain if any waiter should ever be notified twice; if not, perhaps
Changes by Brian Quinlan br...@sweetapp.com:
--
assignee:  -> bquinlan
Glenn Langford added the comment:
Revised patch; I don't think there is a need to sort the keys when waiters are
being removed since only one lock is acquired at a time. Stress tests on both
wait() and as_completed() work with this approach.
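A minimal sketch of that removal approach, using toy stand-ins rather than the real concurrent.futures internals: `ToyFuture`, `ToyWaiter`, and `remove_waiter` are hypothetical names chosen for illustration. Each future's lock is acquired and released before moving on to the next, so no two locks are ever held together and lock ordering cannot matter.

```python
import threading


class ToyWaiter:
    """Stand-in for the waiter object installed on each future."""

    def __init__(self):
        self.event = threading.Event()


class ToyFuture:
    """Stand-in for a Future: a condition guarding a list of waiters."""

    def __init__(self):
        self._condition = threading.Condition()
        self._waiters = []


def remove_waiter(futures, waiter):
    # Acquire one future's lock at a time; with a single lock held,
    # no ordering (and hence no sorting of futures) is needed.
    for future in futures:
        with future._condition:
            future._waiters.remove(waiter)
```

Holding the lock during removal means a completing future cannot be walking its waiter list while the list is being modified, which is the race this thread attributes the hang to.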
--
Added file:
Glenn Langford added the comment:
@Brian - Ah, I see now what you are referring to. The patch has changes to
_create_and_install_waiters() which should not be there. The only code that
needs to change is waiter removal as I originally suggested. I am set up with a
dev environment now and will
Glenn Langford added the comment:
This patch shows the minimal desired outcome. It is not elegant in its current
form, but does only what is necessary. Ultimately I think as_completed() should
go its own way and not lock all Futures at once (#20297).
--
Added file:
Brian Quinlan added the comment:
I'm looking at futures.patch.
I don't understand why these blocks are helpful: _create_and_install_waiters
has two call sites and both (as_completed and wait) call
_create_and_install_waiters from within an _AcquireFutures context manager:
-
Glenn Langford added the comment:
It seems more plausible that the locks around the removals are fixing the bug
but I don't see how. I'll look into it some more.
It is the locks around the waiter removals that matter; I think there are only
formatting changes elsewhere in the patch. The
Changes by Mark Dickinson dicki...@gmail.com:
--
nosy: +mark.dickinson
Mark Dickinson added the comment:
Adding Tim Peters to the nosy, since I suspect he has a general interest in
this kind of issue. As far as I know Brian Quinlan isn't actively maintaining
concurrent.futures at the moment (Brian: please correct me if I'm wrong).
--
Changes by Mark Dickinson dicki...@gmail.com:
--
nosy: +tim.peters
Changes by STINNER Victor victor.stin...@gmail.com:
--
nosy: +sbt
STINNER Victor added the comment:
@Glenn: Can you maybe suggest a patch fixing the issue?
--
nosy: +haypo
Glenn Langford added the comment:
@Victor: I would like to give a patch, but I am not a core developer and I don't
have version control set up yet. The proposed fix is based on reading the
distribution source code.
STINNER Victor added the comment:
futures.patch: reuse _AcquireFutures context manager to protect the list of
futures against concurrent access.
Without the patch, stress_wait.py hangs. With the patch, it works correctly.
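The kind of stress test described in this issue might be sketched as follows. This is not the attached stress_wait.py; `stress`, `n_threads`, `n_rounds`, and `n_futures` are hypothetical, and the sketch only assumes that many threads call wait() on the same futures while those futures complete.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait


def stress(n_threads=8, n_rounds=50, n_futures=4):
    """Call wait() on shared futures from many threads; return rounds finished."""
    finished = 0
    with ThreadPoolExecutor(max_workers=n_futures) as executor:
        for _ in range(n_rounds):
            futures = [executor.submit(pow, 2, 10) for _ in range(n_futures)]
            # Every thread waits on the same set of futures concurrently.
            threads = [threading.Thread(target=wait, args=(futures,),
                                        daemon=True)
                       for _ in range(n_threads)]
            for t in threads:
                t.start()
            for t in threads:
                # A join that times out would indicate a stuck wait().
                t.join(timeout=30)
            if any(t.is_alive() for t in threads):
                break
            finished += 1
    return finished
```

On an interpreter that includes the fix, `stress()` should return `n_rounds`; a smaller return value would mean some wait() call never came back within the join timeout.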
--
keywords: +patch
Added file:
Brian Quinlan added the comment:
I'm not currently working on concurrent.futures but I do look at patches and
bug reports. I'll take a look at this and Issue20297 sometime this week.
Glenn Langford added the comment:
The same bug also exists in concurrent.futures.as_completed(). The minimal fix
suggested here also works, but the bigger fix suggested in issue #20297 is
recommended for as_completed().
Changes by Glenn Langford glenn.langf...@gmail.com:
--
nosy: +bquinlan
New submission from Glenn Langford:
concurrent.futures.wait() can get into a state where it blocks forever on
waiter.event.wait(), even when the underlying Futures have completed.
This is demonstrated in a stress test where a large number of wait() calls are
run in multiple threads,