Just to be clear: I am in favour of the strong reference, but I also
understand the "danger" (leaking the pool until the pool's generation
reaches the threshold and the gc runs), and that is the reason I was
experimenting with the weak reference.

On Tue, 11 Dec 2018 at 18:37, Paul Moore <p.f.mo...@gmail.com> wrote:

> On Tue, 11 Dec 2018 at 17:50, Pablo Galindo Salgado <pablog...@gmail.com>
> wrote:
> > I agree that misuse of the pool should not be encouraged, but in this
> > situation the fact that this code hangs:
> >
> >     import multiprocessing
> >
> >     for x in multiprocessing.Pool().imap(int, ["4", "3"]):
> >         print(x)
> >
> > is a bit worrying because, although it is incorrect and an abuse of
> > the API, users can do this easily with no error message other than a
> > mysterious hang.
>
> OK, so the first problem here (to me, at least) is that it's not
> obvious *why* this code is incorrect and an abuse of the API. It takes
> a reasonable amount of thinking about the problem to notice that the
> Pool object isn't retained, but the iterator returned from imap is.
> And when the pool is collected, the worker processes are terminated,
> causing the hang, as the worker never sends a result back to the main
> process. But it's not obvious to me why the pool is collected before
> the imap method has completed.
>
> As I understand it, originally the code worked because the pool
> *didn't* call terminate() when collected. Now it does, and we have a
> problem. I'm not *entirely* sure why, if the pool is terminated, the
> wait in the iterator doesn't terminate immediately with some sort of
> "process being waited on died" error, but let's assume there are good
> reasons for that (as I mentioned before, I'm not an expert in
> multiprocessing, so I'm OK with assuming that the original design,
> done by people who *are* experts, is sound :-))
>
> Your original solution would have added a strong reference back to the
> pool from the iterator. At first glance, that seems like a reasonable
> solution to me. Victor is worried about the "risk of new reference
> cycles". But reference cycles are not a problem - we have the cyclic
> GC precisely to deal with them. So I'd like to see a better
> justification for rejecting that solution than "there might be
> reference cycles". But in response to that, you made the iterator have
> a weak reference back to the pool. That's flawed because it doesn't
> prevent the pool being terminated - as you say, the deadlock is still
> present.
>
> > I have found this in several places and people were very confused,
> > because usually the interpreter throws some kind of error indication.
> > In my humble opinion, we should try to avoid hanging as a consequence
> > of the misuse, whatever we do.
>
> I agree with this. But that implies to me that we should be holding a
> strong reference to the pool.
>
> As a (somewhat weak) analogy, consider
>
>     for n in map(int, ["1", "2"]):
>         print(n)
>
> That won't fail if the list gets collected, because map keeps a
> reference to the list. My intuition would be that the Pool().imap
> example would hold a reference to the pool on essentially the same
> basis.
>
> The more I think about this, the more I struggle to see Victor's logic
> for rejecting your original solution. And I *certainly* don't see why
> this issue should justify changing the whole API to require users to
> explicitly manage pool lifetimes.
>
> Paul
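
For anyone following along, the difference between the two approaches discussed above can be sketched with a toy model. This is only an illustration, not the multiprocessing code: ToyPool, ToyResultIterator, results() and keep_pool_alive are hypothetical names made up for this example, and nothing here starts worker processes. It only shows that an iterator holding a strong reference keeps a temporary owner alive, while one holding just a weak reference does not.

```python
import gc
import weakref


class ToyPool:
    """Hypothetical stand-in for a pool; this is not multiprocessing.Pool."""

    def results(self, items, keep_pool_alive):
        # Hand out an iterator, as imap() does, optionally pinning the pool.
        return ToyResultIterator(self, items, keep_pool_alive)


class ToyResultIterator:
    def __init__(self, pool, items, keep_pool_alive):
        self._items = iter(items)
        # Strong reference: the pool lives at least as long as the iterator.
        self._pool = pool if keep_pool_alive else None
        # Weak reference: lets us observe the pool without keeping it alive.
        self._pool_ref = weakref.ref(pool)

    def pool_is_alive(self):
        return self._pool_ref() is not None

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._items)


# In both cases the pool is a temporary, as in Pool().imap(...).
strong_it = ToyPool().results(["4", "3"], keep_pool_alive=True)
weak_it = ToyPool().results(["4", "3"], keep_pool_alive=False)
gc.collect()  # not needed on CPython; for implementations without refcounting

print(strong_it.pool_is_alive())
print(weak_it.pool_is_alive())
```

On CPython this should print True and then False: with only the weak reference, the temporary pool is gone as soon as results() returns, which is the window in which the real pool would be terminated.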
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com