Re: [HACKERS] Assorted leaks and weirdness in parallel execution
On Thu, Aug 31, 2017 at 2:13 PM, Tom Lane wrote:
> Yeah, it is different.  What I'm looking at is that nodeGather does
> DestroyTupleQueueReader as soon as it's seen EOF on a given tuple queue.
> That can't save any worker cycles.  The reason seems to be that it wants
> to collapse its array of TupleQueueReader pointers so only live queues are
> in it.  That's reasonable, but I'm inclined to implement it by making the
> Gather node keep a separate working array of pointers to only the live
> TupleQueueReaders.  The ParallelExecutorInfo would keep the authoritative
> array of all TupleQueueReaders that have been created, and destroy them in
> ExecParallelFinish.

Hmm, that's a thought.

> Your point is that we want to shut down the TupleQueueReaders immediately
> on rescan, which we do already.  Another possible scenario is to shut them
> down once we've reached the passed-down tuple limit (across the whole
> Gather, not per-child which is what 3452dc524 implemented).  I don't think
> what I'm suggesting would complicate that.

Yeah.  I think the way to do that would be to implement what is mentioned
in the comment for ExecShutdownNode: call that function on the child plan
as soon as the LIMIT is filled.  (Hmm, the reference to someday covering
FDW in the header of that comment is obsolete, isn't it?  Another
oversight on my part.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Assorted leaks and weirdness in parallel execution
Robert Haas writes:
> On Thu, Aug 31, 2017 at 11:09 AM, Tom Lane wrote:
>> (With this patch, there are no callers of shm_mq_get_queue(); should we
>> remove that?)

> May as well.  I can't remember any more why I did shm_mq_detach() that
> way; I think there was someplace where I thought that the shm_mq_handle
> might not be available.  Maybe I'm misremembering, or perhaps the
> situation has changed as that code has evolved.

I initially tried to convert the on_dsm_detach callback to take a pointer
to the shm_mq_handle rather than the shm_mq proper.  That caused
regression test crashes in some processes, indicating that there are
situations where we have freed the shm_mq_handle before the DSM detach
happens.  I think it was only during worker process exit.  That's sort of
contrary to the advice in shm_mq.c about the desirable lifespan of a
shm_mq_handle, but I didn't feel like trying to fix it.  It seems
generally more robust if the on_dsm_detach callback assumes as little as
possible about intra-process state, anyway.

I don't have any strong reason to remove shm_mq_get_queue(), other than
neatnik-ism.  It might save a caller having to remember the shm_mq pointer
separately.  Given the set of API functions, that would only matter if
somebody wanted to set/get the sender/receiver PGPROC pointers later, but
maybe that's a plausible thing to do.

>> It seems like a significant modularity violation that execParallel.c
>> is responsible for creating those shm_mqs but not for cleaning them up.

> Yeah, the correct division of labor between execParallel.c and
> nodeGather.c was not entirely clear to me, and I don't pretend that I
> got that 100% right.

OK, I'll have a go at that.

>> (That would make it more difficult to do the early reader destruction
>> that nodeGather currently does, but I am not sure we care about that.)
> I think the only thing that matters here is -- if we know that we're
> not going to read any more tuples from a worker that might still be
> generating tuples, it's imperative that we shut it down ASAP.
> Otherwise, it's just going to keep cranking them out, wasting
> resources unnecessarily.  I think this is different than what you're
> talking about here, but just wanted to be clear.

Yeah, it is different.  What I'm looking at is that nodeGather does
DestroyTupleQueueReader as soon as it's seen EOF on a given tuple queue.
That can't save any worker cycles.  The reason seems to be that it wants
to collapse its array of TupleQueueReader pointers so only live queues are
in it.  That's reasonable, but I'm inclined to implement it by making the
Gather node keep a separate working array of pointers to only the live
TupleQueueReaders.  The ParallelExecutorInfo would keep the authoritative
array of all TupleQueueReaders that have been created, and destroy them in
ExecParallelFinish.

Your point is that we want to shut down the TupleQueueReaders immediately
on rescan, which we do already.  Another possible scenario is to shut them
down once we've reached the passed-down tuple limit (across the whole
Gather, not per-child which is what 3452dc524 implemented).  I don't think
what I'm suggesting would complicate that.

			regards, tom lane
Re: [HACKERS] Assorted leaks and weirdness in parallel execution
On Thu, Aug 31, 2017 at 11:09 AM, Tom Lane wrote:
> I complained a couple weeks ago that nodeGatherMerge looked like it
> leaked a lot of memory when commanded to rescan.  Attached are three
> proposed patches that, in combination, demonstrably result in zero
> leakage across repeated rescans.

Gosh, thanks for looking into this so deeply.  I apologize for all of
the bugs.  Also, my ego is taking some severe damage here.

> But it's going to crash and burn someday.

Yeah, ouch.

> (With this patch, there are no callers of shm_mq_get_queue(); should we
> remove that?)

May as well.  I can't remember any more why I did shm_mq_detach() that
way; I think there was someplace where I thought that the shm_mq_handle
might not be available.  Maybe I'm misremembering, or perhaps the
situation has changed as that code has evolved.

> The last patch fixes the one remaining leak I saw after applying the
> first two patches, namely that execParallel.c leaks the array palloc'd
> by ExecParallelSetupTupleQueues --- just the array storage, not any of
> the shm_mq_handles it points to.  The given patch just adds a pfree
> to ExecParallelFinish, but TBH I find this pretty unsatisfactory.
> It seems like a significant modularity violation that execParallel.c
> is responsible for creating those shm_mqs but not for cleaning them up.

Yeah, the correct division of labor between execParallel.c and
nodeGather.c was not entirely clear to me, and I don't pretend that I
got that 100% right.

> (That would make it more difficult to do the early reader destruction
> that nodeGather currently does, but I am not sure we care about that.)

I think the only thing that matters here is -- if we know that we're
not going to read any more tuples from a worker that might still be
generating tuples, it's imperative that we shut it down ASAP.
Otherwise, it's just going to keep cranking them out, wasting
resources unnecessarily.  I think this is different than what you're
talking about here, but just wanted to be clear.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company