On Tue, Sep 29, 2015 at 12:39 AM, Amit Kapila <amit.kapil...@gmail.com> wrote: > Attached patch is a rebased patch based on latest commit (d1b7c1ff) > for Gather node. > > - I have to reorganize the defines in execParallel.h and .c. To keep > ParallelExecutorInfo, in GatherState node, we need to include execParallel.h > in execnodes.h which was creating compilation issues as execParallel.h > also includes execnodes.h, so for now I have defined ParallelExecutorInfo > in execnodes.h and instrumentation related structures in instrument.h. > - Renamed parallel_seqscan_degree to degree_of_parallelism > - Rename Funnel to Gather > - Removed PARAM_EXEC parameter handling code, I think we can do this > separately. > > I have to work more on partial seq scan patch for rebasing it and handling > review comments for the same, so for now I am sending the first part of > patch (which included Gather node functionality and some general support > for parallel-query execution).
Thanks for the fast rebase. This patch needs a bunch of cleanup: - The formatting for the GatherState node's comment block is unlike that of surrounding comment blocks. It lacks the ------- dividers, and the indentation is not the same. Also, it refers to ParallelExecutorInfo by the type name, but the other members by structure member name. The convention is to refer to them by structure member name, so please do that. - The naming of fs_workersReady is inconsistent with the other structure members. The other members use all lower-case names, separating words with dashes, but this one uses a capital letter. The other members also don't prefix the names with anything, but this uses a "fs_" prefix which I assume is left over from when this was called FunnelState. Finally, this doesn't actually tell you when workers are ready, just whether they were launched. I suggest we rename this to "any_worker_launched". - Instead of moving the declaration of ParallelExecutorInfo, please just refer to it as "struct ParallelExecutorInfo" in execnodes.h. That way, you're not sucking these includes into all kinds of places they don't really need to be. - Let's not create a new PARALLEL_QUERY category of GUC. Instead, let's the GUC for the number of workers with under resource usage -> asynchronous behavior. - I don't believe that shm_toc *toc has any business being part of a generic PlanState node. At most, it should be part of an individual type of PlanState, like a GatherState or PartialSeqScanState. But really, I don't see why we need it there at all. It should, I think, only be needed during startup to dig out the information we need. So we should just dig that stuff out and keep pointers to whatever we actually need - in this case the ParallelExecutorInfo, I think - in the particular type of PlanState node that's at issue - here GatherState. After that we don't need a pointer to the toc any more. - I'd like to do some renaming of the new GUCs. I suggest we rename cpu_tuple_comm_cost to parallel_tuple_cost and degree_of_parallelism to max_parallel_degree. - I think that a Gather node should inherit from Plan, not Scan. A Gather node really shouldn't have a scanrelid. Now, admittedly, if the only thing under the Gather is a Partial Seq Scan, it wouldn't be totally bonkers to think of the Gather as scanning the same relation that the Partial Seq Scan is scanning. But in any more complex case, like where it's scanning a join, you're out of luck. You'll have to set scanrelid == 0, I suppose, but then, for example, ExecScanReScan is not going to work. In fact, as far as I can see, the only way nodeGather.c is actually using any of the generic scan stuff is by calling ExecInitScanTupleSlot, which is all of one line of code. ExecEndGather fetches node->ss.ss_currentRelation but then does nothing with it. So I think this is just a holdover from early version of this patch where what's now Gather and PartialSeqScan were a single node, and I think we should rip it out. - On a related note, the assertions in cost_gather() are both bogus and should be removed. Similarly with create_gather_plan(). As previously mentioned, the Gather node should not care what sort of thing is under it; I am not interested in restricting it to baserels and then undoing that later. - For the same reason, cost_gather() should refer to it's third argument as "rel" not "baserel". - Also, I think this stuff about physical tlists in create_gather_plan() is bogus. use_physical_tlist is ignorant of the possibility that the RelOptInfo passed to it might be anything other than a baserel, and I think it won't be happy if it gets a joinrel. Moreover, I think our plan here is that, at least for now, the Gather's tlist will always match the tlist of its child. If that's so, there's no point to this: it will end up with the same tlist either way. If any projection is needed, it should be done by the Gather node's child, not the Gather node itself. - Let's rename DestroyParallelSetupAndAccumStats to ExecShutdownGather. Instead of encasing the entire function in if statement, let's start with if (node->pei == NULL || node->pei->pcxt == NULL) return. - ExecParallelBufferUsageAccum should be declared to take an argument of type PlanState, not Node. Then you don't have to cast what you are passing to it, and it doesn't have to cast before calling itself. And, let's also rename it to ExecShutdownNode and move it to execProcnode.c. Having a "shutdown phase" that stops a node from asynchronously consuming additional resources could be useful for non-parallel node types - especially ForeignScan and CustomScan. And we could eventually extend this to be called in other situations, like when a Limit is filled give everything beneath it a chance to ease up. We don't have to do those bits of work right now but it seems well worth making this look like a generic facility. - Calling DestroyParallelSetupAndAccumStats from ExplainNode when we actually reach the Gather node is much too late. We should really be shutting down parallel workers at the end of the ExecutorRun phase, or certainly no later than ExecutorFinish. In fact, you have standard_ExecutorRun calling ExecParallelBufferUsageAccum() but only if queryDesc->totaltime is set. What I think you should do instead is call ExecShutdownNode a few lines earlier, before shutting down the tuple receiver, and do so unconditionally. That way, the workers are always shut down in the ExecutorRun phase, which should eliminate the need for this bit in explain.c. - The changes to postmaster.c and postgres.c consist of only additional #includes. Those can, presumably, be reverted. Other than that, hah hah, it looks pretty cool. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers