On Mon, Nov 16, 2015 at 10:10 AM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
> This idea will solve my concern gracefully.
> The new partial_pathlist keeps candidate path nodes to be gathered
> at this level or above. Unlike path nodes already in the pathlist, we
> don't need to strip off a GatherPath later.

Cool, yes.

> Can we expect that no path node in the partial_pathlist contains an
> underlying GatherPath, even when we apply this design to joinrels
> as well?

Yes.  A path that's already been gathered is not partial any more.
However, to create a partial path for a joinrel, we must join a
partial path to a complete path.  The complete path mustn't be one
which internally contains a Gather.  This is where we need another
per-path flag, I think.
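
As a rough sketch of that rule (not code from the patch): the struct and
the contains_gather flag below are hypothetical stand-ins for whatever
per-path flag we end up with, and treating either input as the partial
side is an assumption made only to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a planner path. */
typedef struct
{
    bool is_partial;       /* would live in rel->partial_pathlist */
    bool contains_gather;  /* hypothetical per-path flag: Gather somewhere below */
} JoinInputPath;

/*
 * A join can yield a partial path only if exactly one input is partial
 * and the other (complete) input has no Gather inside it.
 */
bool
can_form_partial_join(const JoinInputPath *a, const JoinInputPath *b)
{
    const JoinInputPath *complete;

    /* Two partial inputs or two complete inputs do not qualify. */
    if (a->is_partial == b->is_partial)
        return false;

    complete = a->is_partial ? b : a;
    return !complete->contains_gather;
}
```

The partial input needs no check of its own, since a path that has
already been gathered would not be partial in the first place.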

> I'd like to agree with this idea. If Append can handle restricted_plans
> concurrently with safe_plans and parallel_plans, we don't need to give
> up parallelism even if some child relation has neither safe nor
> parallel plans.


> One thing we need to pay attention to is that we have to inform the
> Gather node to kick off local sub-plans if the underlying Append node
> has any restricted plans. It also needs to distinguish the case when the
> Gather node cannot launch any background workers, because the first case
> runs only type-C but the second case has to run all the sub-plans in
> local context.

I don't think that Gather needs to know anything about what's under
the Append.  What I think we want is that when we execute the Append:

(1) If we're the leader or not in parallel mode, run restricted plans,
then parallel plans, then safe plans.
(2) If we're a worker, run safe plans, then parallel plans.
(3) Either way, never run a safe plan if the leader or some other
worker has already begun to execute it.
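
To make the ordering concrete, here is a minimal sketch of how the
Append might pick its next child.  None of this is from the patch: the
enum, the struct, and choose_next_child() are hypothetical names, and a
real version would keep this state in shared memory under a lock.  The
function is meant to be called only when the caller needs a new child,
that is, after it has finished whatever it picked before.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of Append children. */
typedef enum { PLAN_RESTRICTED, PLAN_PARALLEL, PLAN_SAFE } ChildKind;

typedef struct
{
    ChildKind kind;
    bool      claimed;   /* some process has begun executing this child */
    bool      finished;  /* this child will produce no more tuples */
} ChildState;

/* Return the index of the first runnable child of the given kind, or -1. */
static int
find_child(const ChildState *c, size_t n, ChildKind kind)
{
    for (size_t i = 0; i < n; i++)
    {
        if (c[i].kind != kind || c[i].finished)
            continue;
        /* Rule (3): never run a safe plan someone else has already begun. */
        if (kind == PLAN_SAFE && c[i].claimed)
            continue;
        return (int) i;
    }
    return -1;
}

/* Pick the next child for the leader (or non-parallel case) or a worker. */
int
choose_next_child(ChildState *c, size_t n, bool is_worker)
{
    /* Rule (1): restricted, then parallel, then safe. */
    static const ChildKind leader_order[] =
        {PLAN_RESTRICTED, PLAN_PARALLEL, PLAN_SAFE};
    /* Rule (2): safe, then parallel; workers never run restricted plans. */
    static const ChildKind worker_order[] = {PLAN_SAFE, PLAN_PARALLEL};

    const ChildKind *order = is_worker ? worker_order : leader_order;
    size_t norder = is_worker ? 2 : 3;

    for (size_t k = 0; k < norder; k++)
    {
        int idx = find_child(c, n, order[k]);

        if (idx >= 0)
        {
            c[idx].claimed = true;
            return idx;
        }
    }
    return -1;  /* nothing left that this process may run */
}
```

Note that Gather appears nowhere in this; the only input is whether the
caller is a worker.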

The reason to have the leader prefer parallel plans to safe plans is
that it is more likely to become a bottleneck than the workers.  Thus
it should prefer to do work which can be split up rather than claiming
a whole plan for itself.  But in the case of restricted plans it has
no choice, since no one else can execute those, and it should do them
first, since they may be the limiting factor in finishing the whole

>> Incidentally, I think it's subtly wrong to think of the parallel_aware
>> flag as telling you whether the plan can absorb multiple workers.
>> That's not really what it's for.  It's to tell you whether the plan is
>> doing *something* parallel aware - that is, whether its Estimate,
>> InitializeDSM, and InitializeWorker callbacks should do anything.  For
>> SeqScan, flipping parallel_aware actually does split the input among
>> all the workers, but for Append it probably just load balances and
>> for other nodes it might be something else again.  The term I'm using
>> to indicate a path/plan that returns only a subset of the results in
>> each worker is "partial".
> Therefore, a NestLoop that takes an underlying ParallelSeqScan and
> IndexScan may not be parallel aware by itself; however, it is exactly
> partial.


> This NestLoop will likely have parallel_degree larger than "1", won't it?

Larger than 0.

> It seems to me that "partial" is a clearer concept for describing how a
> sub-plan will perform.


>> Whether or not a path is partial is, in the
>> design embodied in this patch, indicated both by whether
>> path->parallel_degree > 0 and whether the path is in rel->pathlist or
>> rel->partial_pathlist.
> Should we have an Assert to detect paths with parallel_degree==0 that
> are in rel->partial_pathlist, or with parallel_degree > 1 that do not
> appear in rel->partial_pathlist?

parallel_degree==0 in the partial_pathlist is bad, but
parallel_degree>0 in the regular pathlist is OK, at least if it's a
Gather node.
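
Something along these lines, as a sketch only.  The struct and both
function names are hypothetical, and the second check takes the
strictest reading (a regular path with parallel_degree > 0 must be a
Gather), which is an assumption and may turn out to be too strong.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a planner path. */
typedef struct
{
    int  parallel_degree;
    bool is_gather;
} ListedPath;

/* Every path in the partial list must have parallel_degree > 0. */
bool
partial_pathlist_ok(const ListedPath *paths, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (paths[i].parallel_degree <= 0)
            return false;
    return true;
}

/*
 * Strict reading (an assumption): in the regular list, parallel_degree > 0
 * is accepted only for a Gather.
 */
bool
regular_pathlist_ok_strict(const ListedPath *paths, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (paths[i].parallel_degree > 0 && !paths[i].is_gather)
            return false;
    return true;
}
```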

Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)