Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-07-01 Thread Robert Haas
On Fri, Jul 1, 2016 at 10:52 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Jun 30, 2016 at 5:54 PM, Tom Lane  wrote:
>>> And the point of that is what, exactly?  If the only change is that
>>> "some restrictions get enforced", I am not clear on why we need such
>>> a test mode in cases where the planner is afraid to put a top Gather on
>>> the plan.  In particular, given the coding as you now have it, it seems
>>> like the only case where there's any difference is where we set
>>> glob->parallelModeOK but nonetheless end up with a not-parallel-safe
>>> topmost path (that doesn't have a Gather within it).  It's not clear
>>> to me why having the executor switch into parallel mode makes sense at
>>> all with such a plan.
>
>> Suppose you create a PL/pgsql function that does an UPDATE and mark it
>> PARALLEL RESTRICTED.  You wonder whether you've marked it correctly.
>> You can set force_parallel_mode=on and SELECT myfunc().  The
>> subsequent ERROR tells you that you've mismarked it.
>
> Right, but that statement is still true with the logic I'm imagining.
> I would also argue that the existing text in config.sgml explaining
> what this parameter does corresponds much more nearly to what I'm
> suggesting than to what you say the semantics are.

I just went and reread that description and it looks to me like it
matches what I said.  I guess I don't really understand what exactly
you want to change.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-07-01 Thread Robert Haas
On Fri, Jul 1, 2016 at 11:09 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Jun 30, 2016 at 5:54 PM, Tom Lane  wrote:
>>> Don't have time to re-read this right now, but maybe tomorrow or
>>> Saturday.
>
>> OK, thanks.
>
> There's still the extra-word problem here:
>
> +* If the input rel is marked consider_parallel and there's nothing
> +* that's not parallel-safe in the LIMIT clause, then the final_rel is
> +* can be marked consider_parallel as well.
>
> Other than that, and the quibble over initialization of
> parallelModeNeeded, I'm good with this.

OK, committed.  I think we can argue about parallelModeNeeded as a
separate matter.  That's merely a sideshow as far as this patch is
concerned.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-07-01 Thread Tom Lane
Robert Haas  writes:
> On Thu, Jun 30, 2016 at 5:54 PM, Tom Lane  wrote:
>> Don't have time to re-read this right now, but maybe tomorrow or
>> Saturday.

> OK, thanks.

There's still the extra-word problem here:

+* If the input rel is marked consider_parallel and there's nothing
+* that's not parallel-safe in the LIMIT clause, then the final_rel is
+* can be marked consider_parallel as well.

Other than that, and the quibble over initialization of
parallelModeNeeded, I'm good with this.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-07-01 Thread Tom Lane
Robert Haas  writes:
> On Thu, Jun 30, 2016 at 5:54 PM, Tom Lane  wrote:
>> And the point of that is what, exactly?  If the only change is that
>> "some restrictions get enforced", I am not clear on why we need such
>> a test mode in cases where the planner is afraid to put a top Gather on
>> the plan.  In particular, given the coding as you now have it, it seems
>> like the only case where there's any difference is where we set
>> glob->parallelModeOK but nonetheless end up with a not-parallel-safe
>> topmost path (that doesn't have a Gather within it).  It's not clear
>> to me why having the executor switch into parallel mode makes sense at
>> all with such a plan.

> Suppose you create a PL/pgsql function that does an UPDATE and mark it
> PARALLEL RESTRICTED.  You wonder whether you've marked it correctly.
> You can set force_parallel_mode=on and SELECT myfunc().  The
> subsequent ERROR tells you that you've mismarked it.

Right, but that statement is still true with the logic I'm imagining.
I would also argue that the existing text in config.sgml explaining
what this parameter does corresponds much more nearly to what I'm
suggesting than to what you say the semantics are.

>> What I'm not happy about is that as you've got things constituted,
>> the GetForeignUpperPaths hook is broken so far as injecting parallel paths
>> is concerned, because the consider_parallel flags for the upperrels
>> aren't set yet when it gets called.  If we keep consider_parallel with
>> its present usage, we're going to have to refactor things to fix that.

> I see.  It seems to me, and I may be failing to understand something,
> that the placement of create_upper_paths_hook is substantially better
> than the placement of GetForeignUpperPaths.  If the latter were moved
> to where the former now is, then consider_parallel would be set
> sufficiently early and everything would be fine.

Yeah, I came to more or less the same conclusion last night.  Will see
to it after you commit this.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-30 Thread Robert Haas
On Thu, Jun 30, 2016 at 5:54 PM, Tom Lane  wrote:
>> That's pretty much right, but there are two conceptually separate
>> things.  The first is whether or not we actually attempt to spawn
>> workers, and the second is whether or not we enter parallel mode.
>> Entering parallel mode enables various restrictions that make spawning
>> workers safe, so we cannot spawn workers unless we enter parallel
>> mode.  We can, however, enter parallel mode without spawning any
>> workers, and force_parallel_mode=on does so.  The point of that is
>> that even if the plan doesn't end up being run inside of a worker, it
>> will be run with most of the same restrictions on what it can do that
>> would be in place if a truly parallel plan were chosen.
>
> And the point of that is what, exactly?  If the only change is that
> "some restrictions get enforced", I am not clear on why we need such
> a test mode in cases where the planner is afraid to put a top Gather on
> the plan.  In particular, given the coding as you now have it, it seems
> like the only case where there's any difference is where we set
> glob->parallelModeOK but nonetheless end up with a not-parallel-safe
> topmost path (that doesn't have a Gather within it).  It's not clear
> to me why having the executor switch into parallel mode makes sense at
> all with such a plan.

Suppose you create a PL/pgsql function that does an UPDATE and mark it
PARALLEL RESTRICTED.  You wonder whether you've marked it correctly.
You can set force_parallel_mode=on and SELECT myfunc().  The
subsequent ERROR tells you that you've mismarked it.

>>> * I'm still not real happy with the has_parallel_hazard test added to
>>> apply_projection_to_path, and here is why not: if the target path is
>>> not projection-capable, we'll never get to that test.  We just pass
>>> the problem off to create_projection_path, which looks no further
>>> than rel->consider_parallel.  It seems like we have to add a
>>> has_parallel_hazard test there as well, which is a tad annoying,
>>> not least because all the direct callers of create_projection_path
>>> seem to have made the test already.
>
>> Thanks for noticing that problem; I've fixed it by inserting a test
>> into create_projection_path.  As far as the annoyance factor is
>> concerned, you obviously know that there was a flag there to reduce
>> the cost of that, but you insisted on removing it.  Maybe you should
>> consider putting it back.
>
> No, that flag was on apply_projection_to_path, and I didn't care for it
> because most of the callers didn't appear to want to go to the trouble of
> setting it correctly.  Adding such an argument to create_projection_path
> as well doesn't seem to make that better.

I'm open to suggestions.  As I've noted a few times already, though
maybe less clearly than I should have done, I think it's quite likely
to be a good idea to try to avoid the overhead of running
has_parallel_hazard repeatedly on the same tlists, or for that matter,
running it on tlists at all.  I don't have any evidence that's
expensive enough to cost, just a hunch.  Exactly how to do that best,
I'm not sure.

>> Well, the point of consider_parallel is that there are some things
>> that are specific to the individual path, and there are some that are
>> properties of the RelOptInfo.  It seems highly desirable to check
>> things that fall into the latter category exactly once, rather than
>> once per path.  You seem to be fighting hard against that idea, and
>> I'm pretty baffled as to why.
>
> What I'm not happy about is that as you've got things constituted,
> the GetForeignUpperPaths hook is broken so far as injecting parallel paths
> is concerned, because the consider_parallel flags for the upperrels
> aren't set yet when it gets called.  If we keep consider_parallel with
> its present usage, we're going to have to refactor things to fix that.

I see.  It seems to me, and I may be failing to understand something,
that the placement of create_upper_paths_hook is substantially better
than the placement of GetForeignUpperPaths.  If the latter were moved
to where the former now is, then consider_parallel would be set
sufficiently early and everything would be fine.  We could
alternatively fish out the code to set consider_parallel for the upper
rels and do all of that work before calling that hook.  That's a bit
hairy, because we'd basically go set all of the consider_parallel
flags, then call that hook, then circle back around for the core path
generation, but I don't see any intrinsic obstacle to that line of
attack.  I'm not very sure that one call for all upper rels is going
to be convenient, though.

>> Updated patch attached.   (Official status update: I'll commit this,
>> possibly over the long weekend or in any event no later than Tuesday,
>> unless there are further review comments before then, in which case I
>> will post a further official status update no later than Tuesday.)
>
> Don't have time to re-read this 

Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-30 Thread Robert Haas
On Thu, Jun 30, 2016 at 1:13 PM, Tom Lane  wrote:
> BTW, I just had another thought about reducing the cost of
> has_parallel_hazard checks, to wit: you already made one pass over the
> entire query to verify that there's no PARALLEL UNSAFE functions anywhere.
> If that pass were to also track whether there are any PARALLEL RESTRICTED
> functions anywhere, then in very many common cases, subsequent tests on
> portions of the query would not have to do anything, because we'd already
> know there was nothing to worry about.

Yeah, that's true.  I'm not sure how much those has_parallel_hazard()
checks are costing us.  The ones on quals seem like they are probably
pretty cheap, because if you've got enough quals for the cycles we
spend checking them to matter, it's the proliferation of paths and
RelOptInfos that is going to kill you, not the cost of the
has_parallel_hazard() checks.  I think, anyway.  The ones on target
lists, which I didn't foresee, seem more troubling, because you could
easily be selecting a large number of columns and so we end up with
lots of RelOptInfos that all have long target lists and we've got to
keep checking those target lists over and over again.  I'd like to
find a way to do better there.  I still think that it might be better
to optimize the tlist-is-all-vars case instead of doing what you
propose here, but I'm not sure, and your intuition may well be better
than mine.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-30 Thread Tom Lane
Robert Haas  writes:
> On Thu, Jun 30, 2016 at 12:04 PM, Tom Lane  wrote:
>> * It seems like the initialization of root->parallelModeNeeded is still
>> overcomplicated and/or underexplained.

> That's pretty much right, but there are two conceptually separate
> things.  The first is whether or not we actually attempt to spawn
> workers, and the second is whether or not we enter parallel mode.
> Entering parallel mode enables various restrictions that make spawning
> workers safe, so we cannot spawn workers unless we enter parallel
> mode.  We can, however, enter parallel mode without spawning any
> workers, and force_parallel_mode=on does so.  The point of that is
> that even if the plan doesn't end up being run inside of a worker, it
> will be run with most of the same restrictions on what it can do that
> would be in place if a truly parallel plan were chosen.

And the point of that is what, exactly?  If the only change is that
"some restrictions get enforced", I am not clear on why we need such
a test mode in cases where the planner is afraid to put a top Gather on
the plan.  In particular, given the coding as you now have it, it seems
like the only case where there's any difference is where we set
glob->parallelModeOK but nonetheless end up with a not-parallel-safe
topmost path (that doesn't have a Gather within it).  It's not clear
to me why having the executor switch into parallel mode makes sense at
all with such a plan.

>> * I'm still not real happy with the has_parallel_hazard test added to
>> apply_projection_to_path, and here is why not: if the target path is
>> not projection-capable, we'll never get to that test.  We just pass
>> the problem off to create_projection_path, which looks no further
>> than rel->consider_parallel.  It seems like we have to add a
>> has_parallel_hazard test there as well, which is a tad annoying,
>> not least because all the direct callers of create_projection_path
>> seem to have made the test already.

> Thanks for noticing that problem; I've fixed it by inserting a test
> into create_projection_path.  As far as the annoyance factor is
> concerned, you obviously know that there was a flag there to reduce
> the cost of that, but you insisted on removing it.  Maybe you should
> consider putting it back.

No, that flag was on apply_projection_to_path, and I didn't care for it
because most of the callers didn't appear to want to go to the trouble of
setting it correctly.  Adding such an argument to create_projection_path
as well doesn't seem to make that better.

> Well, the point of consider_parallel is that there are some things
> that are specific to the individual path, and there are some that are
> properties of the RelOptInfo.  It seems highly desirable to check
> things that fall into the latter category exactly once, rather than
> once per path.  You seem to be fighting hard against that idea, and
> I'm pretty baffled as to why.

What I'm not happy about is that as you've got things constituted,
the GetForeignUpperPaths hook is broken so far as injecting parallel paths
is concerned, because the consider_parallel flags for the upperrels
aren't set yet when it gets called.  If we keep consider_parallel with
its present usage, we're going to have to refactor things to fix that.

> Updated patch attached.   (Official status update: I'll commit this,
> possibly over the long weekend or in any event no later than Tuesday,
> unless there are further review comments before then, in which case I
> will post a further official status update no later than Tuesday.)

Don't have time to re-read this right now, but maybe tomorrow or
Saturday.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-30 Thread Robert Haas
On Thu, Jun 30, 2016 at 12:04 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Here is a new patch addressing (I hope) the above comments and a few
>> other things.
>
> Some more comments:
>
> * It seems like the initialization of root->parallelModeNeeded is still
> overcomplicated and/or underexplained.  Why do you not merely set it false
> to start with, and allow it to become true when/if a Gather is added to
> the plan?  AFAICS its only point is to tell the executor that there's a
> Gather someplace.

That's pretty much right, but there are two conceptually separate
things.  The first is whether or not we actually attempt to spawn
workers, and the second is whether or not we enter parallel mode.
Entering parallel mode enables various restrictions that make spawning
workers safe, so we cannot spawn workers unless we enter parallel
mode.  We can, however, enter parallel mode without spawning any
workers, and force_parallel_mode=on does so.  The point of that is
that even if the plan doesn't end up being run inside of a worker, it
will be run with most of the same restrictions on what it can do that
would be in place if a truly parallel plan were chosen.  So we
basically do exactly what you are proposing here when
force_parallel_mode=off, but when it's turned on then we, uh, force
parallel mode.

Actually, the original remit of force_parallel_mode was precisely
that: force parallel mode.  The behavior of also forcing the plan to
run inside of a single-copy Gather mode is an additional behavior that
was added later on in the development process, but having two separate
GUCs felt like overkill.  I believe I discussed this on-list with Noah
at the time.  It's a separate issue from what this patch is about, at
any rate.

> * typo in new comment in grouping_planner: "final_rel is can be marked"
>
> +* If the input relation is not parallel-safe, then the window 
> relation
> +* can't be parallel-safe, either.  Otherwise, we need to examine the
> +* target list and windows for non-parallel-safe constructs.
> +* (XXX: Do we also need to check wflists?)
>
> No, we don't, because the wflists just contain windowfuncs extracted from
> the target list.  But I'd say "target list and active windows for..." for
> more clarity (the point being it's okay to ignore any unused window
> clauses).

Thanks, fixed.

> * root->parallelModeOK is probably not being consulted in as many places
> as it would be useful; could we use it to short-circuit any more
> has_parallel_hazard tests?

I think it short-circuits essentially all of them, because
set_rel_consider_parallel() is not called if it's false.  If the
baserel's consider_parallel flag is false, that will force every other
flag to be false as you move up the tree.  We don't need to recheck it
in join relations or upper relations because those have input
relations which should never be consider_parallel unless
root->parallelModeOK is set.  If there's ever a case where any
consider_parallel flag is true when root->parallelModeOK is false,
that's a bug.

> * I'm still not real happy with the has_parallel_hazard test added to
> apply_projection_to_path, and here is why not: if the target path is
> not projection-capable, we'll never get to that test.  We just pass
> the problem off to create_projection_path, which looks no further
> than rel->consider_parallel.  It seems like we have to add a
> has_parallel_hazard test there as well, which is a tad annoying,
> not least because all the direct callers of create_projection_path
> seem to have made the test already.

Thanks for noticing that problem; I've fixed it by inserting a test
into create_projection_path.  As far as the annoyance factor is
concerned, you obviously know that there was a flag there to reduce
the cost of that, but you insisted on removing it.  Maybe you should
consider putting it back.

> This gets back to the question of whether rel->consider_parallel is even
> useful at the level of upperrels.  As far as I can tell, that flag is
> really little more than a crutch to let scan/join paths be marked parallel
> safe or not with minimum ado.  I would rather like to get to a point where
> consider_parallel is not used for upperrels, because that avoids the
> problem that you're not setting it early enough to be useful for
> GetForeignUpperPaths().  (After that I'd like to get to a point where we
> don't have it at all, but that may not happen for 9.6.)

Well, the point of consider_parallel is that there are some things
that are specific to the individual path, and there are some that are
properties of the RelOptInfo.  It seems highly desirable to check
things that fall into the latter category exactly once, rather than
once per path.  You seem to be fighting hard against that idea, and
I'm pretty baffled as to why.  That flag avoids a significant amount
of unnecessary recomputation in baserels, in joinrels, and in some
though not all upperrels, 

Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-30 Thread Tom Lane
BTW, I just had another thought about reducing the cost of
has_parallel_hazard checks, to wit: you already made one pass over the
entire query to verify that there's no PARALLEL UNSAFE functions anywhere.
If that pass were to also track whether there are any PARALLEL RESTRICTED
functions anywhere, then in very many common cases, subsequent tests on
portions of the query would not have to do anything, because we'd already
know there was nothing to worry about.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-30 Thread Tom Lane
Robert Haas  writes:
> Here is a new patch addressing (I hope) the above comments and a few
> other things.

Some more comments:

* It seems like the initialization of root->parallelModeNeeded is still
overcomplicated and/or underexplained.  Why do you not merely set it false
to start with, and allow it to become true when/if a Gather is added to
the plan?  AFAICS its only point is to tell the executor that there's a
Gather someplace.

* typo in new comment in grouping_planner: "final_rel is can be marked"

+* If the input relation is not parallel-safe, then the window relation
+* can't be parallel-safe, either.  Otherwise, we need to examine the
+* target list and windows for non-parallel-safe constructs.
+* (XXX: Do we also need to check wflists?)

No, we don't, because the wflists just contain windowfuncs extracted from
the target list.  But I'd say "target list and active windows for..." for
more clarity (the point being it's okay to ignore any unused window
clauses).

* root->parallelModeOK is probably not being consulted in as many places
as it would be useful; could we use it to short-circuit any more
has_parallel_hazard tests?

* I'm still not real happy with the has_parallel_hazard test added to
apply_projection_to_path, and here is why not: if the target path is
not projection-capable, we'll never get to that test.  We just pass
the problem off to create_projection_path, which looks no further
than rel->consider_parallel.  It seems like we have to add a
has_parallel_hazard test there as well, which is a tad annoying,
not least because all the direct callers of create_projection_path
seem to have made the test already.

This gets back to the question of whether rel->consider_parallel is even
useful at the level of upperrels.  As far as I can tell, that flag is
really little more than a crutch to let scan/join paths be marked parallel
safe or not with minimum ado.  I would rather like to get to a point where
consider_parallel is not used for upperrels, because that avoids the
problem that you're not setting it early enough to be useful for
GetForeignUpperPaths().  (After that I'd like to get to a point where we
don't have it at all, but that may not happen for 9.6.)

> Unfortunately, I can't think of a clean way to make the "right" thing
> happen here.  In general, it's right to prefer a parallel-safe plan
> over an ever-so-slightly more expensive plan that isn't, because that
> might let us join it to a partial path later on.  However, if we're at
> the very top of the plan tree, that argument holds no force, so we
> could try to install some kind of exception for that case.  That seems
> like it would be fairly invasive and fairly ugly, though.  It seems
> like a better option would be to try to kludge the test case somehow
> so that the backward index scan looks at least 1% cheaper than the
> forward index scan, but I can't see how to do that.  I don't see, for
> example, that changing any of the usual costing factors would help
> much.  We could do something very specialized here, like adding a GUC
> that makes MinMaxAggPath always win, but that seems ugly, too.

If we made add_path not consider parallel safety as a preference item
when glob->parallelModeOK is false, then we could set
max_parallel_workers_per_gather to zero for this test case and the
problem would go away.  In general, that seems like it would be a good
thing for end users too: they could avoid plan changes that are made
on the basis of completely irrelevant parallel-safety.  It might be
that we don't actually need an explicit check for that in add_path,
as long as we make sure that no path ever gets marked parallel_safe
when parallelModeOK is false.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-29 Thread Robert Haas
On Mon, Jun 27, 2016 at 4:49 PM, Tom Lane  wrote:
> A few specific comments:
>
> * Can't we remove wholePlanParallelSafe altogether, in favor of just
> examining best_path->parallel_safe in standard_planner?
>
> * In grouping_planner, I'm not exactly convinced that
> final_rel->consider_parallel can just be copied up without any further
> restrictions.  Easiest counterexample is a parallel restricted function in
> LIMIT.
>
> * Why have the try_parallel_path flag at all in create_grouping_paths,
> rather than just setting or not setting grouped_rel->consider_parallel?
> If your thinking is that this is dealing with implementation restrictions
> that might not apply to paths added by an extension, then OK, but the
> variable needs to have a less generic name.  Maybe try_parallel_aggregation.
>
> * Copy/paste error in comment in create_distinct_paths: s/window/distinct/
>
> * Not following what you did to apply_projection_to_path, and the comment
> therein isn't helping.

Here is a new patch addressing (I hope) the above comments and a few
other things.  Unfortunately, it causes the by-now-familiar regression
test failure in the aggregates test:

*** /Users/rhaas/pgsql/src/test/regress/expected/aggregates.out
2016-06-29 20:04:03.0 -0400
--- /Users/rhaas/pgsql/src/test/regress/results/aggregates.out
2016-06-29 20:06:28.0 -0400
***
*** 577,590 

  explain (costs off)
select max(unique1) from tenk1 where unique1 > 42000;
! QUERY PLAN
! ---
!  Result
!InitPlan 1 (returns $0)
!  ->  Limit
!->  Index Only Scan Backward using tenk1_unique1 on tenk1
!  Index Cond: ((unique1 IS NOT NULL) AND (unique1 > 42000))
! (5 rows)

  select max(unique1) from tenk1 where unique1 > 42000;
   max
--- 577,588 

  explain (costs off)
select max(unique1) from tenk1 where unique1 > 42000;
!  QUERY PLAN
! 
!  Aggregate
!->  Index Only Scan using tenk1_unique1 on tenk1
!  Index Cond: (unique1 > 42000)
! (3 rows)

  select max(unique1) from tenk1 where unique1 > 42000;
   max

==

The problem, of course, is that scanning tenk1_unique1 forward to find
0 rows at the end of the index and scanning it backward to find 0 rows
at the end of the index cost almost exactly the same amount, which is
probably a good thing when you stop to think about it.  On my system,
the expected plan costs 4.31 and the plan that actually shows up costs
4.32, so we end up picking the latter because, since it does not
involve subqueries, it's parallel-safe.

It's tempting to remove the test on the theory that there's actually
little point in verifying which of two essentially-equivalent plans
get chosen, but I think that's probably missing the point: the next
test runs the same query without EXPLAIN, presumably to verify that
the MinMaxAggPath approach gets the right *answer* when there are no
matching rows, and in order for that test to mean anything, we need to
first verify that we're actually getting the plan whose execution-time
behavior we want to check.

Unfortunately, I can't think of a clean way to make the "right" thing
happen here.  In general, it's right to prefer a parallel-safe plan
over an ever-so-slightly more expensive plan that isn't, because that
might let us join it to a partial path later on.  However, if we're at
the very top of the plan tree, that argument holds no force, so we
could try to install some kind of exception for that case.  That seems
like it would be fairly invasive and fairly ugly, though.  It seems
like a better option would be to try to kludge the test case somehow
so that the backward index scan looks at least 1% cheaper than the
forward index scan, but I can't see how to do that.  I don't see, for
example, that changing any of the usual costing factors would help
much.  We could do something very specialized here, like adding a GUC
that makes MinMaxAggPath always win, but that seems ugly, too.

Suggestions?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


upperrel-consider-parallel-v2.patch
Description: invalid/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-29 Thread Robert Haas
On Wed, Jun 29, 2016 at 1:26 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Jun 27, 2016 at 6:04 PM, Tom Lane  wrote:
>>> Huh?  The final tlist would go with the final_rel, ISTM, not the scan
>>> relation.  Maybe we have some rejiggering to do to make that true, though.
>
>> Mumble.  You're right that there are two rels involved, but I think
>> I'm still right about the substance of the problem.  I can't tell
>> whether the remainder of your email concedes that point or whether
>> we're still in disagreement.
>
> Well, I was trying to find a way that we could rely on the rel's
> consider_parallel marking rather than having to test the pathtarget as
> such, but I concluded that we couldn't do that.  Sorry if thinking
> out loud confused you.

OK, no problem.  I was arguing from the beginning that we couldn't
make that work, so it sounds like we are now in agreement.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-29 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jun 27, 2016 at 6:04 PM, Tom Lane  wrote:
>> Huh?  The final tlist would go with the final_rel, ISTM, not the scan
>> relation.  Maybe we have some rejiggering to do to make that true, though.

> Mumble.  You're right that there are two rels involved, but I think
> I'm still right about the substance of the problem.  I can't tell
> whether the remainder of your email concedes that point or whether
> we're still in disagreement.

Well, I was trying to find a way that we could rely on the rel's
consider_parallel marking rather than having to test the pathtarget as
such, but I concluded that we couldn't do that.  Sorry if thinking
out loud confused you.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-29 Thread Robert Haas
On Mon, Jun 27, 2016 at 6:04 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Jun 27, 2016 at 5:28 PM, Tom Lane  wrote:
>>> Seems to me that it should generally be the case that consider_parallel
>>> would already be clear on the parent rel if the tlist isn't parallel safe,
>>> and if it isn't we probably have a bug elsewhere.  If it makes you feel
>>> better, maybe you could add Assert(!has_parallel_hazard(...)) here?
>
>> I don't see that this is true.  If someone does SELECT
>> pg_backend_pid() FROM pgbench_accounts, there's only one RelOptInfo
>> and nothing to clear consider_parallel for it anywhere else.
>
> Huh?  The final tlist would go with the final_rel, ISTM, not the scan
> relation.  Maybe we have some rejiggering to do to make that true, though.

Mumble.  You're right that there are two rels involved, but I think
I'm still right about the substance of the problem.  I can't tell
whether the remainder of your email concedes that point or whether
we're still in disagreement.

In that example, SELECT pg_backend_pid() FROM pgbench_accounts, we're
first going to form a path for scanning pgbench_accounts.  The rel for
pgbench_accounts will be marked parallel_safe because it's just a scan
of a relation outputting some number (possibly 0) of Vars.  That rel
becomes the final scan/join rel, and the path or paths for that rel
are parallel-safe.  Now, when we apply the final tlist to those paths,
they are no longer parallel-safe.  apply_projection_to_path() has got
to realize that.

> You could still save something by writing code along the line of
> if (path->parallel_safe &&
> has_parallel_hazard(...))
> path->parallel_safe = false;
> so as not to run has_parallel_hazard in the case where we already know
> we lost.

I agree, and that does seem worth doing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-27 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jun 27, 2016 at 5:28 PM, Tom Lane  wrote:
>> Seems to me that it should generally be the case that consider_parallel
>> would already be clear on the parent rel if the tlist isn't parallel safe,
>> and if it isn't we probably have a bug elsewhere.  If it makes you feel
>> better, maybe you could add Assert(!has_parallel_hazard(...)) here?

> I don't see that this is true.  If someone does SELECT
> pg_backend_pid() FROM pgbench_accounts, there's only one RelOptInfo
> and nothing to clear consider_parallel for it anywhere else.

Huh?  The final tlist would go with the final_rel, ISTM, not the scan
relation.  Maybe we have some rejiggering to do to make that true, though.

> I've really been wondering whether there ought to be a separate
> UPPERREL_FOO whose only purpose is to handle applying scanjoin_target
> to the topmost scan/join rel.

That's another way of thinking about it, I guess.

> Effectively, the current code is trying
> to do that transformation in place.  We start with a scan/join rel
> that emits whatever it naturally emits and then frob all of the paths
> and partial paths to emit the scanjoin_target.  But this seems quite
> cumbersome.

It'd be no less cumbersome, because you'd end up with all those same paths
surviving into the FOO upperrel.  But it might be logically cleaner.


(Thinks for a bit...)  Actually, scratch that.  It was very intentional on
my part that different Paths for the same relation might produce different
tlists.  Without that, there's no way that we're going to get a solution
that allows extracting index expression values from index-only scans in
nontrivial plans.  Otherwise I wouldn't have introduced PathTarget to
begin with, because we already had a perfectly good way of representing a
one-size-fits-all result tlist for each RelOptInfo.

So it seems like we should not introduce a dependency here that assumes
that all Paths for a given rel have equivalent parallel_safe settings.
Maybe it'd be okay for the special case of index expressions, because
they are almost certainly going to be parallel safe, but in general it's
a restriction we don't want.

You could still save something by writing code along the line of
if (path->parallel_safe &&
has_parallel_hazard(...))
path->parallel_safe = false;
so as not to run has_parallel_hazard in the case where we already know
we lost.

Doing more than this would probably involve caching parallel-safety
status in PathTarget itself, which is certainly doable but we should
measure first to see if it's worth the trouble.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-27 Thread Robert Haas
On Mon, Jun 27, 2016 at 5:28 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Jun 27, 2016 at 4:49 PM, Tom Lane  wrote:
>>> * Not following what you did to apply_projection_to_path, and the comment
>>> therein isn't helping.
>
>> Gee, I wonder why not?  :-)
>
>> The basic problem here is that applying a projection to a path can
>> render a formerly parallel-safe path no longer parallel-safe.  If we
>> jam a parallel-restricted target list into a formerly parallel-safe
>> path, we'd better also clear path->parallel_safe.  Currently,
>> apply_projection_to_path only needs to call has_parallel_hazard for an
>> input which is a GatherPath, which isn't too expensive because most
>> paths are not GatherPaths and if we get one that is, well, we can hope
>> parallel query will win enough during execution to make up for the
>> extra planning cost.  But if we want the consider_parallel and
>> parallel_safe flags to be set correctly for all upper rels and paths,
>> it seems that we need to do it always - hence the dismayed comment.
>> Thoughts?
>
> Seems to me that it should generally be the case that consider_parallel
> would already be clear on the parent rel if the tlist isn't parallel safe,
> and if it isn't we probably have a bug elsewhere.  If it makes you feel
> better, maybe you could add Assert(!has_parallel_hazard(...)) here?

I don't see that this is true.  If someone does SELECT
pg_backend_pid() FROM pgbench_accounts, there's only one RelOptInfo
and nothing to clear consider_parallel for it anywhere else.

I've really been wondering whether there ought to be a separate
UPPERREL_FOO whose only purpose is to handle applying scanjoin_target
to the topmost scan/join rel.  Effectively, the current code is trying
to do that transformation in place.  We start with a scan/join rel
that emits whatever it naturally emits and then frob all of the paths
and partial paths to emit the scanjoin_target.  But this seems quite
cumbersome.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-27 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jun 27, 2016 at 4:49 PM, Tom Lane  wrote:
>> * Not following what you did to apply_projection_to_path, and the comment
>> therein isn't helping.

> Gee, I wonder why not?  :-)

> The basic problem here is that applying a projection to a path can
> render a formerly parallel-safe path no longer parallel-safe.  If we
> jam a parallel-restricted target list into a formerly parallel-safe
> path, we'd better also clear path->parallel_safe.  Currently,
> apply_projection_to_path only needs to call has_parallel_hazard for an
> input which is a GatherPath, which isn't too expensive because most
> paths are not GatherPaths and if we get one that is, well, we can hope
> parallel query will win enough during execution to make up for the
> extra planning cost.  But if we want the consider_parallel and
> parallel_safe flags to be set correctly for all upper rels and paths,
> it seems that we need to do it always - hence the dismayed comment.
> Thoughts?

Seems to me that it should generally be the case that consider_parallel
would already be clear on the parent rel if the tlist isn't parallel safe,
and if it isn't we probably have a bug elsewhere.  If it makes you feel
better, maybe you could add Assert(!has_parallel_hazard(...)) here?

I guess this means that apply_projection_to_path would need to clear
parallel_safe if rel->consider_parallel isn't true.  This would correspond
to situations where we are taking a parallel-safe path for a lower-level
relation that has consider_parallel true, and repurposing it for a new
upperrel that has consider_parallel false.  Maybe it'd be better to
not do that but just force use of a separate ProjectionPath if the
parallel_safe flag needs to change.

(I think 8b9d323cb may have made this a little less messy than it
was when you did your draft patch, btw.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-27 Thread Robert Haas
On Mon, Jun 27, 2016 at 4:49 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Mon, Jun 27, 2016 at 3:35 PM, Tom Lane  wrote:
>>> Oh, I thought you were still actively working on it.  What patch do
>>> you want me to review?
>
>> I'm looking for an opinion on the WIP patch attached to:
>> https://www.postgresql.org/message-id/ca+tgmozwjb9eaibj-meeaq913wkkoz4aq7vqncfrdls9wyh...@mail.gmail.com
>
> OK, I took a quick look.  Some of the details are obsolete due to my
> over-the-weekend hacking on parallel aggregation, but I think generally
> you have the right idea that we should set upperrel consider_parallel
> flags on the basis of "input rel has consider_parallel = true AND there
> are no parallel hazards in operations to be performed at this level".

OK, great.  Thanks.

> A few specific comments:
>
> * Can't we remove wholePlanParallelSafe altogether, in favor of just
> examining best_path->parallel_safe in standard_planner?

Yes, I think we can.  Before the upper planner used paths, we needed a
way to get wholePlanParallelSafe = false even if every generated path
was parallel-safe, but now that all planning stages have paths, we can
get rid of that.

> * In grouping_planner, I'm not exactly convinced that
> final_rel->consider_parallel can just be copied up without any further
> restrictions.  Easiest counterexample is a parallel restricted function in
> LIMIT.

OK, will look.

> * Why have the try_parallel_path flag at all in create_grouping_paths,
> rather than just setting or not setting grouped_rel->consider_parallel?
> If your thinking is that this is dealing with implementation restrictions
> that might not apply to paths added by an extension, then OK, but the
> variable needs to have a less generic name.  Maybe try_parallel_aggregation.

Suppose all of the relevant functions are parallel-safe, but the
aggregates don't have combine functions.  In that case,
consider_parallel ought to be true, because parallel strategies are OK
as a general matter, but the one and only parallel strategy we have
today - two-phase aggregation - will not work.

> * Copy/paste error in comment in create_distinct_paths: s/window/distinct/

OK, will fix.

> * Not following what you did to apply_projection_to_path, and the comment
> therein isn't helping.

Gee, I wonder why not?  :-)

The basic problem here is that applying a projection to a path can
render a formerly parallel-safe path no longer parallel-safe.  If we
jam a parallel-restricted target list into a formerly parallel-safe
path, we'd better also clear path->parallel_safe.  Currently,
apply_projection_to_path only needs to call has_parallel_hazard for an
input which is a GatherPath, which isn't too expensive because most
paths are not GatherPaths and if we get one that is, well, we can hope
parallel query will win enough during execution to make up for the
extra planning cost.  But if we want the consider_parallel and
parallel_safe flags to be set correctly for all upper rels and paths,
it seems that we need to do it always - hence the dismayed comment.
Thoughts?

>> It may not be correct in detail, but I'd like to know whether you
>> think it's going in the generally correct direction and what major
>> concerns you might have before spending more time on it.  Also, I'd
>> like to know whether you think it's something we should try to put
>> into 9.6 or whether you think we should leave it for next cycle.
>
> I think we should try to get this right in 9.6.  For one thing, the
> more stuff we can easily exercise in parallel mode, the more likely
> we are to find bugs before they reach the field.

OK.

(Official status update: I will post an updated version of this patch
by Thursday.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-27 Thread Tom Lane
Robert Haas  writes:
> On Mon, Jun 27, 2016 at 3:35 PM, Tom Lane  wrote:
>> Oh, I thought you were still actively working on it.  What patch do
>> you want me to review?

> I'm looking for an opinion on the WIP patch attached to:
> https://www.postgresql.org/message-id/ca+tgmozwjb9eaibj-meeaq913wkkoz4aq7vqncfrdls9wyh...@mail.gmail.com

OK, I took a quick look.  Some of the details are obsolete due to my
over-the-weekend hacking on parallel aggregation, but I think generally
you have the right idea that we should set upperrel consider_parallel
flags on the basis of "input rel has consider_parallel = true AND there
are no parallel hazards in operations to be performed at this level".
A few specific comments:

* Can't we remove wholePlanParallelSafe altogether, in favor of just
examining best_path->parallel_safe in standard_planner?

* In grouping_planner, I'm not exactly convinced that
final_rel->consider_parallel can just be copied up without any further
restrictions.  Easiest counterexample is a parallel restricted function in
LIMIT.

* Why have the try_parallel_path flag at all in create_grouping_paths,
rather than just setting or not setting grouped_rel->consider_parallel?
If your thinking is that this is dealing with implementation restrictions
that might not apply to paths added by an extension, then OK, but the
variable needs to have a less generic name.  Maybe try_parallel_aggregation.

* Copy/paste error in comment in create_distinct_paths: s/window/distinct/

* Not following what you did to apply_projection_to_path, and the comment
therein isn't helping.

> It may not be correct in detail, but I'd like to know whether you
> think it's going in the generally correct direction and what major
> concerns you might have before spending more time on it.  Also, I'd
> like to know whether you think it's something we should try to put
> into 9.6 or whether you think we should leave it for next cycle.

I think we should try to get this right in 9.6.  For one thing, the
more stuff we can easily exercise in parallel mode, the more likely
we are to find bugs before they reach the field.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-27 Thread Robert Haas
On Mon, Jun 27, 2016 at 3:35 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I'm not sure how to proceed here.  I have asked Tom several times to
>> look at the WIP patch and offer comments, but he so far has not done
>> so.
>
> Oh, I thought you were still actively working on it.  What patch do
> you want me to review?

I'm looking for an opinion on the WIP patch attached to:

https://www.postgresql.org/message-id/ca+tgmozwjb9eaibj-meeaq913wkkoz4aq7vqncfrdls9wyh...@mail.gmail.com

It may not be correct in detail, but I'd like to know whether you
think it's going in the generally correct direction and what major
concerns you might have before spending more time on it.  Also, I'd
like to know whether you think it's something we should try to put
into 9.6 or whether you think we should leave it for next cycle.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fixing consider_parallel for upper planner rels

2016-06-27 Thread Tom Lane
Robert Haas  writes:
> I'm not sure how to proceed here.  I have asked Tom several times to
> look at the WIP patch and offer comments, but he so far has not done
> so.

Oh, I thought you were still actively working on it.  What patch do
you want me to review?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers