Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2017-10-30 Thread Ashutosh Bapat
On Fri, Oct 27, 2017 at 1:36 AM, Tom Lane  wrote:
> David Rowley  writes:
>> On 27 October 2017 at 01:44, Ashutosh Bapat
>>  wrote:
>>> I think Antonin has a point. The join processing may deem some base
>>> relations dummy (See populate_joinrel_with_paths()). So an Append
>>> relation which had multiple child alive at the end of
>>> set_append_rel_size() might ultimately have only one child after
>>> partition-wise join has worked on it.
>
> TBH, that means partition-wise join planning is broken and needs to be
> redesigned.  It's impossible that we're going to make sane planning
> choices if the sizes of relations change after we've already done some
> planning work.

The mark_dummy_rel() call that marks a joining relation dummy was
added by 719012e013f10f72938520c46889c52df40fa329. The joining
relations are planned before the joins they participate in are
planned. Further some of the joins in which it participates might have
been planned before the joining pair, which causes it to be marked
dummy, is processed by populate_joinrel_with_paths(). So we already
have places where the sizes of relation changes during planning.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2017-10-29 Thread Ashutosh Bapat
On Fri, Oct 27, 2017 at 12:49 AM, David Rowley
 wrote:
> On 27 October 2017 at 01:44, Ashutosh Bapat
>  wrote:
>> I think Antonin has a point. The join processing may deem some base
>> relations dummy (See populate_joinrel_with_paths()). So an Append
>> relation which had multiple child alive at the end of
>> set_append_rel_size() might ultimately have only one child after
>> partition-wise join has worked on it.
>
> hmm, I see. partition-wise join is able to remove eliminate partitions
> on the other side of the join that can't be matched to:
>
> # set enable_partition_wise_join=on;
> SET
> # explain select count(*) from ab ab1 inner join ab ab2 on ab1.a=ab2.a
> where ab1.a between 1 and 2 and ab2.a between 1 and 10001;
>QUERY PLAN
> 
>  Aggregate  (cost=0.00..0.01 rows=1 width=8)
>->  Result  (cost=0.00..0.00 rows=0 width=0)
>  One-Time Filter: false
> (3 rows)
>
> # \d+ ab
> Table "public.ab"
>  Column |  Type   | Collation | Nullable | Default | Storage | Stats
> target | Description
> +-+---+--+-+-+--+-
>  a  | integer |   |  | | plain   |  |
>  b  | integer |   |  | | plain   |  |
> Partition key: RANGE (a)
> Partitions: ab_p1 FOR VALUES FROM (1) TO (1),
> ab_p2 FOR VALUES FROM (1) TO (2)
>

IIUC, ab1_p2 and ab2_p1 are marked dummy by constraint exclusion.
Because of partition-wise join when the corresponding partitions are
joined, their inner joins are marked dummy resulting in an empty join.
This isn't the case of a join processing marking one of the joining
relations dummy, that I was referring to. But I think it's relevant
here since partition-wise join might create single child joins this
way.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2017-10-26 Thread Tom Lane
David Rowley  writes:
> On 27 October 2017 at 01:44, Ashutosh Bapat
>  wrote:
>> I think Antonin has a point. The join processing may deem some base
>> relations dummy (See populate_joinrel_with_paths()). So an Append
>> relation which had multiple child alive at the end of
>> set_append_rel_size() might ultimately have only one child after
>> partition-wise join has worked on it.

TBH, that means partition-wise join planning is broken and needs to be
redesigned.  It's impossible that we're going to make sane planning
choices if the sizes of relations change after we've already done some
planning work.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2017-10-26 Thread David Rowley
On 27 October 2017 at 01:44, Ashutosh Bapat
 wrote:
> I think Antonin has a point. The join processing may deem some base
> relations dummy (See populate_joinrel_with_paths()). So an Append
> relation which had multiple child alive at the end of
> set_append_rel_size() might ultimately have only one child after
> partition-wise join has worked on it.

hmm, I see. partition-wise join is able to remove eliminate partitions
on the other side of the join that can't be matched to:

# set enable_partition_wise_join=on;
SET
# explain select count(*) from ab ab1 inner join ab ab2 on ab1.a=ab2.a
where ab1.a between 1 and 2 and ab2.a between 1 and 10001;
   QUERY PLAN

 Aggregate  (cost=0.00..0.01 rows=1 width=8)
   ->  Result  (cost=0.00..0.00 rows=0 width=0)
 One-Time Filter: false
(3 rows)

# \d+ ab
Table "public.ab"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats
target | Description
+-+---+--+-+-+--+-
 a  | integer |   |  | | plain   |  |
 b  | integer |   |  | | plain   |  |
Partition key: RANGE (a)
Partitions: ab_p1 FOR VALUES FROM (1) TO (1),
ab_p2 FOR VALUES FROM (1) TO (2)



-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2017-10-26 Thread Tom Lane
David Rowley  writes:
> It seems like a good idea for the planner not to generate Append or
> MergeAppend paths when the node contains just a single subpath.
> ...
> Perhaps there is also a "Method 3" which I've not thought about which
> would be much cleaner.

Have you considered turning AppendPath with a single child into more
of a special case?  I think this is morally equivalent to your ProxyPath
proposal, but it would take less new boilerplate code to support.
If you taught create_append_path to inherit the child's pathkeys
when there's only one child, and taught createplan.c to skip making the
useless Append node, I think you might be nearly done.

This might seem a bit ugly, but considering that zero-child AppendPaths
are already a special case (and a much bigger one), I don't have much
of a problem with decreeing that one-child is also a special case.

As an example of why this might be better than a separate ProxyPath
node type, consider this call in add_paths_to_append_rel:

/*
 * If we found unparameterized paths for all children, build an unordered,
 * unparameterized Append path for the rel.  (Note: this is correct even
 * if we have zero or one live subpath due to constraint exclusion.)
 */
if (subpaths_valid)
add_path(rel, (Path *) create_append_path(rel, subpaths, NULL, 0,
  partitioned_rels));

That comment could stand a bit of adjustment with this approach, but
the code itself requires no extra work.

Now, you would have to do something about this in create_append_plan:

/*
 * XXX ideally, if there's just one child, we'd not bother to generate an
 * Append node but just return the single child.  At the moment this does
 * not work because the varno of the child scan plan won't match the
 * parent-rel Vars it'll be asked to emit.
 */

but that's going to come out in the wash someplace, no matter what you do.
Leaving it for the last minute is likely to reduce the number of places
you have to teach about this.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2017-10-26 Thread Ashutosh Bapat
On Thu, Oct 26, 2017 at 5:07 PM, David Rowley
 wrote:
> On 26 October 2017 at 23:42, Antonin Houska  wrote:
>> David Rowley  wrote:
>>
>>> Method 1:
>>>
>>> In set_append_rel_size() detect when just a single subpath would be
>>> added to the Append path.
>>
>> I spent some time reviewing the partition-wise join patch during the last CF
>> and have an impression that set_append_rel_size() is called too early to find
>> out how many child paths will eventually exist if Append represents a join of
>> a partitioned table. I think the partition matching takes place at later 
>> stage
>> and that determines the actual number of partitions, and therefore the number
>> of Append subpaths.
>
> I understand that we must wait until set_append_rel_size() is being
> called on the RELOPT_BASEREL since partitions may themselves be
> partitioned tables and we'll only know what's left after we've
> recursively checked all partitions of the baserel.
>
> I've not studied the partition-wise join code yet, but if we've
> eliminated all but one partition in set_append_rel_size() then I
> imagine we'd just need to ensure partition-wise join is not attempted
> since we'd be joining directly to the only partition anyway.
>

I think Antonin has a point. The join processing may deem some base
relations dummy (See populate_joinrel_with_paths()). So an Append
relation which had multiple child alive at the end of
set_append_rel_size() might ultimately have only one child after
partition-wise join has worked on it.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2017-10-26 Thread David Rowley
On 26 October 2017 at 23:42, Antonin Houska  wrote:
> David Rowley  wrote:
>
>> Method 1:
>>
>> In set_append_rel_size() detect when just a single subpath would be
>> added to the Append path.
>
> I spent some time reviewing the partition-wise join patch during the last CF
> and have an impression that set_append_rel_size() is called too early to find
> out how many child paths will eventually exist if Append represents a join of
> a partitioned table. I think the partition matching takes place at later stage
> and that determines the actual number of partitions, and therefore the number
> of Append subpaths.

I understand that we must wait until set_append_rel_size() is being
called on the RELOPT_BASEREL since partitions may themselves be
partitioned tables and we'll only know what's left after we've
recursively checked all partitions of the baserel.

I've not studied the partition-wise join code yet, but if we've
eliminated all but one partition in set_append_rel_size() then I
imagine we'd just need to ensure partition-wise join is not attempted
since we'd be joining directly to the only partition anyway.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2017-10-26 Thread David Rowley
On 26 October 2017 at 23:30, Robert Haas  wrote:
> On Wed, Oct 25, 2017 at 11:59 PM, David Rowley
>  wrote:
>> As of today, because we include this needless [Merge]Append node, we
>> cannot parallelise scans below the Append.
>
> Without disputing your general notion that singe-child Append or
> MergeAppend nodes are a pointless waste, how does a such a needless
> node prevent parallelizing scans beneath it?

hmm, I think I was wrong about that now. I had been looking over the
regression test diffs after having made tenk1 a partitioned table with
a single partition containing all the rows, but it looks like I read
the diff backwards. The planner actually parallelized the Append
version, not the non-Append version, like I had previously thought.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2017-10-26 Thread Antonin Houska
David Rowley  wrote:

> Method 1:
> 
> In set_append_rel_size() detect when just a single subpath would be
> added to the Append path.

I spent some time reviewing the partition-wise join patch during the last CF
and have an impression that set_append_rel_size() is called too early to find
out how many child paths will eventually exist if Append represents a join of
a partitioned table. I think the partition matching takes place at later stage
and that determines the actual number of partitions, and therefore the number
of Append subpaths.

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2017-10-26 Thread Robert Haas
On Wed, Oct 25, 2017 at 11:59 PM, David Rowley
 wrote:
> As of today, because we include this needless [Merge]Append node, we
> cannot parallelise scans below the Append.

Without disputing your general notion that singe-child Append or
MergeAppend nodes are a pointless waste, how does a such a needless
node prevent parallelizing scans beneath it?

> Invent a ProxyPath concept that allows us to add Paths belonging to
> one relation to another relation. The ProxyPaths can have
> translation_vars to translate targetlists to reference the correct
> Vars. These ProxyPaths could exist up as far as createplan, where we
> could perform the translation and build the ProxyPath's subnode
> instead.

This idea sounds like it might have some legs, although I'm not sure
"proxy" is really the right terminology.

Like both you and Ashutosh, I suspect that there might be some other
applications for this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2017-10-26 Thread Ashutosh Bapat
On Thu, Oct 26, 2017 at 3:29 AM, David Rowley
 wrote:
> However, method 2 appears to also require some Var
> translation in Path targetlists which contain this Proxy path, either
> that or some global Var replacement would need to be done during
> setrefs.
>

For inheritance and partitioning, we translate parent expressions to
child expression by applying Var translations. The translated
expressions differ only in the Var nodes (at least for partitioning
this true, and I believe it to be true for inheritance). I have been
toying with an idea of having some kind of a (polymorphic?) Var node
which can represent parent and children. That will greatly reduce the
need to translate the expression trees and will also help in this
implementation. I am wondering, if ProxyPath could be helpful in
avoiding the cost of reparameterize_path_by_child() introduced for
partition-wise join.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers