Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
On Fri, Oct 27, 2017 at 1:36 AM, Tom Lanewrote: > David Rowley writes: >> On 27 October 2017 at 01:44, Ashutosh Bapat >> wrote: >>> I think Antonin has a point. The join processing may deem some base >>> relations dummy (See populate_joinrel_with_paths()). So an Append >>> relation which had multiple child alive at the end of >>> set_append_rel_size() might ultimately have only one child after >>> partition-wise join has worked on it. > > TBH, that means partition-wise join planning is broken and needs to be > redesigned. It's impossible that we're going to make sane planning > choices if the sizes of relations change after we've already done some > planning work. The mark_dummy_rel() call that marks a joining relation dummy was added by 719012e013f10f72938520c46889c52df40fa329. The joining relations are planned before the joins they participate in are planned. Further some of the joins in which it participates might have been planned before the joining pair, which causes it to be marked dummy, is processed by populate_joinrel_with_paths(). So we already have places where the sizes of relation changes during planning. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
On Fri, Oct 27, 2017 at 12:49 AM, David Rowleywrote: > On 27 October 2017 at 01:44, Ashutosh Bapat > wrote: >> I think Antonin has a point. The join processing may deem some base >> relations dummy (See populate_joinrel_with_paths()). So an Append >> relation which had multiple child alive at the end of >> set_append_rel_size() might ultimately have only one child after >> partition-wise join has worked on it. > > hmm, I see. partition-wise join is able to remove eliminate partitions > on the other side of the join that can't be matched to: > > # set enable_partition_wise_join=on; > SET > # explain select count(*) from ab ab1 inner join ab ab2 on ab1.a=ab2.a > where ab1.a between 1 and 2 and ab2.a between 1 and 10001; >QUERY PLAN > > Aggregate (cost=0.00..0.01 rows=1 width=8) >-> Result (cost=0.00..0.00 rows=0 width=0) > One-Time Filter: false > (3 rows) > > # \d+ ab > Table "public.ab" > Column | Type | Collation | Nullable | Default | Storage | Stats > target | Description > +-+---+--+-+-+--+- > a | integer | | | | plain | | > b | integer | | | | plain | | > Partition key: RANGE (a) > Partitions: ab_p1 FOR VALUES FROM (1) TO (1), > ab_p2 FOR VALUES FROM (1) TO (2) > IIUC, ab1_p2 and ab2_p1 are marked dummy by constraint exclusion. Because of partition-wise join when the corresponding partitions are joined, their inner joins are marked dummy resulting in an empty join. This isn't the case of a join processing marking one of the joining relations dummy, that I was referring to. But I think it's relevant here since partition-wise join might create single child joins this way. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
David Rowleywrites: > On 27 October 2017 at 01:44, Ashutosh Bapat > wrote: >> I think Antonin has a point. The join processing may deem some base >> relations dummy (See populate_joinrel_with_paths()). So an Append >> relation which had multiple child alive at the end of >> set_append_rel_size() might ultimately have only one child after >> partition-wise join has worked on it. TBH, that means partition-wise join planning is broken and needs to be redesigned. It's impossible that we're going to make sane planning choices if the sizes of relations change after we've already done some planning work. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
On 27 October 2017 at 01:44, Ashutosh Bapatwrote: > I think Antonin has a point. The join processing may deem some base > relations dummy (See populate_joinrel_with_paths()). So an Append > relation which had multiple child alive at the end of > set_append_rel_size() might ultimately have only one child after > partition-wise join has worked on it. hmm, I see. partition-wise join is able to remove eliminate partitions on the other side of the join that can't be matched to: # set enable_partition_wise_join=on; SET # explain select count(*) from ab ab1 inner join ab ab2 on ab1.a=ab2.a where ab1.a between 1 and 2 and ab2.a between 1 and 10001; QUERY PLAN Aggregate (cost=0.00..0.01 rows=1 width=8) -> Result (cost=0.00..0.00 rows=0 width=0) One-Time Filter: false (3 rows) # \d+ ab Table "public.ab" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description +-+---+--+-+-+--+- a | integer | | | | plain | | b | integer | | | | plain | | Partition key: RANGE (a) Partitions: ab_p1 FOR VALUES FROM (1) TO (1), ab_p2 FOR VALUES FROM (1) TO (2) -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
David Rowleywrites: > It seems like a good idea for the planner not to generate Append or > MergeAppend paths when the node contains just a single subpath. > ... > Perhaps there is also a "Method 3" which I've not thought about which > would be much cleaner. Have you considered turning AppendPath with a single child into more of a special case? I think this is morally equivalent to your ProxyPath proposal, but it would take less new boilerplate code to support. If you taught create_append_path to inherit the child's pathkeys when there's only one child, and taught createplan.c to skip making the useless Append node, I think you might be nearly done. This might seem a bit ugly, but considering that zero-child AppendPaths are already a special case (and a much bigger one), I don't have much of a problem with decreeing that one-child is also a special case. As an example of why this might be better than a separate ProxyPath node type, consider this call in add_paths_to_append_rel: /* * If we found unparameterized paths for all children, build an unordered, * unparameterized Append path for the rel. (Note: this is correct even * if we have zero or one live subpath due to constraint exclusion.) */ if (subpaths_valid) add_path(rel, (Path *) create_append_path(rel, subpaths, NULL, 0, partitioned_rels)); That comment could stand a bit of adjustment with this approach, but the code itself requires no extra work. Now, you would have to do something about this in create_append_plan: /* * XXX ideally, if there's just one child, we'd not bother to generate an * Append node but just return the single child. At the moment this does * not work because the varno of the child scan plan won't match the * parent-rel Vars it'll be asked to emit. */ but that's going to come out in the wash someplace, no matter what you do. Leaving it for the last minute is likely to reduce the number of places you have to teach about this. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
On Thu, Oct 26, 2017 at 5:07 PM, David Rowleywrote: > On 26 October 2017 at 23:42, Antonin Houska wrote: >> David Rowley wrote: >> >>> Method 1: >>> >>> In set_append_rel_size() detect when just a single subpath would be >>> added to the Append path. >> >> I spent some time reviewing the partition-wise join patch during the last CF >> and have an impression that set_append_rel_size() is called too early to find >> out how many child paths will eventually exist if Append represents a join of >> a partitioned table. I think the partition matching takes place at later >> stage >> and that determines the actual number of partitions, and therefore the number >> of Append subpaths. > > I understand that we must wait until set_append_rel_size() is being > called on the RELOPT_BASEREL since partitions may themselves be > partitioned tables and we'll only know what's left after we've > recursively checked all partitions of the baserel. > > I've not studied the partition-wise join code yet, but if we've > eliminated all but one partition in set_append_rel_size() then I > imagine we'd just need to ensure partition-wise join is not attempted > since we'd be joining directly to the only partition anyway. > I think Antonin has a point. The join processing may deem some base relations dummy (See populate_joinrel_with_paths()). So an Append relation which had multiple child alive at the end of set_append_rel_size() might ultimately have only one child after partition-wise join has worked on it. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
On 26 October 2017 at 23:42, Antonin Houskawrote: > David Rowley wrote: > >> Method 1: >> >> In set_append_rel_size() detect when just a single subpath would be >> added to the Append path. > > I spent some time reviewing the partition-wise join patch during the last CF > and have an impression that set_append_rel_size() is called too early to find > out how many child paths will eventually exist if Append represents a join of > a partitioned table. I think the partition matching takes place at later stage > and that determines the actual number of partitions, and therefore the number > of Append subpaths. I understand that we must wait until set_append_rel_size() is being called on the RELOPT_BASEREL since partitions may themselves be partitioned tables and we'll only know what's left after we've recursively checked all partitions of the baserel. I've not studied the partition-wise join code yet, but if we've eliminated all but one partition in set_append_rel_size() then I imagine we'd just need to ensure partition-wise join is not attempted since we'd be joining directly to the only partition anyway. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
On 26 October 2017 at 23:30, Robert Haaswrote: > On Wed, Oct 25, 2017 at 11:59 PM, David Rowley > wrote: >> As of today, because we include this needless [Merge]Append node, we >> cannot parallelise scans below the Append. > > Without disputing your general notion that singe-child Append or > MergeAppend nodes are a pointless waste, how does a such a needless > node prevent parallelizing scans beneath it? hmm, I think I was wrong about that now. I had been looking over the regression test diffs after having made tenk1 a partitioned table with a single partition containing all the rows, but it looks like I read the diff backwards. The planner actually parallelized the Append version, not the non-Append version, like I had previously thought. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
David Rowleywrote: > Method 1: > > In set_append_rel_size() detect when just a single subpath would be > added to the Append path. I spent some time reviewing the partition-wise join patch during the last CF and have an impression that set_append_rel_size() is called too early to find out how many child paths will eventually exist if Append represents a join of a partitioned table. I think the partition matching takes place at later stage and that determines the actual number of partitions, and therefore the number of Append subpaths. -- Antonin Houska Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de, http://www.cybertec.at -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
On Wed, Oct 25, 2017 at 11:59 PM, David Rowleywrote: > As of today, because we include this needless [Merge]Append node, we > cannot parallelise scans below the Append. Without disputing your general notion that singe-child Append or MergeAppend nodes are a pointless waste, how does a such a needless node prevent parallelizing scans beneath it? > Invent a ProxyPath concept that allows us to add Paths belonging to > one relation to another relation. The ProxyPaths can have > translation_vars to translate targetlists to reference the correct > Vars. These ProxyPaths could exist up as far as createplan, where we > could perform the translation and build the ProxyPath's subnode > instead. This idea sounds like it might have some legs, although I'm not sure "proxy" is really the right terminology. Like both you and Ashutosh, I suspect that there might be some other applications for this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
On Thu, Oct 26, 2017 at 3:29 AM, David Rowleywrote: > However, method 2 appears to also require some Var > translation in Path targetlists which contain this Proxy path, either > that or some global Var replacement would need to be done during > setrefs. > For inheritance and partitioning, we translate parent expressions to child expression by applying Var translations. The translated expressions differ only in the Var nodes (at least for partitioning this true, and I believe it to be true for inheritance). I have been toying with an idea of having some kind of a (polymorphic?) Var node which can represent parent and children. That will greatly reduce the need to translate the expression trees and will also help in this implementation. I am wondering, if ProxyPath could be helpful in avoiding the cost of reparameterize_path_by_child() introduced for partition-wise join. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers