Hi Amit: Thanks for your review! On Thu, Mar 4, 2021 at 5:07 PM Amit Langote <amitlangot...@gmail.com> wrote:
> Hi Andy, > > On Sun, Jan 24, 2021 at 7:34 PM Andy Fan <zhihui.fan1...@gmail.com> wrote: > > I recently found a use case like this. SELECT * FROM p, q WHERE > p.partkey = > > q.colx AND (q.colx = $1 OR q.colx = $2); Then we can't do either > planning time > > partition prune or init partition prune. Even though we have run-time > > partition pruning work at last, it is too late in some cases since we > have > > to init all the plan nodes in advance. In my case, there are 10+ > > partitioned relation in one query and the execution time is short, so > the > > init plan a lot of plan nodes cares a lot. > > > > The attached patches fix this issue. It just get the "p.partkey = q.colx" > > case in root->eq_classes or rel->joinlist (outer join), and then check > if there > > is some baserestrictinfo in another relation which can be used for > partition > > pruning. To make the things easier, both partkey and colx must be Var > > expression in implementation. > > > > - v1-0001-Make-some-static-functions-as-extern-and-extend-C.patch > > > > Just some existing refactoring and extending ChangeVarNodes to be able > > to change var->attno. > > > > - v1-0002-Build-some-implied-pruning-quals-to-extend-the-us.patch > > IIUC, your proposal is to transpose the "q.b in (1, 2)" in the > following query as "p.a in (1, 2)" and pass it down as a pruning qual > for p: > > select * from p, q where p.a = q.b and q.b in (1, 2); > > or "(q.b = 1 or q.b = 2)" in the following query as "(p.a = 1 or p.a = 2)": > > select * from p, q where p.a = q.b and (q.b = 1 or q.b = 2); > > Yes, you understand me correctly. > While that transposition sounds *roughly* valid, I have some questions > about the approach: > > * If the transposed quals are assumed valid to use for partition > pruning, could they also not be used by, say, the surviving > partitions' index scan paths? So, perhaps, it doesn't seem right that > partprune.c builds the clauses on-the-fly for pruning and dump them > once done. > > * On that last part, I wonder if partprune.c isn't the wrong place to > determine that "q.b in (1, 2)" and "p.a in (1, 2)" are in fact > equivalent. That sort of thing is normally done in the phase of > planning when distribute_qual_to_rels() runs and any equivalences > found stored in PlannerInfo.eq_classes. Have you investigated why the > process_ machinery doesn't support working with ScalarArrayOpExpr and > BoolExpr to begin with? > > * Or maybe have you considered generalizing what > build_implied_pruning_quals() does so that other places like > indxpath.c can use the facility? > > Actually at the beginning of this work, I do think I should put the implied quals to baserestictinfo in the distribute_qual_for_rels stage. That probably can fix all the issues you reported. However that probably more complex than what I did with more risks and I have a very limited timeline to handle the real custom issue, so I choose this strategy. But it is the time to re-think the baserestrictinfo way now. I will spend some time in this direction, Thank you for this kind of push-up:) I just checked this stuff on Oracle, Oracle does use this strategy. SQL> explain plan for select * from t1, t2 where t1.a = t2.a and t1.a > 2; Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------- Plan hash value: 1838229974 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 52 | 4 (0)| 00:00:01 | |* 1 | HASH JOIN | | 1 | 52 | 4 (0)| 00:00:01 | |* 2 | TABLE ACCESS FULL| T1 | 1 | 26 | 2 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL| T2 | 1 | 26 | 2 (0)| 00:00:01 | --------------------------------------------------------------------------- PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("T1"."A"="T2"."A") * 2 - filter("T1"."A">2) 3 - filter("T2"."A">2)* 17 rows selected. postgres=# explain (costs off) select * from t1, t2 where t1.a = t2.a and t1.a > 2; QUERY PLAN ------------------------------- Merge Join Merge Cond: (t1.a = t2.a) -> Sort Sort Key: t1.a -> Seq Scan on t1 Filter: (a > 2) -> Sort Sort Key: t2.a -> Seq Scan on t2 (9 rows) -- Best Regards Andy Fan (https://www.aliyun.com/)