Re: Extend more usecase for planning time partition pruning and init partition pruning.

Andy Fan Fri, 05 Mar 2021 15:02:52 -0800

Hi Amit:
  Thanks for your review!

On Thu, Mar 4, 2021 at 5:07 PM Amit Langote <amitlangot...@gmail.com> wrote:


> Hi Andy,
>
> On Sun, Jan 24, 2021 at 7:34 PM Andy Fan <zhihui.fan1...@gmail.com> wrote:
> >  I recently found a use case like this.  SELECT * FROM p, q WHERE
> p.partkey =
> >  q.colx AND (q.colx = $1 OR q.colx = $2); Then we can't do either
> planning time
> >  partition prune or init partition prune.  Even though we have run-time
> >  partition pruning work at last, it is too late in some cases since we
> have
> >  to init all the plan nodes in advance.  In my case, there are 10+
> >  partitioned relation in one query and the execution time is short, so
> the
> >  init plan a lot of plan nodes cares a lot.
> >
> > The attached patches fix this issue. It just get the "p.partkey = q.colx"
> > case in root->eq_classes or rel->joinlist (outer join), and then check
> if there
> > is some baserestrictinfo in another relation which can be used for
> partition
> > pruning. To make the things easier, both partkey and colx must be Var
> > expression in implementation.
> >
> > - v1-0001-Make-some-static-functions-as-extern-and-extend-C.patch
> >
> > Just some existing refactoring and extending ChangeVarNodes to be able
> > to change var->attno.
> >
> > - v1-0002-Build-some-implied-pruning-quals-to-extend-the-us.patch
>
> IIUC, your proposal is to transpose the "q.b in (1, 2)" in the
> following query as "p.a in (1, 2)" and pass it down as a pruning qual
> for p:
>
> select * from p, q where p.a = q.b and q.b in (1, 2);
>
> or "(q.b = 1 or q.b = 2)" in the following query as "(p.a = 1 or p.a = 2)":
>
> select * from p, q where p.a = q.b and (q.b = 1 or q.b = 2);
>
>
Yes,  you understand me correctly.


> While that transposition sounds *roughly* valid, I have some questions
> about the approach:
>
> * If the transposed quals are assumed valid to use for partition
> pruning, could they also not be used by, say, the surviving
> partitions' index scan paths?  So, perhaps, it doesn't seem right that
> partprune.c builds the clauses on-the-fly for pruning and dump them
> once done.
>
> * On that last part, I wonder if partprune.c isn't the wrong place to
> determine that "q.b in (1, 2)" and "p.a in (1, 2)" are in fact
> equivalent.  That sort of thing is normally done in the phase of
> planning when distribute_qual_to_rels() runs and any equivalences
> found stored in PlannerInfo.eq_classes.  Have you investigated why the
> process_ machinery doesn't support working with ScalarArrayOpExpr and
> BoolExpr to begin with?
>
> * Or maybe have you considered generalizing what
> build_implied_pruning_quals() does so that other places like
> indxpath.c can use the facility?
>
>
Actually at the beginning of this work, I do think I should put the implied
quals to baserestictinfo in the distribute_qual_for_rels stage.  That
probably
can fix all the issues you reported.  However that probably more complex
than what I did with more risks and I have a very limited timeline to handle
the real custom issue,  so I choose this strategy.   But it is the time to
re-think
the baserestrictinfo way now.  I will spend some time in this direction,
Thank you
 for this kind  of push-up:)  I just checked this stuff on Oracle,  Oracle
does use
this strategy.

SQL> explain plan for select * from t1, t2 where t1.a = t2.a and t1.a > 2;

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1838229974

---------------------------------------------------------------------------
| Id  | Operation   | Name | Rows  | Bytes | Cost (%CPU)| Time  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |  | 1 |    52 | 4   (0)| 00:00:01 |
|*  1 |  HASH JOIN   |  | 1 |    52 | 4   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| T1   | 1 |    26 | 2   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T2   | 1 |    26 | 2   (0)| 00:00:01 |
---------------------------------------------------------------------------


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."A"="T2"."A")

*   2 - filter("T1"."A">2)   3 - filter("T2"."A">2)*

17 rows selected.


postgres=# explain (costs off) select * from t1, t2 where t1.a = t2.a and
t1.a > 2;
          QUERY PLAN
-------------------------------
 Merge Join
   Merge Cond: (t1.a = t2.a)
   ->  Sort
         Sort Key: t1.a
         ->  Seq Scan on t1
               Filter: (a > 2)
   ->  Sort
         Sort Key: t2.a
         ->  Seq Scan on t2
(9 rows)


-- 
Best Regards
Andy Fan (https://www.aliyun.com/)

Re: Extend more usecase for planning time partition pruning and init partition pruning.

Reply via email to