Re: force partition pruning

Vijaykumar Jain Mon, 10 May 2021 05:47:30 -0700

I do not know how to put this in words,
but see below when the predicate is explicitly applied to the main table
with partition.


postgres=# \d+ prt1
                                   Partitioned table "public.prt1"
 Column |       Type        | Collation | Nullable | Default | Storage  |
Stats target | Description
--------+-------------------+-----------+----------+---------+----------+--------------+-------------
 a      | integer           |           | not null |         | plain    |
            |
 b      | integer           |           |          |         | plain    |
            |
 c      | character varying |           |          |         | extended |
            |
Partition key: RANGE (a)
Partitions: prt1_p1 FOR VALUES FROM (0) TO (250),
            prt1_p2 FOR VALUES FROM (250) TO (500),
            prt1_p3 FOR VALUES FROM (500) TO (600)

(failed reverse-i-search)`': ^C
postgres=# \d+ b
                                     Table "public.b"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target
| Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 id     | integer |           | not null |         | plain   |
|
Indexes:
    "b_id_idx" btree (id)
Access method: heap

postgres=# table b;
 id
-----
 200
 400
(2 rows)

-- basically if the table is joined and predicate can be applied to the
outer table which has constraints matching,
partition pruning takes place.

I do not know the theory, or even what i did is correct, but just FYI.

postgres=# explain analyze  select prt1.* from prt1 where a in ( select id
from b where id in (1, 100, 200) );
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Hash Semi Join  (cost=1.05..9.36 rows=2 width=13) (actual
time=0.034..0.074 rows=1 loops=1)
   Hash Cond: (prt1.a = b.id)
   ->  Append  (cost=0.00..7.50 rows=300 width=13) (actual
time=0.006..0.043 rows=300 loops=1)
         ->  Seq Scan on prt1_p1 prt1_1  (cost=0.00..2.25 rows=125
width=13) (actual time=0.005..0.013 rows=125 loops=1)
         ->  Seq Scan on prt1_p2 prt1_2  (cost=0.00..2.25 rows=125
width=13) (actual time=0.003..0.009 rows=125 loops=1)
         ->  Seq Scan on prt1_p3 prt1_3  (cost=0.00..1.50 rows=50 width=13)
(actual time=0.002..0.004 rows=50 loops=1)
   ->  Hash  (cost=1.03..1.03 rows=2 width=4) (actual time=0.005..0.005
rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on b  (cost=0.00..1.03 rows=2 width=4) (actual
time=0.003..0.003 rows=1 loops=1)
               Filter: (id = ANY ('{1,100,200}'::integer[]))
               Rows Removed by Filter: 1
 Planning Time: 0.181 ms
 Execution Time: 0.089 ms
(13 rows)

postgres=# explain analyze  select prt1.* from prt1 where a in ( select id
from b where b.id = prt1.a) and a in (1, 100, 200);
                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Index Scan using iprt1_p1_a on prt1_p1 prt1  (cost=0.14..14.03 rows=2
width=13) (actual time=0.024..0.025 rows=1 loops=1)
   Index Cond: (a = ANY ('{1,100,200}'::integer[]))
   Filter: (SubPlan 1)
   Rows Removed by Filter: 1
   SubPlan 1
     ->  Seq Scan on b  (cost=0.00..1.02 rows=1 width=4) (actual
time=0.003..0.003 rows=0 loops=2)
           Filter: (id = prt1.a)
           Rows Removed by Filter: 1
 Planning Time: 0.120 ms
 Execution Time: 0.041 ms
(10 rows)

postgres=# explain analyze  select prt1.* from prt1 where exists ( select 1
from b where b.id = prt1.a) and a in (1, 100, 200);
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Hash Semi Join  (cost=1.04..3.79 rows=2 width=13) (actual
time=0.024..0.028 rows=1 loops=1)
   Hash Cond: (prt1.a = b.id)
   ->  Seq Scan on prt1_p1 prt1  (cost=0.00..2.72 rows=3 width=13) (actual
time=0.011..0.017 rows=2 loops=1)
         Filter: (a = ANY ('{1,100,200}'::integer[]))
         Rows Removed by Filter: 123
   ->  Hash  (cost=1.02..1.02 rows=2 width=4) (actual time=0.004..0.004
rows=2 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on b  (cost=0.00..1.02 rows=2 width=4) (actual
time=0.001..0.002 rows=2 loops=1)
 Planning Time: 0.192 ms
 Execution Time: 0.043 ms
(10 rows)

postgres=# explain analyze  select prt1.* from prt1 inner join b on prt1.a
= b.id  where a in (1, 100, 200);
                                                 QUERY PLAN
-------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1.04..3.79 rows=2 width=13) (actual time=0.024..0.028
rows=1 loops=1)
   Hash Cond: (prt1.a = b.id)
   ->  Seq Scan on prt1_p1 prt1  (cost=0.00..2.72 rows=3 width=13) (actual
time=0.012..0.018 rows=2 loops=1)
         Filter: (a = ANY ('{1,100,200}'::integer[]))
         Rows Removed by Filter: 123
   ->  Hash  (cost=1.02..1.02 rows=2 width=4) (actual time=0.003..0.004
rows=2 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         ->  Seq Scan on b  (cost=0.00..1.02 rows=2 width=4) (actual
time=0.001..0.002 rows=2 loops=1)
 Planning Time: 0.181 ms
 Execution Time: 0.043 ms
(10 rows)



On Mon, 10 May 2021 at 17:09, Niels Jespersen <n...@dst.dk> wrote:

> Hi all
>
>
>
> I need a litte advice on how to
>
>
>
> Postgres 13.2
>
>
>
> A metadata query pulls partition keys:
>
>
>
> select m.period_version from register.register_metadata m where
> m.current_version and m.period between '201712' and '201912' ;
>
>
>
> A query using these in an in-list easily makes the planner do partition
> pruning.
>
>
>
> select * from register.register d where d.period_version in ('201712_1',
> '201812_1', '201912_1');
>
>
>
> However combining the metadataquery into the dataquery makes the planner
> decide to scan all partitions.
>
>
>
> select * from register.register d where d.period_version in (select
> m.period_version from register.register_metadata m where m.current_version
> and m.period between '201712' and '201912');
>
>
>
> I am quite aware that the latter query requires partition pruning to take
> place during execution not during planning.
>
>
>
> My question here is how do I package the two-step proces into an interface
> that analysts can actually use?
>
>
>
> One possibility is to have a prepare step that creates a temporary view
> with the hard-coded values built-in. And then query data via the temp view.
> This works ok, but there is an issue with possible naming conflicts on the
> temp view (not that this could not be worked around).
>
>
>
> Ideally I would like a function to figure out the query and then return
> the data from that dynamically executed query. Complicating matters is the
> fact that there are more than one set of data/metatable tables and each
> datatable has a different set of columns. This excludes a table returning
> function since that must list the columns present.
>
>
>
>
>
> Best regards
>
>
>
>
>
> *Niels Jespersen*
>
> Chief Adviser
>
> IT Center
>
> Mobile phone:+45 42 42 93 73
> Email: n...@dst.dk
>
> Statistics Denmark, Sejrøgade 11, DK-2100 Copenhagen
>
> www.dst.dk/en | Twitter <https://twitter.com/dstdk> | LinkedIn
> <https://www.linkedin.com/company/statistics-denmark/> | Facebook
> <https://www.facebook.com/danmarksstatistik>
>
>
>
>
>
>
>
>
>
>
>
>
>


-- 
Thanks,
Vijay
Mumbai, India

Re: force partition pruning

Reply via email to