I have been working on implementing the runtime partition pruning which would increase the performance of queries involving partitioned table to a great extent.
PFA the POC which can be applied over Amit's patch for faster partition pruning [1] and Dilip's refactor patch [2] on commit 2c74e6c1dcc5002fa8b822e5757f6c95d899fb7a. [1] https://www.postgresql.org/message-id/e02923ea-a117-a6ad-6a3e-ea5e1ba41ece%40lab.ntt.co.jp [2] https://www.postgresql.org/message-id/CAFiTN-tGnQzF_4QtbOHT-3hE%3DOvNaMfbbeRxa4UY0CQyF0G8gQ%40mail.gmail.com There were a couple of things that need improvement/opinion: In get_rel_partition_info, we store minop and maxop for each partition key. For the equality case, which is most common, both would store the same value. We could make it better by storing equal (bound, bound, ....) instead repeating the same values. get_partitions_for_keys currently returns the list of partitions valid for the given keys but for a table with many partitions this list would be very long so maybe for range qual ( key > a & key < b ) we could only store the min and max partition number and increment as_whichplan by 1 till we reach max partition number. For non-continuous partitions, we would still need the list. Currently, the partitions numbers are recalculated whenever the ChgParam is set, This can be optimised by skipping this step when only a non-partition key column has changed; reusing the existing partitions selected. Others: - better handling of multiple key - allow use of expression in the quals. - To use min_incl, max_incl properly in get_partitions_for_keys. - pruning during function calls. Currently with patch, during NestLoop: Nested Loop -> SeqScan tbl1 -> Append -> Index Scan p01 -> Index Scan p02 -> Index Scan p03 For each tuple from tbl1, only the relevant partition (p01or p02 or p03) will be scanned. --- Prepared Statement Behaviour with patch--- Table Descritpion: Table "public.tprt" Column | Type | Collation | Nullable | Default | Storage | Stats target | Description --------+---------+-----------+----------+---------+---------+--------------+------------- col1 | integer | | | | plain | | col2 | integer | | | | plain | | Partition key: RANGE (col1) Partitions: tprt_1 FOR VALUES FROM (1) TO (50001), tprt_2 FOR VALUES FROM (50001) TO (100001), tprt_3 FOR VALUES FROM (100001) TO (200001) EXPLAIN EXECUTE prstmt_select(15); QUERY PLAN ------------------------------------------------------------------ Append (cost=0.00..1736.55 rows=1 width=8) -> Seq Scan on tprt_1 (cost=0.00..849.15 rows=16724 width=8) Filter: (col1 < $1) (3 rows) EXPLAIN EXECUTE prstmt_select(60000); QUERY PLAN ------------------------------------------------------------------ Append (cost=0.00..1736.55 rows=2 width=8) -> Seq Scan on tprt_1 (cost=0.00..849.15 rows=16724 width=8) Filter: (col1 < $1) -> Seq Scan on tprt_2 (cost=0.00..849.15 rows=16724 width=8) Filter: (col1 < $1) (5 rows) -- Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
0001-POC-Implement-runtime-partiton-pruning.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers