Hi,
Thanks for taking a look.

> On 30 Sep 2024, at 14:14, Ashutosh Bapat <ashutosh.bapat....@gmail.com> wrote:
> 
> Hi Michal,
> It is difficult to understand the exact problem from your description.
> Can you please provide EXPLAIN outputs showing the expected plan and
> the unexpected plan; plans on the node where the query is run and
> where the partitions are located.

The table structure is as follows:

CREATE TABLE tbl (…) PARTITION BY RANGE year(col02_date)

CREATE TABLE tbl_2015 PARTITION OF tbl FOR VALUES BETWEEN (2023) AND (2024) 
PARTITION BY HASH (col01_no)
… subsequent years


CREATE TABLE tbl_2021_32_9 PARTITION OF tbl_2021 FOR VALUES WITH (MODULUS 32 
REMAINDER 9)
…

CREATE FOREIGN TABLE tbl_2022_16_9 PARTITION OF tbl_2022 FOR VALUES WITH 
(MODULUS 32 REMAINDER 9)

All tables are freshly ANALYZEd.

I have a function:

CREATE FUNCTION report(col01 text, year_from, month_from, year_to, month_to) 
RETURNS … LANGUAGE sql
$$
SELECT
  col01_no, year(col02_date), month(col02_date), sum(col03) FROM tbl WHERE 
col02_no = col02 AND (col02_date conditions) GROUP BY 1, 2, 3
$$

EXPLAIN (…) SELECT * FROM report(…);

gives

Plan 1 (performs pushdown of aggregates):
 
Append  (cost=9.33..76322501.41 rows=3 width=58) (actual time=5.051..6.414 
rows=21 loops=1)
  InitPlan 1 (returns $0)
    ->  Result  (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 
rows=1 loops=1)
          Output: '2021-01-01'::date
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 
rows=1 loops=1)
          Output: '2023-12-31'::date
  ->  GroupAggregate  (cost=9.31..9.85 rows=1 width=58) (actual 
time=0.073..0.074 rows=0 loops=1)
        Output: op.col01_no, (year(op.col02_date)), (month(op.col02_date)), 
sum(op.col03) FILTER (WHERE (op.debit_flag = '0'::bpchar)), sum(op.col03) 
FILTER (WHERE (op.debit_flag <> '0'::bpchar)), count(1)
        Group Key: (year(op.col02_date)), (month(op.col02_date))
        ->  Sort  (cost=9.31..9.32 rows=1 width=44) (actual time=0.072..0.073 
rows=0 loops=1)
              Output: (year(op.col02_date)), (month(op.col02_date)), 
op.col01_no, op.col03, op.debit_flag
              Sort Key: (year(op.col02_date)), (month(op.col02_date))
              Sort Method: quicksort  Memory: 25kB
              ->  Index Only Scan using 
tbl_2021_32_9_universal_gist_idx_3a2df25af5bc48a on cbt.tbl_2021_32_9 op  
(cost=0.28..9.30 rows=1 width=44) (actual time=0.063..0.063 rows=0 loops=1)
                    Output: year(op.col02_date), month(op.col02_date), 
op.col01_no, op.col03, op.debit_flag
                    Index Cond: ((op.col01_no = 
'22109020660000000110831697'::text) AND (op.col02_date >= $0) AND 
(op.col02_date <= $1))
                    Filter: ((year(op.col02_date) >= 2021) AND 
(year(op.col02_date) <= 2023))
                    Heap Fetches: 0
  ->  Async Foreign Scan  (cost=100.02..76322480.36 rows=1 width=58) (actual 
time=0.753..0.755 rows=11 loops=1)
        Output: op_1.col01_no, (year(op_1.col02_date)), 
(month(op_1.col02_date)), (sum(op_1.col03) FILTER (WHERE (op_1.debit_flag = 
'0'::bpchar))), (sum(op_1.col03) FILTER (WHERE (op_1.debit_flag <> 
'0'::bpchar))), ((count(1))::double precision)
        Relations: Aggregate on 
(cbt_c61d467d1b5fd1d218d9e6e7dd44a333.tbl_2022_16_9 op_1)
        Remote SQL: SELECT col01_no, cbt.year(col02_date), 
cbt.month(col02_date), sum(col03) FILTER (WHERE (debit_flag = '0')), sum(col03) 
FILTER (WHERE (debit_flag <> '0')), count(1) FROM cbt.tbl_2022_16_9 WHERE 
((cbt.year(col02_date) >= 2021)) AND ((cbt.year(col02_date) <= 2023)) AND 
((col02_date >= $1::date)) AND ((col02_date <= $2::date)) AND ((col01_no = 
'22109020660000000110831697')) GROUP BY 1, 2, 3
  ->  GroupAggregate  (cost=10.63..11.17 rows=1 width=58) (actual 
time=4.266..4.423 rows=10 loops=1)
        Output: op_2.col01_no, (year(op_2.col02_date)), 
(month(op_2.col02_date)), sum(op_2.col03) FILTER (WHERE (op_2.debit_flag = 
'0'::bpchar)), sum(op_2.col03) FILTER (WHERE (op_2.debit_flag <> '0'::bpchar)), 
count(1)
        Group Key: (year(op_2.col02_date)), (month(op_2.col02_date))
        ->  Sort  (cost=10.63..10.64 rows=1 width=44) (actual time=4.238..4.273 
rows=735 loops=1)
              Output: (year(op_2.col02_date)), (month(op_2.col02_date)), 
op_2.col01_no, op_2.col03, op_2.debit_flag
              Sort Key: (year(op_2.col02_date)), (month(op_2.col02_date))
              Sort Method: quicksort  Memory: 82kB
              ->  Index Only Scan using 
tbl_2023_128_9_universal_gist_idx_3a2df25af5bc48a on cbt.tbl_2023_128_9 op_2  
(cost=0.54..10.62 rows=1 width=44) (actual time=0.295..4.059 rows=735 loops=1)
                    Output: year(op_2.col02_date), month(op_2.col02_date), 
op_2.col01_no, op_2.col03, op_2.debit_flag
                    Index Cond: ((op_2.col01_no = 
'22109020660000000110831697'::text) AND (op_2.col02_date >= $0) AND 
(op_2.col02_date <= $1))
                    Filter: ((year(op_2.col02_date) >= 2021) AND 
(year(op_2.col02_date) <= 2023))
                    Heap Fetches: 0
 
 

BUT after I perform
CREATE OR REPLACE report(…)
STRICT TO … AS … (same code)

The plan (as reported by auto_explain) changes to:

Plan 2 (no pushdown):

GroupAggregate  (cost=1781983216.68..1781983324.62 rows=200 width=58) (actual 
time=16.065..16.432 rows=21 loops=1)
  Output: op.col01_no, (year(op.col02_date)), (month(op.col02_date)), 
sum(op.col03) FILTER (WHERE (op.debit_flag = '0'::bpchar)), sum(op.col03) 
FILTER (WHERE (op.debit_flag <> '0'::bpchar)), count(1)
  Group Key: (year(op.col02_date)), (month(op.col02_date))
  InitPlan 1 (returns $0)
    ->  Result  (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.002 
rows=1 loops=1)
          Output: make_date($2, $3, 1)
  InitPlan 2 (returns $1)
    ->  Result  (cost=0.00..0.02 rows=1 width=4) (actual time=0.002..0.003 
rows=1 loops=1)
          Output: (((make_date($4, $5, 1) + '1 mon'::interval) - '1 
day'::interval))::date
  ->  Sort  (cost=1781983216.65..1781983217.33 rows=272 width=44) (actual 
time=16.041..16.117 rows=1564 loops=1)
        Output: (year(op.col02_date)), (month(op.col02_date)), op.col01_no, 
op.col03, op.debit_flag
        Sort Key: (year(op.col02_date)), (month(op.col02_date))
        Sort Method: quicksort  Memory: 171kB
        ->  Append  (cost=0.55..1781983205.65 rows=272 width=44) (actual 
time=1.013..15.445 rows=1564 loops=1)
              Subplans Removed: 269
              ->  Index Only Scan using 
accoper_2021_32_9_universal_gist_idx_3a2df25af5bc48a on cbt.accoper_2021_32_9 
op_1  (cost=0.28..9.30 rows=1 width=44) (actual time=0.084..0.084 rows=0 
loops=1)
                    Output: year(op_1.col02_date), month(op_1.col02_date), 
op_1.col01_no, op_1.col03, op_1.debit_flag
                    Index Cond: ((op_1.col01_no = $1) AND (op_1.col02_date >= 
$0) AND (op_1.col02_date <= $1))
                    Filter: ((year(op_1.col02_date) >= $2) AND 
(year(op_1.col02_date) <= $4))
                    Heap Fetches: 0
              ->  Async Foreign Scan on 
cbt_c61d467d1b5fd1d218d9e6e7dd44a333.accoper_2022_16_9 op_2  
(cost=100.00..76322479.83 rows=1 width=44) (actual time=0.658..3.870 rows=829 
loops=1)
                    Output: year(op_2.col02_date), month(op_2.col02_date), 
op_2.col01_no, op_2.col03, op_2.debit_flag
                    Remote SQL: SELECT col03, col02_date, debit_flag, col01_no 
FROM cbt.accoper_2022_16_9 WHERE ((col02_date >= $1::date)) AND ((col02_date <= 
$2::date)) AND ((col01_no = $3::text)) AND ((cbt.year(col02_date) >= 
$4::integer)) AND ((cbt.year(col02_date) <= $5::integer))
              ->  Index Only Scan using 
accoper_2023_128_9_universal_gist_idx_3a2df25af5bc48a on cbt.accoper_2023_128_9 
op_3  (cost=0.54..10.62 rows=1 width=44) (actual time=0.361..4.043 rows=735 
loops=1)
                    Output: year(op_3.col02_date), month(op_3.col02_date), 
op_3.col01_no, op_3.col03, op_3.debit_flag
                    Index Cond: ((op_3.col01_no = $1) AND (op_3.col02_date >= 
$0) AND (op_3.col02_date <= $1))
                    Filter: ((year(op_3.col02_date) >= $2) AND 
(year(op_3.col02_date) <= $4))
                    Heap Fetches: 0

I understand wrong rows estimates ( =1 ) are due to missing statistics on 
expressions year() and month().

But why plans are different?


BTW. I tried to add extended statistics on the above expressions but ANALYZE 
didn’t seem to update any values (as seen in pg_stat_ext) - ndistinct is NULL 
for example.

Thanks,

--
Michal


Reply via email to