Hi!
On 27.11.2024 16:17, Ravi wrote:
Please find the patch attached below for your review.
Thanks & Regards,
Ravi Revathy
Member Technical Staff
ZOHO Corporation
---- On Wed, 27 Nov 2024 18:41:13 +0530 Ravi <revath...@zohocorp.com> wrote ---
Hi Developers,
Currently, PostgreSQL relies on table statistics, extracted within the
examine_simple_variable function, to estimate join selectivity. However,
for subqueries that contain a GROUP BY clause, even one with a single
grouping column whose output rows are therefore distinct on that column,
the planner falls back to the default assumption of 200 distinct values.
This leads to inaccurate cardinality estimates and potentially suboptimal
join plans.
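For context, the relevant logic lives in src/backend/utils/adt/selfuncs.c:
examine_simple_variable() deliberately gives up on subqueries that
aggregate, so the join column coming out of the subquery carries no
statistics and get_variable_numdistinct() later falls back to
DEFAULT_NUM_DISTINCT (200). Roughly, paraphrased from memory rather than
quoted exactly:

    /*
     * In examine_simple_variable(): statistics from inside the subquery
     * are deliberately not used when it uses set operations or GROUP BY,
     * because grouping can change a column's distribution arbitrarily.
     * (Paraphrased, not an exact quote of the source.)
     */
    if (subquery->setOperations || subquery->groupClause)
        return;             /* leaves vardata without statistics */

    /*
     * With no statistics for the subquery output column, and a relation
     * too large to assume ndistinct = ntuples, get_variable_numdistinct()
     * ends up returning DEFAULT_NUM_DISTINCT, which is 200.
     */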
*Problem Example*
Consider the following query:
explain select * from t1 left join (select a, max(b) from t2 group by a) t2 on t1.a = t2.a;
The resulting plan predicts a high cardinality for the join, and
places the larger dataset on the hash side:
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Hash Join  (cost=943037.92..955323.45 rows=6963818 width=16)
   Hash Cond: (t1.a = t2.a)
   ->  Seq Scan on t1  (cost=0.00..289.00 rows=20000 width=8)
   ->  Hash  (cost=893538.50..893538.50 rows=3017074 width=8)
         ->  HashAggregate  (cost=777429.49..893538.50 rows=3017074 width=8)
               Group Key: t2.a
               Planned Partitions: 64
               ->  Seq Scan on t2  (cost=0.00..158673.98 rows=11000098 width=8)
(8 rows)
Here, the join cardinality is overestimated, and t2, the larger input, is
placed on the hash side even though t1 has far fewer rows.
*Proposed Solution:*
In subqueries with a GROUP BY clause that has a single grouping column,
it is reasonable to assume the result set contains unique values for that
column. Under this assumption, we can treat the output of the aggregate
node as unique on that column and, instead of falling back to the default
distinct-value count (200), derive the estimate from the HashAggregate
node's row count.
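One way this could be expressed, as a minimal and hypothetical sketch
(not necessarily what the attached patch does): inside
examine_simple_variable(), after relaxing the existing early return for
grouped subqueries, check whether the subquery's groupClause consists of
exactly the one column being examined and, if so, mark the variable as
unique so that get_variable_numdistinct() derives ndistinct from the
subquery's row estimate instead of DEFAULT_NUM_DISTINCT:

    /*
     * Hypothetical sketch only: assumes we are inside
     * examine_simple_variable() and have already located the subquery's
     * TargetEntry "ste" for this output column.
     */
    if (subquery->groupClause != NIL &&
        subquery->groupingSets == NIL &&
        list_length(subquery->groupClause) == 1)
    {
        SortGroupClause *sgc = (SortGroupClause *) linitial(subquery->groupClause);

        /* Is the single grouping column exactly this output column? */
        if (ste->ressortgroupref == sgc->tleSortGroupRef)
        {
            /*
             * Every output row of the aggregate carries a distinct value
             * of this column, so treat it as unique;
             * get_variable_numdistinct() will then base ndistinct on the
             * subquery's estimated row count (the HashAggregate's rows)
             * rather than the default of 200.
             */
            vardata->isunique = true;
        }
    }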
Execution plan after the patch is applied:
                                QUERY PLAN
--------------------------------------------------------------------------
 Hash Join  (cost=777968.49..935762.27 rows=20000 width=16)
   Hash Cond: (t2.a = t1.a)
   ->  HashAggregate  (cost=777429.49..893538.50 rows=3017074 width=8)
         Group Key: t2.a
         Planned Partitions: 64
         ->  Seq Scan on t2  (cost=0.00..158673.98 rows=11000098 width=8)
   ->  Hash  (cost=289.00..289.00 rows=20000 width=8)
         ->  Seq Scan on t1  (cost=0.00..289.00 rows=20000 width=8)
(8 rows)
Can you confirm whether my assumption is valid, namely that the
distinct-row property of a GROUP BY clause with a single grouping column
can be leveraged to improve join cardinality estimation? If not, I would
appreciate suggestions or corrections regarding this approach.
Maybe I have gotten something wrong, but I didn't see a problem with
cardinality when I tried to reproduce the problem like this:
alena@postgres=# drop table t1;
DROP TABLE
alena@postgres=# drop table t2;
DROP TABLE
alena@postgres=# create table t1 (a int);
CREATE TABLE
alena@postgres=# create table t1 (x int);
ERROR:  relation "t1" already exists
alena@postgres=# create table t2 (a int, b int);
CREATE TABLE
alena@postgres=# insert into t1 select id from generate_series(1,1000) as id;
INSERT 0 1000
alena@postgres=# insert into t2 select id, id%10 from generate_series(991,1900) as id;
INSERT 0 910
alena@postgres=# analyze;
ANALYZE
alena@postgres=# explain analyze select * from t1 left join (select a, max(b) from t2 group by a) t2 on t1.a = t2.a;
                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=39.12..56.76 rows=1000 width=12) (actual time=2.024..2.731 rows=1000 loops=1)
   Hash Cond: (t1.a = t2.a)
   ->  Seq Scan on t1  (cost=0.00..15.00 rows=1000 width=4) (actual time=0.030..0.259 rows=1000 loops=1)
   ->  Hash  (cost=27.75..27.75 rows=910 width=8) (actual time=1.986..1.987 rows=910 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 44kB
         ->  HashAggregate  (cost=18.65..27.75 rows=910 width=8) (actual time=1.162..1.586 rows=910 loops=1)
               Group Key: t2.a
               Batches: 1  Memory Usage: 169kB
               ->  Seq Scan on t2  (cost=0.00..14.10 rows=910 width=8) (actual time=0.018..0.239 rows=910 loops=1)
 Planning Time: 0.215 ms
 Execution Time: 2.926 ms
(11 rows)
As far as I can see, the cardinality is predicted correctly here. Am I missing something?
--
Regards,
Alena Rybakina
Postgres Professional