Hi hackers,
Thank you for your work.
Let me start my review from the top, specifically with the function
clauselist_selectivity_ext() in clausesel.c:
1. About the clauses == NULL check. In my opinion, this check should be
kept. This was already discussed previously [0], and I think it's safer
to keep it.
2. I noticed that the patch applies extended statistics to OR clauses as
well. Here is an example from the regression tests illustrating this:
Before applying ext stats:
SELECT * FROM check_estimated_rows('select * from join_test_1 j1 join
join_test_2 j2 on ((j1.a + 1 = j2.a + 1) or (j1.b = j2.b))');
 estimated | actual
-----------+--------
    104500 | 100000
After applying ext stats:
SELECT * FROM check_estimated_rows('select * from join_test_1 j1 join
join_test_2 j2 on ((j1.a + 1 = j2.a + 1) or (j1.b = j2.b))');
 estimated | actual
-----------+--------
    190000 | 100000
(1 row)
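In this example the estimate moves further from the actual row count
once extended statistics are applied (190000 vs. 104500, against 100000
actual rows). So I agree that, at least for now, we should focus solely
on AND clauses. To do that, we should impose the same restriction in
clauselist_selectivity_or() as we already do in
clauselist_selectivity_ext().
What do you think? Or should we consider OR clauses as well?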
[0]:
https://www.postgresql.org/message-id/flat/016e33b7-2830-4300-bc89-e7ce9e613bad%40tantorlabs.com
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.