Le vendredi 13 mai 2022, 07:07:47 CEST Andrey Lepikhov a écrit :
> New version of the feature.
> Here a minor bug with RowMarks is fixed. A degenerated case is fixed,
> when uniqueness of an inner deduced not from join quals, but from a
> baserestrictinfo clauses 'x=const', where x - unique field.
> Code, dedicated to solve second issue is controversial, so i attached
> delta.txt for quick observation.
> Maybe we should return to previous version of code, when we didn't split
> restriction list into join quals and base quals?

Hello,

I tried to find problematic cases, which would make the planning time grow 
unacceptably, and couldn't devise it.

The worst case scenario I could think of was as follows:

 - a query with many different self joins
 - an abundance of unique indexes on combinations of this table columns to 
consider
 - additional predicates on the where clause on columns.

The base table I used for this was a table with 40 integers. 39 unique indexes 
were defined on every combination of (c1, cX) with cX being columns c2 to c40.

I turned geqo off, set from_collapse_limit and join_collapse_limit to 
unreasonably high values (30), and tried to run queries of the form:

SELECT * FROM test_table t1
JOIN test_table tX ON t1.c1 = tX.c1 AND t1.c[X+2] = tX.cX
...
JOIN test_table tX ON t1.c1 = tX.c1 AND t1.c[X+2] = tX.cX.

So no self join can be eliminated in that case.
The performance was very similar with or without the GUC enabled. I tested the 
same thing without the patch, since the test for uniqueness has been slightly 
altered and incurs a new allocation, but it doesn't seem to change. 

One interesting side effect of this patch, is that removing any unneeded self 
join cuts down the planification time very significantly, as we lower the 
number 
of combinations to consider. 

As for the code: 

 - Comments on relation_has_unique_index_ext and relation_has_unique_index_for 
should be rewritten, as relation_has_unique_index_for is now just a special 
case of relation_has_unique_index_ext. By the way, the comment would probably 
be better read as: "but if extra_clauses isn't NULL".
 - The whole thing about "extra_clauses", ie, baserestrictinfos which were 
used to determine uniqueness, is not very clear. Most functions where the new 
argument has been added have not seen an update in their comments, and the 
name itself doesn't really convey the intented meaning: perhaps 
required_non_join_clauses ?

The way this works should be explained a bit more thoroughly, for example in 
remove_self_joins_one_group the purpose of uclauses should be explained. The 
fact that degenerate_case returns true when we don't have any additional base 
restrict info is also confusing, as well as the degenerate_case name.

I'll update if I think of more interesting things to add. 

-- 
Ronan Dunklau




Reply via email to