Hi, hackers

I notice that postgres use constraints to optimize/rewrite queries in
limited cases as
https://www.postgresql.org/docs/current/runtime-config-query.html#GUC-CONSTRAINT-EXCLUSION
.


I propose two new types of query rewrites using constraints here

1) Remove DISTINCT

A simple example is SELECT DISTINCT(name) FROM R. If there is a unique
constraint on the name column. The DISTINCT keyword can be removed safely.
Query plans without the DISTINCT keyword might be much cheaper since
DISTINCT is expensive.


2) Add LIMIT 1

An example of this optimization will be SELECT name from R WHERE name =
‘foo’. If there is a unique constraint on the name column, the selection
result has at most one record. Therefore, we can add LIMIT 1 safely. If the
original query plan performs a sequential scan on the R, adding LIMIT 1
might speed up the query because of the early return.


We designed an algorithm to decide if 1), 2) can be performed safely.
Rewriting queries manually and experimenting on a table with 10K records
shows 2X ~ 3X improvement for both rewrites. We have some other rewrite
rules, but the two are most obvious ones. With this feature, the optimizer
can consider the query plans both before and after the rewrite and choose
the one with minimum cost.


Will that feature be useful? How hard to implement the feature in the
current system? Any thoughts or comments are highly appreciated!


Best,

Lily

Reply via email to