[HACKERS] Select queries which violate table constraints

Joni Martikainen Mon, 12 May 2014 07:29:43 -0700

Hi,

I investigated some SELECT query performance issues and noticed that PostgreSQL misses some obvious cases while processing a SELECT query. I mean the case where the WHERE clause contains a condition that contradicts the table structure. (Excuse my language; look at the code.)


Example:
Let the table be:

CREATE TABLE test
(
  id numeric(3,0) NOT NULL,
  somecolumn numeric(5,0) NOT NULL,
  CONSTRAINT id_pk PRIMARY KEY (id)
);

A simple table whose "somecolumn" column has a NOT NULL constraint.

Let's run the following query against the table:

SELECT somecolumn FROM test WHERE somecolumn IS NULL;

The result is an empty result set, which is obvious because any null value would violate the table constraint. The thing here is that PostgreSQL does a Seq Scan on this table in order to find out whether there are any null values.


Explain:
"Seq Scan on test  (cost=0.00..1.06 rows=1 width=5)"
"  Filter: (somecolumn IS NULL)"
"Planning time: 0.778 ms"

The Seq Scan can be avoided by creating an index on "somecolumn" that indexes only the null values. That index would be empty and very fast, but also rather pointless, since the table constraint here is simple.

No one would write such a query in real life, but some programmatically generated queries do this kind of thing. The only way I found to work around this problem was to create those empty indexes, but I think the query optimizer could be smarter here.
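For reference, the empty-index workaround could look roughly like the sketch below. It assumes a partial index is what is meant by "indexing all the null values"; the index name is made up for illustration. On a very small table the planner may still prefer the Seq Scan, so the EXPLAIN output should be checked rather than assumed.

```sql
-- Partial index covering only rows where somecolumn IS NULL.
-- Because the column is declared NOT NULL, the index never holds any entries.
-- The index name is illustrative only.
CREATE INDEX test_somecolumn_null_idx
  ON test (somecolumn)
  WHERE somecolumn IS NULL;

-- Refresh statistics, then check which plan is actually chosen.
ANALYZE test;
EXPLAIN SELECT somecolumn FROM test WHERE somecolumn IS NULL;
```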

I took a look at the optimizer code and didn't find any code that avoids this kind of situation. (I expect it would be the optimizer's task to detect this kind of thing.)

I was thinking of a feature where the optimizer could add a hint for the executor if some query plan path leads to the empty result set case. If the executor sees this hint, it could avoid doing the Seq Scan, and actually even index scans. This kind of comparison of query constraints against table constraints should in any case be cheaper to execute than a Seq Scan.

The question is: is there any reason why such an optimization phase could not be implemented? Another question is how the query engine handles the partitioned table case. Am I right that table partitions are resolved through table constraints, and that indexes are used to decide which child table to look in? If so, could this kind of new optimization phase also benefit partitioned tables?
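To make the partitioned case concrete, here is a minimal sketch of the kind of setup I have in mind. It assumes inheritance-based partitioning with CHECK constraints on the child tables; all table names and the boundary value are hypothetical. As far as I understand, the constraint_exclusion setting (default "partition") is what lets the planner compare the WHERE clause against the child tables' CHECK constraints.

```sql
-- Hypothetical parent table; names and the boundary value are illustrative only.
CREATE TABLE measurements (
  id numeric(3,0) NOT NULL,
  somecolumn numeric(5,0) NOT NULL
);

-- Each child carries a CHECK constraint describing which rows it may hold.
CREATE TABLE measurements_low (
  CHECK (somecolumn < 50000)
) INHERITS (measurements);

CREATE TABLE measurements_high (
  CHECK (somecolumn >= 50000)
) INHERITS (measurements);

-- "partition" is the default value; it is set explicitly here for clarity.
SET constraint_exclusion = partition;

-- The condition contradicts the CHECK constraint of measurements_low,
-- so I would expect the plan not to scan that child table.
EXPLAIN SELECT somecolumn FROM measurements WHERE somecolumn = 70000;
```

If this is how child tables are excluded, the same comparison against a NOT NULL constraint on a plain table looks like a natural extension.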



Kind regards
Joni Martikainen


