Hi,

We have a table created like this:

CREATE TABLE xyz AS SELECT generate_series(1,10000000,1) AS gs;

Now:

db=# explain analyze select * from xyz where gs&1=1;
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Seq Scan on xyz  (cost=0.00..260815.38 rows=68920 width=4) (actual time=0.044..2959.728 rows=5000000 loops=1)
   Filter: ((gs & 1) = 1)
   Rows Removed by Filter: 5000000
 Planning time: 0.133 ms
 Execution time: 3340.886 ms
(5 rows)

And after adding an additional clause to the WHERE:

db=# explain analyze select * from xyz where gs&1=1 and gs&2=2;
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Seq Scan on xyz  (cost=0.00..329735.50 rows=345 width=4) (actual time=0.045..3010.430 rows=2500000 loops=1)
   Filter: (((gs & 1) = 1) AND ((gs & 2) = 2))
   Rows Removed by Filter: 7500000
 Planning time: 0.106 ms
 Execution time: 3176.355 ms
(5 rows)

And one more clause:

newrr=# explain analyze select * from xyz where gs&1=1 and gs&2=2 and gs&4=4;
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
 Seq Scan on xyz  (cost=0.00..398655.62 rows=2 width=4) (actual time=0.052..3329.422 rows=1250000 loops=1)
   Filter: (((gs & 1) = 1) AND ((gs & 2) = 2) AND ((gs & 4) = 4))
   Rows Removed by Filter: 8750000
 Planning time: 0.119 ms
 Execution time: 3415.839 ms
(5 rows)

As we can see, the estimates differ significantly from the actual row counts: just three clauses reduce the estimated number of rows from 10000000 down to 2, while the query actually returns 1250000 rows. I noticed that each additional clause reduces the estimate by a factor of about 200, and that the define DEFAULT_NUM_DISTINCT is responsible for this behavior. I think this constant should be lower, or perhaps the estimation using DEFAULT_NUM_DISTINCT should be applied only once per table.

Artur Zajac
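To make the pattern concrete, here is a small sketch (in Python, not planner code) of what seems to be happening: with no statistics on the expression (gs & N), the planner falls back to a selectivity of 1/DEFAULT_NUM_DISTINCT = 1/200 for each equality clause and multiplies the factors together, assuming the clauses are independent. The starting estimate of 68920 is taken from the plans above; the loop reproduces the 345 and 2 estimates of the two- and three-clause queries.

```python
# Sketch of the compounding default selectivity, assuming the planner
# treats each (gs & N) = N clause as an equality with selectivity
# 1/DEFAULT_NUM_DISTINCT and multiplies them as if independent.
DEFAULT_NUM_DISTINCT = 200
sel = 1.0 / DEFAULT_NUM_DISTINCT  # fallback selectivity per clause

est = 68920  # planner's row estimate for the one-clause query above
for n_clauses in (2, 3):
    est = max(1, round(est * sel))
    print(f"{n_clauses} clauses -> estimated rows: {est}")
```

This prints 345 and then 2, matching the plans, whereas the true selectivity of each bit test is 1/2, so the real row count only halves with each added clause.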