hi,

I don't understand why the query planner is choosing a BitmapAnd when an
Index Scan followed by a filter is obviously better.

(Note that "new_york_houses" is a view of table "houses" with one
condition on city - and there is an index idx_houses_city. That is the
Index Scan that I think it should use exclusively.)

Here's a fast query that uses the Index Scan followed by a filter:

> => explain analyze SELECT COUNT(DISTINCT id) FROM new_york_houses WHERE 
> roof_area >= 0 AND roof_area < 278.7091;
>                                                                               
>  QUERY PLAN
> ---
>  Aggregate  (cost=167298.10..167298.11 rows=1 width=16) (actual 
> time=141.137..141.137 rows=1 loops=1)
>    ->  Index Scan using idx_houses_city on households  (cost=0.57..167178.87 
> rows=47694 width=16) (actual time=0.045..105.953 rows=53971 loops=1)
>          Index Cond: (city = 'New York'::text)
>          Filter: ((roof_area >= 0) AND ((roof_area)::numeric < 278.7091))
>          Rows Removed by Filter: 101719
>  Planning time: 0.688 ms
>  Execution time: 141.250 ms
> (7 rows)

When I add another condition, "phoneable", however, it chooses an
obviously wrong plan:

> => explain analyze SELECT COUNT(DISTINCT id) FROM new_york_houses WHERE 
> roof_area >= 0 AND roof_area < 278.7091 AND phoneable = true;
>                                                                               
>        QUERY PLAN
> ---
>  Aggregate  (cost=128163.05..128163.06 rows=1 width=16) (actual 
> time=4564.677..4564.677 rows=1 loops=1)
>    ->  Bitmap Heap Scan on households  (cost=105894.80..128147.78 rows=6106 
> width=16) (actual time=4456.690..4561.416 rows=5183 loops=1)
>          Recheck Cond: (city = 'New York'::text)
>          Filter: (phoneable AND (roof_area >= 0) AND ((roof_area)::numeric < 
> 278.7091))
>          Rows Removed by Filter: 40103
>          Heap Blocks: exact=14563
>          ->  BitmapAnd  (cost=105894.80..105894.80 rows=21002 width=0) 
> (actual time=4453.510..4453.510 rows=0 loops=1)
>                ->  Bitmap Index Scan on idx_houses_city  (cost=0.00..1666.90 
> rows=164044 width=0) (actual time=16.505..16.505 rows=155690 loops=1)
>                      Index Cond: (city = 'New York'::text)
>                ->  Bitmap Index Scan on idx_houses_phoneable  
> (cost=0.00..104224.60 rows=10271471 width=0) (actual time=4384.461..4384.461 
> rows=10647041 loops=1)
>                      Index Cond: (phoneable = true)
>  Planning time: 0.709 ms
>  Execution time: 4565.067 ms
> (13 rows)

On Postgres 9.4.4 with 244gb memory and SSDs

maintenance_work_mem 1000000
work_mem 500000
random_page_cost 1
seq_page_cost 2

The "houses" table has been analyzed recently and has statistics set to
the max.

Thanks,
Seamus


-- 
Seamus Abshere, SCEA
https://github.com/seamusabshere
https://www.linkedin.com/in/seamusabshere


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to