Re: [GENERAL] Subplan and index usage
Vyacheslav Kalinin [EMAIL PROTECTED] writes: [ poor estimation for ] select * from cities where ( ficity_id in ( select ficity_id from cities_name_words where fsword like 'novgorod%' union select ficity_id from cities_name_ru_words where fsword like 'novgorod%' ) or lower(fsname) like 'novgorod%' or lower(fsname_ru) like 'novgorod%' ) When you have an IN at the top level of WHERE, it's flattened into a kind of JOIN, and there's reasonably decent estimation of the selectivity. Unfortunately, when it's down inside an OR-clause like this, all those smarts go out the window and it's just treated as a generic subplan condition, with 0.5 estimated selectivity IIRC. Improving that would be nice but it's not high on anyone's to-do list. You might get better results if you combine three separate queries with UNION. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Subplan and index usage
Consider the following case which is almost exact snapshot of part of our scheme: Table cities Column| Type | Modifiers | Description --++---+- ficity_id| integer| not null | ficountry_id | integer| | firegion_id | integer| | fsname | character varying(100) | | fsname_ru| character varying(200) | | Indexes: pk_geocities PRIMARY KEY, btree (ficity_id) idx_cities_name btree (lower(fsname::text) varchar_pattern_ops) idx_cities_name_ru btree (lower(fsname_ru::text) varchar_pattern_ops) idx_geocities_country_id btree (ficountry_id) idx_geocities_region_id btree (firegion_id) Foreign-key constraints: fk_geocities_country_id FOREIGN KEY (ficountry_id) REFERENCES countries(ficountry_id) ON UPDATE CASCADE ON DELETE CASCADE fk_geocities_region_id FOREIGN KEY (firegion_id) REFERENCES regions(firegion_id) ON UPDATE CASCADE ON DELETE CASCADE Table cities_name_words Column | Type | Modifiers | Description ---++---+- ficity_id | integer| not null | fsword| character varying(200) | not null | Indexes: idx_cities_name_words_city_id btree (ficity_id) idx_cities_name_words_word btree (fsword varchar_pattern_ops) Foreign-key constraints: fk_cities_name_words_city_id FOREIGN KEY (ficity_id) REFERENCES cities(ficity_id) ON UPDATE CASCADE ON DELETE CASCADE Table cities_name_ru_words Column | Type | Modifiers | Description ---++---+- ficity_id | integer| not null | fsword| character varying(200) | not null | Indexes: idx_cities_name_ru_words_city_id btree (ficity_id) idx_cities_name_ru_words_word btree (fsword varchar_pattern_ops) Foreign-key constraints: fk_cities_name_ru_words_city_id FOREIGN KEY (ficity_id) REFERENCES cities(ficity_id) ON UPDATE CASCADE ON DELETE CASCADE This is the part of geo location database. The purpose of cities_name_words and cities_name_ru_words is to facilitate indexing on separate words in city name - they contain words of fsname or fsname_ru respectively of the corresponding record in cities if it has more than word. Cities has about 19 records, cities_name_words about 8 and cities_name_ru_words about 5000. Now the query for city by name looks like this: select * from cities where ( ficity_id in ( select ficity_id from cities_name_words where fsword like 'novgorod%' union select ficity_id from cities_name_ru_words where fsword like 'novgorod%' ) or lower(fsname) like 'novgorod%' or lower(fsname_ru) like 'novgorod%' ) QUERY PLAN Seq Scan on cities (cost=16.63..5949.26 rows=95014 width=60) Filter: ((hashed subplan) OR (lower((fsname)::text) ~~ 'novgorod%'::text) OR (lower((fsname_ru)::text) ~~ 'novgorod%'::text)) SubPlan - Unique (cost=16.61..16.62 rows=2 width=4) - Sort (cost=16.61..16.62 rows=2 width=4) Sort Key: cities_name_words.ficity_id - Append (cost=0.00..16.60 rows=2 width=4) - Index Scan using idx_cities_name_words_word on cities_name_words (cost=0.00..8.31 rows=1 width=4) Index Cond: (((fsword)::text ~=~ 'novgorod'::text) AND ((fsword)::text ~~ 'novgoroe'::text)) Filter: ((fsword)::text ~~ 'novgorod%'::text) - Index Scan using idx_cities_name_ru_words_word on cities_name_ru_words (cost=0.00..8.27 rows=1 width=4) Index Cond: (((fsword)::text ~=~ 'novgorod'::text) AND ((fsword)::text ~~ 'novgoroe'::text)) Filter: ((fsword)::text ~~ 'novgorod%'::text) Notice how it uses proper indexes in subplan and goes for sequence scan on the main table. If the where- conditions are applied separately it uses indexes as expected: select * from cities where ( lower(fsname) like 'novgorod%' or lower(fsname_ru) like 'novgorod%' ) QUERY PLAN Bitmap Heap Scan on cities (cost=8.57..12.59 rows=1 width=60) Recheck Cond: ((lower((fsname)::text) ~~ 'novgorod%'::text) OR (lower((fsname_ru)::text) ~~ 'novgorod%'::text)) Filter: ((lower((fsname)::text) ~~ 'novgorod%'::text) OR (lower((fsname_ru)::text) ~~ 'novgorod%'::text)) - BitmapOr (cost=8.57..8.57 rows=1 width=0) - Bitmap Index Scan on idx_cities_name (cost=0.00..4.29 rows=1 width=0) Index Cond: ((lower((fsname)::text) ~=~ 'novgorod'::text) AND (lower((fsname)::text) ~~ 'novgoroe'::text)) - Bitmap Index Scan on idx_cities_name_ru (cost=0.00..4.28 rows=1 width=0)