Hi all,

I've been trying to apply pg_tgrm for the search-function of my
application. The database fits a few times in the available RAM, and is
mostly read-only.
Plans, schema and configs in attachment. Postgresql version 9.1.4 on Debian.

When just searching in one table, it behaves perfectly here. When I put
constraints on multiple connected tables (performance and performer), it
takes some bad decisions. Somehow the planner thinks that an index scan on
a trigram index (on a string) is as fast as an index scan on a btree of an
int. Because of that, it will combine both index scans into an "AND" bitmap
index scan. Since this is done in a nested loop, the performance gets very
bad. The trigram index scan should not be repeated as it is relatively slow
and always the same query.

When I disable bitmap scans, it will search on both tables and then hash
everything together. This avoids launching the same index scan over and
over again. This is much faster.

Since my database is mostly in memory, I guess I could safely disable
bitmap scan (or at least for some query), since I understand that this kind
of scan is often a way to have a better IO performance. There's little IO
in my setup.
However, I'd rather get some help in fixing it right!

Thanks,

Mathieu
shs-dev=# explain analyze select e.id, e.title from performance e join 
performer p on e.performer_id = p.id WHERE e.title_ ~~ '%beatles%' AND p.name_ 
~~ '%pepper%';
                                                                     QUERY PLAN 
                                                                    
----------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=13.43..88.78 rows=1 width=22) (actual time=352.961..352.961 
rows=0 loops=1)
   ->  Bitmap Heap Scan on performer p  (cost=4.32..23.09 rows=5 width=4) 
(actual time=3.276..3.419 rows=25 loops=1)
         Recheck Cond: (name_ ~~ '%pepper%'::text)
         ->  Bitmap Index Scan on performer_name__trgm_idx  (cost=0.00..4.31 
rows=5 width=0) (actual time=3.268..3.268 rows=25 loops=1)
               Index Cond: (name_ ~~ '%pepper%'::text)
   ->  Bitmap Heap Scan on performance e  (cost=9.11..13.12 rows=1 width=26) 
(actual time=13.978..13.978 rows=0 loops=25)
         Recheck Cond: ((performer_id = p.id) AND (title_ ~~ '%beatles%'::text))
         ->  BitmapAnd  (cost=9.11..9.11 rows=1 width=0) (actual 
time=13.975..13.975 rows=0 loops=25)
               ->  Bitmap Index Scan on performance_performer_idx  
(cost=0.00..4.35 rows=10 width=0) (actual time=0.008..0.008 rows=5 loops=25)
                     Index Cond: (performer_id = p.id)
               ->  Bitmap Index Scan on performance_title_tgrm_idx  
(cost=0.00..4.51 rows=20 width=0) (actual time=14.545..14.545 rows=22 loops=24)
                     Index Cond: (title_ ~~ '%beatles%'::text)
 Total runtime: 352.996 ms
(13 rows)

shs-dev=# set enable_bitmapscan = off;
SET
shs-dev=# explain analyze select e.id, e.title from performance e join 
performer p on e.performer_id = p.id WHERE e.title_ ~~ '%beatles%' AND p.name_ 
~~ '%pepper%';
                                                                     QUERY PLAN 
                                                                     
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=24.43..109.22 rows=1 width=22) (actual time=18.563..18.563 
rows=0 loops=1)
   Hash Cond: (e.performer_id = p.id)
   ->  Index Scan using performance_title_tgrm_idx on performance e  
(cost=0.00..84.71 rows=20 width=26) (actual time=1.237..15.098 rows=22 loops=1)
         Index Cond: (title_ ~~ '%beatles%'::text)
   ->  Hash  (cost=24.36..24.36 rows=5 width=4) (actual time=3.455..3.455 
rows=25 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 1kB
         ->  Index Scan using performer_name__trgm_idx on performer p  
(cost=0.00..24.36 rows=5 width=4) (actual time=0.159..3.447 rows=25 loops=1)
               Index Cond: (name_ ~~ '%pepper%'::text)
 Total runtime: 18.590 ms
(9 rows)




shs-dev=# \d performance
                                       Table "public.performance"
      Column      |           Type           |                        Modifiers 
                        
------------------+--------------------------+----------------------------------------------------------
 created_by       | integer                  | not null
 creation_date    | timestamp with time zone | not null
 comments         | text                     | 
 owned_by         | integer                  | not null
 id               | integer                  | not null default 
nextval('performance_id_seq'::regclass)
 object_type      | text                     | not null default 
'performance'::text
 active           | boolean                  | not null default true
 editor_note      | text                     | 
 title            | text                     | 
 title_           | text                     | 
 performer_id     | integer                  | 
 first_release_id | integer                  | 
 vperf_id         | integer                  | 
 perf_date        | partial_date             | 
 bonustrack       | boolean                  | not null default false
 type_id          | integer                  | not null
 instrumental     | boolean                  | not null default false
 init_rev_level   | smallint                 | not null default 1
 curr_rev_level   | smallint                 | not null default 1
 revision_date    | timestamp with time zone | 
 revised_by       | integer                  | 
Indexes:
    "performance_pkey" PRIMARY KEY, btree (id)
    "performance_create_idx" btree (creation_date)
    "performance_medium_idx" btree (first_release_id)
    "performance_own_idx" btree (owned_by)
    "performance_performer_idx" btree (performer_id)
    "performance_title_tgrm_idx" gist (title_ extensions.gist_trgm_ops)
Check constraints:
    "active_check" CHECK (active)
    "bonus_medium" CHECK (NOT (bonustrack AND first_release_id IS NULL))
    "bonus_revision" CHECK (NOT (bonustrack AND curr_rev_level >= 1))
    "performance_object_type_check" CHECK (object_type = 'performance'::text)
Foreign-key constraints:
    "performance_first_medium_id_fkey" FOREIGN KEY (first_release_id) 
REFERENCES release(id)
    "performance_performer_id_fkey" FOREIGN KEY (performer_id) REFERENCES 
performer(id)
    "performance_revised_by_fkey" FOREIGN KEY (revised_by) REFERENCES dbuser(id)
    "performance_type_id_fkey" FOREIGN KEY (type_id) REFERENCES 
performance_type(id)
    "performance_vperf_id_fkey" FOREIGN KEY (vperf_id) REFERENCES vperf(id)
Referenced by:
    TABLE "didyouknow" CONSTRAINT "didyouknow_performance_id_fkey" FOREIGN KEY 
(performance_id) REFERENCES performance(id) ON DELETE CASCADE
    TABLE "part" CONSTRAINT "part_performance_id_fkey" FOREIGN KEY 
(performance_id) REFERENCES performance(id) ON DELETE CASCADE
    TABLE "perf_rating" CONSTRAINT "perf_rating_song_fkey" FOREIGN KEY 
(performance_id) REFERENCES performance(id) ON DELETE CASCADE
    TABLE "performance_itunes" CONSTRAINT 
"performance_itunes_performance_id_fkey" FOREIGN KEY (performance_id) 
REFERENCES performance(id) ON DELETE CASCADE
    TABLE "performance_youtube" CONSTRAINT 
"performance_youtube_performance_id_fk



shs-dev=# \d performer
                                     Table "public.performer"
    Column     |           Type           |                       Modifiers     
                   
---------------+--------------------------+--------------------------------------------------------
 created_by    | integer                  | not null
 creation_date | timestamp with time zone | 
 comments      | text                     | 
 owned_by      | integer                  | not null
 id            | integer                  | not null default 
nextval('performer_id_seq'::regclass)
 object_type   | text                     | not null default 'performer'::text
 active        | boolean                  | not null default true
 editor_note   | text                     | 
 name          | text                     | not null
 name_         | text                     | not null
Indexes:
    "performer_pkey" PRIMARY KEY, btree (id)
    "performer_name_idx" UNIQUE, btree (name)
    "performer_name__idx" btree (name_ text_pattern_ops)
    "performer_name__trgm_idx" gist (name_ extensions.gist_trgm_ops)
    "performer_own_idx" btree (owned_by)
Check constraints:
    "active_check" CHECK (active)
    "performer_object_type_check" CHECK (object_type = 'performer'::text)
Referenced by:
    TABLE "release" CONSTRAINT "medium_performer_id_fkey" FOREIGN KEY 
(performer_id) REFERENCES performer(id)
    TABLE "membership" CONSTRAINT "membership_performer_id_fkey" FOREIGN KEY 
(performer_id) REFERENCES performer(id) ON DELETE CASCADE
    TABLE "performance" CONSTRAINT "performance_performer_id_fkey" FOREIGN KEY 
(performer_id) REFERENCES performer(id)
Triggers:
    performer_data_trigger BEFORE INSERT OR DELETE OR UPDATE ON performer FOR 
EACH ROW EXECUTE PROCEDURE performer_data_trigger()
Inherits: primary_object
ey" FOREIGN KEY (performance_id) REFERENCES performance(id) ON UPDATE CASCADE 
ON DELETE CASCADE
    TABLE "sample" CONSTRAINT "sample_sampled_id_fkey" FOREIGN KEY (sampled_id) 
REFERENCES performance(id)
    TABLE "sample" CONSTRAINT "sample_sampler_id_fkey" FOREIGN KEY (sampler_id) 
REFERENCES performance(id)
    TABLE "track" CONSTRAINT "track_performance_id_fkey" FOREIGN KEY 
(performance_id) REFERENCES performance(id) ON DELETE CASCADE
    TABLE "work" CONSTRAINT "work_first_performance_id_fkey" FOREIGN KEY 
(first_performance_id) REFERENCES performance(id)
    TABLE "work" CONSTRAINT "work_first_recording_id_fkey" FOREIGN KEY 
(first_recording_id) REFERENCES performance(id)
    TABLE "work" CONSTRAINT "work_first_release_id_fkey" FOREIGN KEY 
(first_release_id) REFERENCES performance(id)
Triggers:
    performance_data_trigger BEFORE INSERT OR DELETE OR UPDATE ON performance 
FOR EACH ROW EXECUTE PROCEDURE performance_data_trigger()
Inherits: primary_object





Attachment: postgresql.conf
Description: Binary data

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to