2014-11-10 18:43 GMT+01:00 Eric Ramirez <eric.ramirez...@gmail.com>: > > Hi, > I have created a sample database with test data to help benchmark our > application. The database has ten million records, and is running on a > dedicated server(postgres 9.3) with 8GB of RAM. Our queries are pretty > slow with this amount of data and is my job to get them to run to at > acceptable speed. First thing that I notice was that the planner's row > estimates are off by a large number or records (millions) I have updated > the statistics target but didn't seem to make a difference. The relevant > output follows. > Am I looking in the wrong place, something else I should be trying? > Thanks in advance for your comments/suggestions, > Eric. > > > =# show work_mem; > work_mem > ---------- > 1GB > (1 row) > =# show effective_cache_size; > effective_cache_size > ---------------------- > 5GB > (1 row) > > =#ALTER TABLE TAR_MVW_TARGETING_RECORD ALTER COLUMN > household_member_first_name SET STATISTICS 5000; > =# vacuum analyse TAR_MVW_TARGETING_RECORD; > > =# \d tar_mvw_targeting_record; > Table "public.tar_mvw_targeting_record" > Column | Type | Modifiers > -----------------------------+-----------------------+----------- > household_member_id | bigint | > form_id | bigint | > status | character varying(64) | > gender | character varying(64) | > household_member_first_name | character varying(64) | > household_member_last_name | character varying(64) | > > Indexes: > "tar_mvw_targeting_record_form_id_household_member_id_idx" UNIQUE, > btree (form_id, household_member_id) > "tar_mvw_targeting_record_lower_idx" gist > (lower(household_member_first_name::text) extensions.gist_trgm_ops) > WHERE status::text <> 'ANULLED'::text > "tar_mvw_targeting_record_lower_idx1" gist > (lower(household_member_last_name::text) extensions.gist_trgm_ops) > WHERE status::text <> 'ANULLED'::text > > > =# explain (analyse on,buffers on)select T.form_id from > TAR_MVW_targeting_record AS T where T.status NOT IN ('ANULLED') AND > LOWER(T.household_member_last_name) LIKE LOWER('%tu%') AND > T.gender='FEMALE' group by T.form_id; > > QUERY PLAN > > > ------------------------------------------------------------------------------------------------------------------------------------------- > ------------------------------- > HashAggregate (cost=450994.35..452834.96 rows=184061 width=8) (actual > time=11932.959..12061.206 rows=442453 loops=1) > Buffers: shared hit=307404 read=109743 > -> Bitmap Heap Scan on tar_mvw_targeting_record t > (cost=110866.33..448495.37 rows=999592 width=8) (actual > time=3577.301..11629.132 row > s=500373 loops=1) > Recheck Cond: ((lower((household_member_last_name)::text) ~~ > '%tu%'::text) AND ((status)::text <> 'ANULLED'::text)) > Rows Removed by Index Recheck: 9000079 > Filter: ((gender)::text = 'FEMALE'::text) > Rows Removed by Filter: 499560 > Buffers: shared hit=307404 read=109743 > -> Bitmap Index Scan on tar_mvw_targeting_record_lower_idx1 > (cost=0.00..110616.43 rows=2000002 width=0) (actual time=3471.142..3 > 471.142 rows=10000012 loops=1) > Index Cond: (lower((household_member_last_name)::text) ~~ > '%tu%'::text) > Buffers: shared hit=36583 read=82935 > Total runtime: 12092.059 ms > (12 rows) > > Time: 12093.107 ms > > p.s. this plan was ran three times, first time took 74 seconds. > > > Hello Eric, did you try with gin index instead ? so you could avoid, if possible, the recheck condition (almost the gin index is not lossy ), further if you always use a predicate like "gender=" , you could think to partition the indexes based on that predicate (where status NOT IN ('ANULLED') and gender='FEMALE', in the other case it wil be where status NOT IN ('ANULLED') and gender='MALE' ) . Moreover you could avoid also the "lower" operator and try use directly the ilike , instead of "like".
CREATE INDEX tar_mvw_targeting_record_idx02 ON tar_mvw_targeting_record USING gin ( status gin_trgm_ops) where status NOT IN ('ANULLED') and gender='FEMALE' ; CREATE INDEX tar_mvw_targeting_record_idx03 ON tar_mvw_targeting_record USING gin ( status gin_trgm_ops) where status NOT IN ('ANULLED') and gender='MALE' ; explain (analyse on,buffers on) select T.form_id from TAR_MVW_targeting_record AS T where T.status NOT IN ('ANULLED') AND T.household_member_last_name ilike LOWER('%tu%') AND T.gender='FEMALE' group by T.form_id; I hope it works have a nice day -- Matteo Durighetto - - - - - - - - - - - - - - - - - - - - - - - Italian PostgreSQL User Group <http://www.itpug.org/index.it.html> Italian Community for Geographic Free/Open-Source Software <http://www.gfoss.it>