2014-11-10 18:43 GMT+01:00 Eric Ramirez <eric.ramirez...@gmail.com>:

>
> Hi,
> I have created a sample database with test data to help benchmark our
> application. The database has ten million records, and is running on a
> dedicated server(postgres 9.3) with 8GB of RAM.  Our queries are pretty
> slow with this amount of data and is my job to get them to run to at
> acceptable speed. First thing that I notice was that the planner's row
> estimates are off by a large number or records (millions) I have updated
> the statistics target but didn't seem to make a difference. The relevant
> output follows.
> Am I looking in the wrong place, something else I should be trying?
> Thanks in advance for your comments/suggestions,
> Eric.
>
>
> =# show work_mem;
>  work_mem
> ----------
>  1GB
> (1 row)
> =# show effective_cache_size;
>  effective_cache_size
> ----------------------
>  5GB
> (1 row)
>
> =#ALTER TABLE TAR_MVW_TARGETING_RECORD ALTER COLUMN
> household_member_first_name SET STATISTICS 5000;
> =# vacuum analyse TAR_MVW_TARGETING_RECORD;
>
> =# \d tar_mvw_targeting_record;
>              Table "public.tar_mvw_targeting_record"
>            Column            |         Type          | Modifiers
> -----------------------------+-----------------------+-----------
>  household_member_id         | bigint                |
>  form_id                     | bigint                |
>  status                      | character varying(64) |
>  gender                      | character varying(64) |
>  household_member_first_name | character varying(64) |
>  household_member_last_name  | character varying(64) |
>
> Indexes:
>    "tar_mvw_targeting_record_form_id_household_member_id_idx" UNIQUE,
> btree (form_id, household_member_id)
>  "tar_mvw_targeting_record_lower_idx" gist
> (lower(household_member_first_name::text) extensions.gist_trgm_ops)
>  WHERE status::text <> 'ANULLED'::text
>     "tar_mvw_targeting_record_lower_idx1" gist
> (lower(household_member_last_name::text) extensions.gist_trgm_ops)
>  WHERE status::text <> 'ANULLED'::text
>
>
> =# explain (analyse on,buffers on)select T.form_id from
> TAR_MVW_targeting_record AS T where T.status NOT IN ('ANULLED')  AND
> LOWER(T.household_member_last_name) LIKE LOWER('%tu%') AND
> T.gender='FEMALE' group by T.form_id;
>
> QUERY PLAN
>
>
> -------------------------------------------------------------------------------------------------------------------------------------------
> -------------------------------
>  HashAggregate  (cost=450994.35..452834.96 rows=184061 width=8) (actual
> time=11932.959..12061.206 rows=442453 loops=1)
>    Buffers: shared hit=307404 read=109743
>    ->  Bitmap Heap Scan on tar_mvw_targeting_record t
> (cost=110866.33..448495.37 rows=999592 width=8) (actual
> time=3577.301..11629.132 row
> s=500373 loops=1)
>          Recheck Cond: ((lower((household_member_last_name)::text) ~~
> '%tu%'::text) AND ((status)::text <> 'ANULLED'::text))
>          Rows Removed by Index Recheck: 9000079
>          Filter: ((gender)::text = 'FEMALE'::text)
>          Rows Removed by Filter: 499560
>          Buffers: shared hit=307404 read=109743
>          ->  Bitmap Index Scan on tar_mvw_targeting_record_lower_idx1
> (cost=0.00..110616.43 rows=2000002 width=0) (actual time=3471.142..3
> 471.142 rows=10000012 loops=1)
>                Index Cond: (lower((household_member_last_name)::text) ~~
> '%tu%'::text)
>                Buffers: shared hit=36583 read=82935
>  Total runtime: 12092.059 ms
> (12 rows)
>
> Time: 12093.107 ms
>
> p.s. this plan was ran three times, first time took 74 seconds.
>
>
>
Hello Eric,
                        did you try with gin index instead ? so you could
avoid, if possible, the recheck condition (almost the gin index is not
lossy ), further if you always use a predicate like "gender=" , you could
think to partition the indexes based on that predicate (where status NOT IN
('ANULLED')  and gender='FEMALE', in the other case it wil be where status
NOT IN ('ANULLED')  and gender='MALE' ) . Moreover you could avoid also the
"lower" operator and try use directly the ilike , instead of "like".

CREATE INDEX tar_mvw_targeting_record_idx02 ON
tar_mvw_targeting_record USING gin (  status gin_trgm_ops) where
status NOT IN ('ANULLED')  and gender='FEMALE' ;
CREATE INDEX tar_mvw_targeting_record_idx03 ON
tar_mvw_targeting_record USING gin (  status gin_trgm_ops) where
status NOT IN ('ANULLED')  and gender='MALE' ;


explain (analyse on,buffers on)  select T.form_id from
TAR_MVW_targeting_record AS T where T.status NOT IN ('ANULLED')  AND
T.household_member_last_name ilike LOWER('%tu%') AND T.gender='FEMALE'
group by T.form_id;


 I hope it works

have a nice day


-- 
Matteo Durighetto

- - - - - - - - - - - - - - - - - - - - - - -

Italian PostgreSQL User Group <http://www.itpug.org/index.it.html>
Italian Community for Geographic Free/Open-Source Software
<http://www.gfoss.it>

Reply via email to