On Mon, Mar 3, 2014 at 9:17 PM, acanada <acan...@cnio.es> wrote:

> Hello,
>
> Thankyou for your answer.
> I have made more changes than a simple re-indexing recently. I have moved
> the sorting field to the table in order to avoid the join clause. Now the
> schema is very simple. The query only implies one table:
>
> x=> \d+ entity_compounddict2document;
>                       Table "public.entity_compounddict2document"
>       Column      |              Type              | Modifiers | Storage
>  | Description
>
> ------------------+--------------------------------+-----------+----------+-------------
>  id               | integer                        | not null  | plain
>  |
>  document_id      | integer                        |           | plain
>  |
>  name             | character varying(255)         |           | extended
> |
>  qualifier        | character varying(255)         |           | extended
> |
>  tagMethod        | character varying(255)         |           | extended
> |
>  created          | timestamp(0) without time zone |           | plain
>  |
>  updated          | timestamp(0) without time zone |           | plain
>  |
>  curation         | integer                        |           | plain
>  |
>  hepval           | double precision               |           | plain
>  |
>  cardval          | double precision               |           | plain
>  |
>  nephval          | double precision               |           | plain
>  |
>  phosval          | double precision               |           | plain
>  |
>  patternCount     | double precision               |           | plain
>  |
>  ruleScore        | double precision               |           | plain
>  |
>  hepTermNormScore | double precision               |           | plain
>  |
>  hepTermVarScore  | double precision               |           | plain
>  |
> Indexes:
>     "entity_compounddict2document_pkey" PRIMARY KEY, btree (id)
>     "entity_compound2document_cardval" btree (cardval)
>     "entity_compound2document_heptermnormscore" btree ("hepTermNormScore")
>     "entity_compound2document_heptermvarscore" btree ("hepTermVarScore")
>     "entity_compound2document_hepval" btree (hepval)
>     "entity_compound2document_name" btree (name)
>     "entity_compound2document_nephval" btree (nephval)
>     "entity_compound2document_patterncount" btree ("patternCount")
>     "entity_compound2document_phosval" btree (phosval)
>     "entity_compound2document_rulescore" btree ("ruleScore")
> Has OIDs: no
>
>            tablename            |                   indexname
>                                  |  num_rows    | table_size  | index_size
> | unique | number_of_scans | tuples_read | tuples_fetched
>  entity_compounddict2document   | entity_compound2document_cardval
>       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |
>           0 |              0
>  entity_compounddict2document   |
> entity_compound2document_heptermnormscore      | 5.42452e+07 | 6763 MB    |
> 1162 MB    | Y      |               0 |           0 |              0
>  entity_compounddict2document   | entity_compound2document_heptermvarscore
>       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |
>           0 |              0
>  entity_compounddict2document   | entity_compound2document_hepval
>        | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |
>           0 |              0
>  entity_compounddict2document   | entity_compound2document_name
>        | 5.42452e+07 | 6763 MB    | 1505 MB    | Y      |              24 |
>      178680 |              0
>  entity_compounddict2document   | entity_compound2document_nephval
>       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |
>           0 |              0
>  entity_compounddict2document   | entity_compound2document_patterncount
>        | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |
>           0 |              0
>  entity_compounddict2document   | entity_compound2document_phosval
>       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |
>           0 |              0
>  entity_compounddict2document   | entity_compound2document_rulescore
>       | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |
>           0 |              0
>  entity_compounddict2document   | entity_compounddict2document_pkey
>        | 5.42452e+07 | 6763 MB    | 1162 MB    | Y      |               0 |
>           0 |              0
>
> The table has aprox. 54,000,000 rows
> There are no NULLs in hepval field and pg_settings haven't changed. I also
> have done "analyze" to this table.
>
> I have simplified the query and added the last advise that you told me:
>
> Query:
>
>  explain analyze select * from (select * from entity_compounddict2document
>  where name='ranitidine') as a order by a.hepval;
>
> QUERY PLAN
>
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=11060.50..11067.55 rows=2822 width=133) (actual
> time=32715.097..32716.488 rows=13512 loops=1)
>    Sort Key: entity_compounddict2document.hepval
>    Sort Method:  quicksort  Memory: 2301kB
>    ->  Bitmap Heap Scan on entity_compounddict2document
>  (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483
> rows=13512 loops=1)
>          Recheck Cond: ((name)::text = 'ranitidine'::text)
>          ->  Bitmap Index Scan on entity_compound2document_name
>  (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512
> loops=1)
>                Index Cond: ((name)::text = 'ranitidine'::text)
>  Total runtime: 32717.548 ms
>
> Another query:
>  explain analyze select * from (select * from
> entity_compounddict2document  where name='progesterone' ) as a  order by
> a.hepval;
>
> QUERY PLAN
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=367879.25..368209.24 rows=131997 width=133) (actual
> time=9262.887..9287.046 rows=138165 loops=1)
>    Sort Key: entity_compounddict2document.hepval
>    Sort Method:  quicksort  Memory: 25622kB
>    ->  Bitmap Heap Scan on entity_compounddict2document
>  (cost=2906.93..356652.81 rows=131997 width=133) (actual
> time=76.316..9038.485 rows=138165 loops=1)
>          Recheck Cond: ((name)::text = 'progesterone'::text)
>          ->  Bitmap Index Scan on entity_compound2document_name
>  (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913
> rows=138165 loops=1)
>                Index Cond: ((name)::text = 'progesterone'::text)
>  Total runtime: 9296.815 ms
>
>
> It has improved (I supose because of the lack of the join table) but still
> taking a lot of time... Anything I can do??
>
> Any help would be very appreciated. Thank you very much.
>


Good to know performance has increased.

"entity_compounddict2document" table goes through high INSERTS ?

Can you help us know if the "helpval" column and "name" column have high
duplicate values ? "n_distinct" value from pg_stats table would have that
info.

Below could be a possible workaround -

As mentioned earlier in this email, a composite Index on name and hepval
column might help. If the table does not go through lot of INSERTS, then
consider performing a CLUSTER on the table using the same INDEX.

Other recommendations -

Please drop off all the Non-primary key Indexes which have 0 scans / hits.
This would harm the DB and the DB server whilst maintenance and DML
operations.

Regards,
Venkata Balaji N

Fujitsu Australia

Reply via email to