On Mon, Mar 3, 2014 at 9:17 PM, acanada <acan...@cnio.es> wrote: > Hello, > > Thankyou for your answer. > I have made more changes than a simple re-indexing recently. I have moved > the sorting field to the table in order to avoid the join clause. Now the > schema is very simple. The query only implies one table: > > x=> \d+ entity_compounddict2document; > Table "public.entity_compounddict2document" > Column | Type | Modifiers | Storage > | Description > > ------------------+--------------------------------+-----------+----------+------------- > id | integer | not null | plain > | > document_id | integer | | plain > | > name | character varying(255) | | extended > | > qualifier | character varying(255) | | extended > | > tagMethod | character varying(255) | | extended > | > created | timestamp(0) without time zone | | plain > | > updated | timestamp(0) without time zone | | plain > | > curation | integer | | plain > | > hepval | double precision | | plain > | > cardval | double precision | | plain > | > nephval | double precision | | plain > | > phosval | double precision | | plain > | > patternCount | double precision | | plain > | > ruleScore | double precision | | plain > | > hepTermNormScore | double precision | | plain > | > hepTermVarScore | double precision | | plain > | > Indexes: > "entity_compounddict2document_pkey" PRIMARY KEY, btree (id) > "entity_compound2document_cardval" btree (cardval) > "entity_compound2document_heptermnormscore" btree ("hepTermNormScore") > "entity_compound2document_heptermvarscore" btree ("hepTermVarScore") > "entity_compound2document_hepval" btree (hepval) > "entity_compound2document_name" btree (name) > "entity_compound2document_nephval" btree (nephval) > "entity_compound2document_patterncount" btree ("patternCount") > "entity_compound2document_phosval" btree (phosval) > "entity_compound2document_rulescore" btree ("ruleScore") > Has OIDs: no > > tablename | indexname > | num_rows | table_size | index_size > | unique | number_of_scans | tuples_read | tuples_fetched > entity_compounddict2document | entity_compound2document_cardval > | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | > 0 | 0 > entity_compounddict2document | > entity_compound2document_heptermnormscore | 5.42452e+07 | 6763 MB | > 1162 MB | Y | 0 | 0 | 0 > entity_compounddict2document | entity_compound2document_heptermvarscore > | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | > 0 | 0 > entity_compounddict2document | entity_compound2document_hepval > | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | > 0 | 0 > entity_compounddict2document | entity_compound2document_name > | 5.42452e+07 | 6763 MB | 1505 MB | Y | 24 | > 178680 | 0 > entity_compounddict2document | entity_compound2document_nephval > | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | > 0 | 0 > entity_compounddict2document | entity_compound2document_patterncount > | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | > 0 | 0 > entity_compounddict2document | entity_compound2document_phosval > | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | > 0 | 0 > entity_compounddict2document | entity_compound2document_rulescore > | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | > 0 | 0 > entity_compounddict2document | entity_compounddict2document_pkey > | 5.42452e+07 | 6763 MB | 1162 MB | Y | 0 | > 0 | 0 > > The table has aprox. 54,000,000 rows > There are no NULLs in hepval field and pg_settings haven't changed. I also > have done "analyze" to this table. > > I have simplified the query and added the last advise that you told me: > > Query: > > explain analyze select * from (select * from entity_compounddict2document > where name='ranitidine') as a order by a.hepval; > > QUERY PLAN > > > ------------------------------------------------------------------------------------------------------------------------------------------------------ > Sort (cost=11060.50..11067.55 rows=2822 width=133) (actual > time=32715.097..32716.488 rows=13512 loops=1) > Sort Key: entity_compounddict2document.hepval > Sort Method: quicksort Memory: 2301kB > -> Bitmap Heap Scan on entity_compounddict2document > (cost=73.82..10898.76 rows=2822 width=133) (actual time=6.034..32695.483 > rows=13512 loops=1) > Recheck Cond: ((name)::text = 'ranitidine'::text) > -> Bitmap Index Scan on entity_compound2document_name > (cost=0.00..73.12 rows=2822 width=0) (actual time=3.221..3.221 rows=13512 > loops=1) > Index Cond: ((name)::text = 'ranitidine'::text) > Total runtime: 32717.548 ms > > Another query: > explain analyze select * from (select * from > entity_compounddict2document where name='progesterone' ) as a order by > a.hepval; > > QUERY PLAN > > ------------------------------------------------------------------------------------------------------------------------------------------------------------ > Sort (cost=367879.25..368209.24 rows=131997 width=133) (actual > time=9262.887..9287.046 rows=138165 loops=1) > Sort Key: entity_compounddict2document.hepval > Sort Method: quicksort Memory: 25622kB > -> Bitmap Heap Scan on entity_compounddict2document > (cost=2906.93..356652.81 rows=131997 width=133) (actual > time=76.316..9038.485 rows=138165 loops=1) > Recheck Cond: ((name)::text = 'progesterone'::text) > -> Bitmap Index Scan on entity_compound2document_name > (cost=0.00..2873.93 rows=131997 width=0) (actual time=40.913..40.913 > rows=138165 loops=1) > Index Cond: ((name)::text = 'progesterone'::text) > Total runtime: 9296.815 ms > > > It has improved (I supose because of the lack of the join table) but still > taking a lot of time... Anything I can do?? > > Any help would be very appreciated. Thank you very much. >
Good to know performance has increased. "entity_compounddict2document" table goes through high INSERTS ? Can you help us know if the "helpval" column and "name" column have high duplicate values ? "n_distinct" value from pg_stats table would have that info. Below could be a possible workaround - As mentioned earlier in this email, a composite Index on name and hepval column might help. If the table does not go through lot of INSERTS, then consider performing a CLUSTER on the table using the same INDEX. Other recommendations - Please drop off all the Non-primary key Indexes which have 0 scans / hits. This would harm the DB and the DB server whilst maintenance and DML operations. Regards, Venkata Balaji N Fujitsu Australia