On Sun, Jan 7, 2018 at 9:47 AM, Greg Landrum <greg.land...@gmail.com> wrote:

> Hi Rajarshi,
>
> On Sat, Jan 6, 2018 at 6:32 AM, Rajarshi Guha <rajarshi.g...@gmail.com>
> wrote:
>
>> Hi, I'm using RDkit 2017.09 with Postgres 9.5. I loaded in ChEMBL23 using
>> the instructions given at http://www.rdkit.org/docs/Cartridge.html and
>> set up the indexes etc.
>>
>> However, when I run a substructure query I, EXPLAIN ANALYZE output looks
>> like
>>
>> chembl_23=> explain analyze select count(*) from rdk.mols where m@
>> >'C1CC2CCC3C(CCC4CCCCC34)C2C1';
>>
>>                                                             QUERY PLAN
>>
>>
>> ------------------------------------------------------------
>> ----------------------------------------------------------------------
>>
>>  Aggregate  (cost=6564.86..6564.87 rows=1 width=0) (actual
>> time=10546.899..10546.899 rows=1 loops=1)
>>
>>    ->  Bitmap Heap Scan on mols  (cost=369.80..6560.54 rows=1727
>> width=0) (actual time=465.881..10539.663 rows=8170 loops=1)
>>
>>          Recheck Cond: (m @> 'C1CCC2C(C1)CCC1C3CCCC3CCC21'::mol)
>>
>>          Rows Removed by Index Recheck: 16131
>>
>>          Heap Blocks: exact=9458
>>
>>          ->  Bitmap Index Scan on molidx  (cost=0.00..369.37 rows=1727
>> width=0) (actual time=460.029..460.029 rows=24301 loops=1)
>>
>>                Index Cond: (m @> 'C1CCC2C(C1)CCC1C3CCCC3CCC21'::mol)
>>
>>  Planning time: 1.258 ms
>>
>>  Execution time: 10548.293 ms
>>
>> While it's using the GiST index, I note the big difference in the
>> expected (1727) and actual (24301) row counts for the Bitmap Index Scan
>> node.
>>
>> This seems to suggest that the index statistics are not accurate. Has
>> anybody noticed this? Would fiddling with planner settings (such
>> as default_statistics_target) be useful for this?
>>
>
> Almost certainly
>

Surprisingly, changing this setting and reindexing the column doesn't
change anything, so it does look like more needs to be done to integrate
with the planner (or at least provide statistics somehow)

Thanks for the pointers
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to