Re: [Rdkit-discuss] RDKit Postgres Cartridge Parallel Queries?
Hi Brian,

I just did a bit of looking here and either something has changed since my first experiments with 9.6 or I was remembering incorrectly. The functions exposed in the cartridge need to be marked as being "parallel safe" in order to be usable in a parallel query. At the moment none of them are.

This would clearly be useful, so I'm going to start taking a look at adding the relevant flags.

-greg

On Fri, Jun 1, 2018 at 5:05 PM Brian Cole wrote:
> Doesn't appear like ::mol parallelized either.
> Maybe some other flag I need to specify. Only 2 cores in this system at
> the moment, maybe it only parallelizes when there's more than 2 cores?

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
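For readers following along, here is a minimal sketch of how the "parallel safe" marking could be checked and set in PostgreSQL 9.6 or later. The signature `substruct(mol, mol)` is an assumption about which function backs the `@>` operator in the cartridge; use whatever the first query actually reports. Marking a function PARALLEL SAFE is only appropriate if it really does nothing that would misbehave in a worker process.

```sql
-- Find the function behind the mol @> operator and its parallel marking.
-- proparallel: 's' = safe, 'r' = restricted, 'u' = unsafe (the default).
SELECT o.oprname,
       p.oid::regprocedure AS implementing_function,
       p.proparallel
FROM pg_operator o
JOIN pg_proc p ON p.oid = o.oprcode
WHERE o.oprname = '@>'
  AND o.oprleft = 'mol'::regtype;

-- Assumed signature; substitute the one reported by the query above.
ALTER FUNCTION substruct(mol, mol) PARALLEL SAFE;
```

A function left at the default marking ('u') prevents the planner from choosing a parallel plan for any query that calls it, which would match the single-CPU behaviour reported in this thread.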
Re: [Rdkit-discuss] RDKit Postgres Cartridge Parallel Queries?
Doesn't appear like ::mol parallelized either. Only seeing the following use 1 CPU in top.

ligandlibrary=# explain analyze select count(*) from ligands where rdkit_mol@>'Br'::mol;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Aggregate  (cost=50959.06..50959.07 rows=1 width=8) (actual time=791284.354..791284.354 rows=1 loops=1)
   ->  Bitmap Heap Scan on ligands  (cost=3156.60..50926.74 rows=12927 width=0) (actual time=252201.744..790985.637 rows=667236 loops=1)
         Recheck Cond: (rdkit_mol @> 'Br'::mol)
         Rows Removed by Index Recheck: 13725739
         Heap Blocks: exact=42169 lossy=1254494
         ->  Bitmap Index Scan on rdkit_substructure_idx  (cost=0.00..3153.37 rows=12927 width=0) (actual time=252166.576..252166.576 rows=14511013 loops=1)
               Index Cond: (rdkit_mol @> 'Br'::mol)
 Planning time: 0.109 ms
 Execution time: 791284.588 ms
(9 rows)

Time: 791385.473 ms (13:11.385)

ligandlibrary=# select name, setting from pg_settings where name like 'dynamic_shared_memory_type';
            name            | setting
----------------------------+---------
 dynamic_shared_memory_type | posix
(1 row)

Time: 41.439 ms

ligandlibrary=# select name, setting from pg_settings where name like 'max_parallel_workers_per_gather';
              name               | setting
---------------------------------+---------
 max_parallel_workers_per_gather | 2
(1 row)

Time: 0.926 ms

Maybe some other flag I need to specify. Only 2 cores in this system at the moment, maybe it only parallelizes when there's more than 2 cores?

Thanks,
Brian

On Fri, Jun 1, 2018 at 10:07 AM, Greg Landrum wrote:
> I think they should. Does a ::mol query on the same table parallelize? If
> it does but a ::qmol query does not maybe I forgot something in the SQL
> function definitions
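On the "some other flag" question: the planner decides on parallelism from its cost settings and the table size, not from the number of cores. The sketch below shows session-level settings that could be used to make a parallel plan as attractive as possible while testing. The zero values are illustrative for an experiment and are not recommended server defaults.

```sql
-- Session-only settings for experimenting with parallel plans.
SET max_parallel_workers_per_gather = 2;
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
-- PostgreSQL 10+; in 9.6 the corresponding setting is min_parallel_relation_size.
SET min_parallel_table_scan_size = 0;

-- A parallel plan shows a Gather node with "Workers Planned: N".
EXPLAIN SELECT count(*) FROM ligands WHERE rdkit_mol @> 'Br'::mol;
```

If no Gather node appears even with these settings, the cause is probably not the cost model but something that rules out parallelism entirely, such as a function in the query that is not marked parallel safe. Separately, the `lossy=1254494` heap blocks in the plan above suggest the bitmap did not fit in `work_mem`; raising it for the session may reduce the number of rechecked rows.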
Re: [Rdkit-discuss] RDKit Postgres Cartridge Parallel Queries?
I think they should. Does a ::mol query on the same table parallelize? If it does but a ::qmol query does not, maybe I forgot something in the SQL function definitions.

On Fri, 1 Jun 2018 at 15:43, Brian Cole wrote:
> Hi Greg,
>
> Are SMARTS searches with the ::qmol type supposed to parallelize? They
> don't appear to be either.
>
> -Brian
Re: [Rdkit-discuss] RDKit Postgres Cartridge Parallel Queries?
Hi Greg,

Are SMARTS searches with the ::qmol type supposed to parallelize? They don't appear to either.

-Brian

On Fri, Jun 1, 2018 at 1:46 AM, Greg Landrum wrote:
> Hi Brian,
>
> When the new parallel queries came out I checked that they actually could
> be used and things seemed fine.
> The problem (and it's a sizable one) is that parallel queries don't use
> the index. Until parallel scans using GiST indices work, I don't think this
> is really going to help much.
>
> -greg
Re: [Rdkit-discuss] RDKit Postgres Cartridge Parallel Queries?
Hi Brian,

When the new parallel queries came out I checked that they actually could be used and things seemed fine.
The problem (and it's a sizable one) is that parallel queries don't use the index. Until parallel scans using GiST indices work, I don't think this is really going to help much.

-greg

On Fri, Jun 1, 2018 at 12:04 AM Brian Cole wrote:
> It appears like Postgres 9.6+ supports parallel queries now to accelerate
> slow queries:
> https://www.postgresql.org/docs/10/static/parallel-query.html
>
> Has anyone successfully got this to accelerate substructure queries with
> the RDKit Postgres cartridge?
>
> Thanks,
> Brian
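The trade-off described here (index scan on one core versus a parallel scan that bypasses the index) can be inspected by comparing the two plans side by side. The following is a minimal sketch; the table `ligands` and column `rdkit_mol` are taken from the query posted elsewhere in this thread and stand in for any table with a `mol` column and a GiST index.

```sql
-- Plan with the index available (the default).
EXPLAIN SELECT count(*) FROM ligands WHERE rdkit_mol @> 'Br'::mol;

-- Plan with index-based scans disabled for this transaction only,
-- so the planner has to consider a (possibly parallel) sequential scan.
BEGIN;
SET LOCAL enable_bitmapscan = off;
SET LOCAL enable_indexscan = off;
EXPLAIN SELECT count(*) FROM ligands WHERE rdkit_mol @> 'Br'::mol;
ROLLBACK;
```

A "Parallel Seq Scan" under a "Gather" node in the second plan would indicate that the substructure test itself can run in worker processes. Whether that beats the indexed plan depends on how selective the index is for the query in question, so timing both with EXPLAIN ANALYZE is the only reliable comparison.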