Re: [Rdkit-discuss] RDKit Postgres Cartridge Parallel Queries?

2018-06-01 Thread Greg Landrum
Hi Brian,

I just did a bit of looking here and either something has changed since my
first experiments with 9.6 or I was remembering incorrectly. The functions
exposed in the cartridge need to be marked as being "parallel safe" in
order to be usable in a parallel query. At the moment none of them are.
This would clearly be useful, so I'm going to start taking a look at adding
the relevant flags.

-greg
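
For anyone following along, here is a minimal sketch of what the "parallel safe" marking looks like on the PostgreSQL side. The catalog columns and the ALTER FUNCTION syntax are standard in 9.6+; the signature substruct(mol, mol) is an assumption about which function sits behind @> in the cartridge, so check it against the output of the first query before running the second. Whether a given cartridge function is actually safe to run in a worker is a separate question that this sketch does not settle.

```sql
-- Which function implements each mol @> operator, and how is it marked?
-- proparallel: 's' = safe, 'r' = restricted, 'u' = unsafe (the default).
SELECT o.oprname,
       o.oprleft::regtype  AS left_type,
       o.oprright::regtype AS right_type,
       p.oid::regprocedure AS function,
       p.proparallel
FROM pg_operator o
JOIN pg_proc p ON p.oid = o.oprcode
WHERE o.oprname = '@>'
  AND o.oprleft = 'mol'::regtype;

-- Hypothetical: mark one function parallel safe by hand.
-- substruct(mol, mol) is an assumed signature; use the one reported above.
ALTER FUNCTION substruct(mol, mol) PARALLEL SAFE;
```

Any function that still reports 'u' should keep the planner from choosing a parallel plan for a query that uses it.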


On Fri, Jun 1, 2018 at 5:05 PM Brian Cole wrote:

> It doesn't appear that ::mol parallelized either. I'm only seeing the
> following use 1 CPU in top.
>
> ligandlibrary=# explain analyze select count(*) from ligands where rdkit_mol@>'Br'::mol;
>                                                                          QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=50959.06..50959.07 rows=1 width=8) (actual time=791284.354..791284.354 rows=1 loops=1)
>    ->  Bitmap Heap Scan on ligands  (cost=3156.60..50926.74 rows=12927 width=0) (actual time=252201.744..790985.637 rows=667236 loops=1)
>          Recheck Cond: (rdkit_mol @> 'Br'::mol)
>          Rows Removed by Index Recheck: 13725739
>          Heap Blocks: exact=42169 lossy=1254494
>          ->  Bitmap Index Scan on rdkit_substructure_idx  (cost=0.00..3153.37 rows=12927 width=0) (actual time=252166.576..252166.576 rows=14511013 loops=1)
>                Index Cond: (rdkit_mol @> 'Br'::mol)
>  Planning time: 0.109 ms
>  Execution time: 791284.588 ms
> (9 rows)
>
> Time: 791385.473 ms (13:11.385)
> ligandlibrary=# select name, setting from pg_settings where name like 'dynamic_shared_memory_type';
>             name            | setting
> ----------------------------+---------
>  dynamic_shared_memory_type | posix
> (1 row)
>
> Time: 41.439 ms
> ligandlibrary=# select name, setting from pg_settings where name like 'max_parallel_workers_per_gather';
>               name               | setting
> ---------------------------------+---------
>  max_parallel_workers_per_gather | 2
> (1 row)
>
> Time: 0.926 ms
> ligandlibrary=#
>
> Maybe there's some other flag I need to specify. There are only 2 cores in
> this system at the moment; maybe it only parallelizes when there are more
> than 2 cores?
>
> Thanks,
> Brian
>
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] RDKit Postgres Cartridge Parallel Queries?

2018-06-01 Thread Brian Cole
It doesn't appear that ::mol parallelized either. I'm only seeing the
following use 1 CPU in top.

ligandlibrary=# explain analyze select count(*) from ligands where rdkit_mol@>'Br'::mol;
                                                                         QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=50959.06..50959.07 rows=1 width=8) (actual time=791284.354..791284.354 rows=1 loops=1)
   ->  Bitmap Heap Scan on ligands  (cost=3156.60..50926.74 rows=12927 width=0) (actual time=252201.744..790985.637 rows=667236 loops=1)
         Recheck Cond: (rdkit_mol @> 'Br'::mol)
         Rows Removed by Index Recheck: 13725739
         Heap Blocks: exact=42169 lossy=1254494
         ->  Bitmap Index Scan on rdkit_substructure_idx  (cost=0.00..3153.37 rows=12927 width=0) (actual time=252166.576..252166.576 rows=14511013 loops=1)
               Index Cond: (rdkit_mol @> 'Br'::mol)
 Planning time: 0.109 ms
 Execution time: 791284.588 ms
(9 rows)

Time: 791385.473 ms (13:11.385)
ligandlibrary=# select name, setting from pg_settings where name like 'dynamic_shared_memory_type';
            name            | setting
----------------------------+---------
 dynamic_shared_memory_type | posix
(1 row)

Time: 41.439 ms
ligandlibrary=# select name, setting from pg_settings where name like 'max_parallel_workers_per_gather';
              name               | setting
---------------------------------+---------
 max_parallel_workers_per_gather | 2
(1 row)

Time: 0.926 ms
ligandlibrary=#

Maybe there's some other flag I need to specify. There are only 2 cores in
this system at the moment; maybe it only parallelizes when there are more
than 2 cores?

Thanks,
Brian
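
On the core-count question: as far as I know the planner does not look at the number of cores. It considers a parallel plan only when the settings allow workers, the table is large enough, and the functions involved are not marked parallel unsafe. Below is a minimal sketch for checking the settings side in a single session, assuming the ligands table from the query above. The zeroed costs are illustrative values meant to make a parallel plan look cheap, not tuning advice.

```sql
-- Session-level experiment; values are illustrative, not recommendations.
SHOW max_parallel_workers_per_gather;   -- 0 disables parallel plans entirely
SHOW max_worker_processes;              -- upper bound on background workers

-- Make parallel plans look cheap so the planner picks one if it is allowed to.
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
SET min_parallel_table_scan_size = 0;   -- PostgreSQL 10+; 9.6 uses min_parallel_relation_size

EXPLAIN
SELECT count(*) FROM ligands WHERE rdkit_mol @> 'Br'::mol;
-- A parallel plan contains a Gather node; without one, the query runs in a single process.
```

If no Gather node shows up even with these settings, the cause is more likely the parallel-safety marking of the functions than the hardware.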


On Fri, Jun 1, 2018 at 10:07 AM, Greg Landrum wrote:

> I think they should. Does a ::mol query on the same table parallelize? If
> it does but a ::qmol query does not, maybe I forgot something in the SQL
> function definitions.


Re: [Rdkit-discuss] RDKit Postgres Cartridge Parallel Queries?

2018-06-01 Thread Greg Landrum
I think they should. Does a ::mol query on the same table parallelize? If
it does but a ::qmol query does not, maybe I forgot something in the SQL
function definitions.

On Fri, 1 Jun 2018 at 15:43, Brian Cole wrote:

> Hi Greg,
>
> Are SMARTS searches with the ::qmol type supposed to parallelize? They
> don't appear to parallelize either.
>
> -Brian


Re: [Rdkit-discuss] RDKit Postgres Cartridge Parallel Queries?

2018-06-01 Thread Brian Cole
Hi Greg,

Are SMARTS searches with the ::qmol type supposed to parallelize? They
don't appear to parallelize either.

-Brian

On Fri, Jun 1, 2018 at 1:46 AM, Greg Landrum wrote:

> Hi Brian,
>
> When the new parallel queries came out I checked that they actually could
> be used and things seemed fine. The problem (and it's a sizable one) is
> that parallel queries don't use the index. Until parallel scans using GiST
> indices work, I don't think this is really going to help much.
>
> -greg


Re: [Rdkit-discuss] RDKit Postgres Cartridge Parallel Queries?

2018-05-31 Thread Greg Landrum
Hi Brian,

When the new parallel queries came out I checked that they actually could
be used and things seemed fine. The problem (and it's a sizable one) is
that parallel queries don't use the index. Until parallel scans using GiST
indices work, I don't think this is really going to help much.

-greg
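
One way to look at the trade-off described above is to compare the planner's index-based plan with the plan it falls back to when index scans are discouraged. A sequential scan is the kind of plan that 9.6-style parallelism can split across workers. This is a minimal sketch, assuming the ligands table and rdkit_mol column that appear elsewhere in this thread. The enable_* settings only penalize those plan types rather than forbid them, and SET LOCAL keeps the change inside the transaction.

```sql
BEGIN;

-- Plan as chosen by the planner (in this thread: a bitmap scan on the GiST index).
EXPLAIN
SELECT count(*) FROM ligands WHERE rdkit_mol @> 'Br'::mol;

-- Discourage index-based plans for this transaction only.
SET LOCAL enable_bitmapscan = off;
SET LOCAL enable_indexscan = off;

-- Plan without the index; compare the estimated cost against the plan above.
EXPLAIN
SELECT count(*) FROM ligands WHERE rdkit_mol @> 'Br'::mol;

ROLLBACK;
```

Plain EXPLAIN only shows estimates. EXPLAIN ANALYZE would give real timings, but it runs the query, which took about 13 minutes for the index plan reported later in this thread.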


On Fri, Jun 1, 2018 at 12:04 AM Brian Cole wrote:

> It appears like Postgres 9.6+ supports parallel queries now to accelerate
> slow queries:
> https://www.postgresql.org/docs/10/static/parallel-query.html
>
> Has anyone successfully got this to accelerate substructure queries with
> the RDKit Postgres cartridge?
>
> Thanks,
> Brian