Hi Chris,

I haven't looked at Optipharm: from a quick read through the paper it's 
basically WEGA but with a different optimiser on the front. Looks like an 
interesting idea and they seem to have a way of extending it to electrostatics 
as well 
(https://chemrxiv.org/articles/preprint/Optimizing_Electrostatic_Similarity_for_Virtual_Screening_A_New_Methodology/10044272/1).
 There's no code, so from an RDKit perspective we'd be reimplementing it from 
the description in the paper.

I did look at Shape-It way back when Silicos open-sourced it: as far as I can 
remember the code looked clean enough but it was slow. Unfortunately from the 
RDKit point of view it's LGPL so can't be used as the basis of an RDKit shape 
algorithm.

Regards,
Mark

From: Chris Swain <sw...@mac.com>
Sent: 04 November 2020 15:56
To: rdkit-discuss@lists.sourceforge.net; Mark Mackey <m...@cresset-group.com>
Subject: Re: Rdkit-discuss Digest, Vol 157, Issue 2

Hi Mark,

Have you ever looked at Optipharm for shape comparison?

https://www.nature.com/articles/s41598-018-37908-6

Or Shape-it

http://silicos-it.be.s3-websiteu-west-1.amazonaws.com/software/shape-it/1.0.1/shape-it.html


Cheers

Chris




On 4 Nov 2020, at 14:28, rdkit-discuss-requ...@lists.sourceforge.net wrote:

From: Mark Mackey <m...@cresset-group.com>
To: Lewis Martin <lewis.marti...@gmail.com>, RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Subject: Re: [Rdkit-discuss] GPU Implementation of shape-based 3D overlap on rdkit?

Hi Lewis,

The standard shape alignment algorithm that everyone uses is from Grant & 
Pickup 1996 
(https://onlinelibrary.wiley.com/doi/abs/10.1002/%28SICI%291096-987X%2819961115%2917%3A14%3C1653%3A%3AAID-JCC7%3E3.0.CO%3B2-K).

It's a Taylor-series-like expansion using spherical Gaussians as stand-ins for 
hard spheres - you take the atomic volumes, subtract off the pairwise overlaps, 
add back in the three-way overlaps, subtract off the four-way overlaps, and so 
on. I did a fair few tests some years back and you really need to go to 6 terms 
to get decent accuracy. However, all of the commercial algorithms (ROCS, Phase 
Shape, etc) seem to truncate at 2, so go figure. OTOH the "high throughput" 
versions all seem to be operated with a ludicrously low number of conformations, 
so the error from incomplete coverage of conformer space dwarfs the 5% noise that 
you get from truncating at 2 terms rather than 6.
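
To make the alternating series concrete, here is a minimal Python sketch of a Gaussian volume truncated at a chosen order. It is an illustration under stated assumptions, not code from any of the packages mentioned: every atom gets the same Gaussian amplitude p = 2*sqrt(2), each exponent is chosen so that a single Gaussian integrates to the hard-sphere volume, and the function names and the (xyz, radius) atom layout are made up for this example.

```python
import math
from itertools import combinations

P = 2.0 * math.sqrt(2.0)  # Gaussian amplitude, same for every atom (assumption)


def alpha(radius):
    """Exponent chosen so one Gaussian integrates to the hard-sphere volume."""
    return math.pi * (3.0 * P / (4.0 * math.pi)) ** (2.0 / 3.0) / radius ** 2


def overlap_integral(atoms):
    """Integral of the product of the Gaussians for atoms [(xyz, radius), ...]."""
    alphas = [alpha(radius) for _, radius in atoms]
    total = sum(alphas)
    cross = 0.0
    for (i, (xi, _)), (j, (xj, _)) in combinations(enumerate(atoms), 2):
        d2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
        cross += alphas[i] * alphas[j] * d2
    return P ** len(atoms) * math.exp(-cross / total) * (math.pi / total) ** 1.5


def gaussian_volume(atoms, order=2):
    """Alternating series: + atoms, - pairs, + triples, ... up to `order`."""
    volume = 0.0
    for n in range(1, min(order, len(atoms)) + 1):
        sign = 1.0 if n % 2 == 1 else -1.0
        volume += sign * sum(overlap_integral(c) for c in combinations(atoms, n))
    return volume
```

Calling gaussian_volume(atoms, order=2) and gaussian_volume(atoms, order=6) on the same molecule shows how much the higher terms change the result. The number of n-way terms grows combinatorially, so anything beyond a toy would need distance cutoffs to skip negligible overlaps.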

If you want something slightly more accurate at the same computational cost, 
look at WEGA (https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.23603 and 
references therein) which heuristically corrects for some flaws in the 
truncated Grant & Pickup calculations.

If you want a fast GPU-accelerated version, then forget about actually applying 
the algorithm directly[*]. Instead, to compare a reference molecule A to a 
database molecule B, precompute a grid over A containing the pairwise overlap 
value of an atom at each point in the grid with A. You can then compute the 
shape overlap for a given orientation of B by a simple 3D texture lookup rather 
than faffing around trying to compute exponential functions. This is 
simplified by assuming that all atoms have the same atomic radius and 
neglecting hydrogens (we're going for speed over accuracy here, remember?). You 
can get a similar lookup texture for gradients, I think. One thing GPUs are 
really good at is texture lookups and interpolation. They're less good at 
evaluating exponential functions. Your GPU algorithm is then a massively 
parallel CG or NR optimiser with the objective function computing shape overlap 
values for as many molecules as you can cram into GPU memory all in parallel.

[*] gWEGA (I believe) is a GPU-accelerated version of the standard WEGA 
algorithm and based on the published timings is an order of magnitude or more 
slower than fastROCS
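
The grid idea can be sketched on the CPU in plain Python, with trilinear interpolation standing in for the GPU's 3D texture fetch. This is a rough illustration only: the single 1.7 angstrom radius, the amplitude p = 2*sqrt(2), the grid layout and all function names are assumptions made for the example, and hydrogens are taken to be already stripped from both molecules.

```python
import math

P = 2.0 * math.sqrt(2.0)  # assumed Gaussian amplitude
RADIUS = 1.7              # assumed single radius for all heavy atoms (angstrom)
ALPHA = math.pi * (3.0 * P / (4.0 * math.pi)) ** (2.0 / 3.0) / RADIUS ** 2
PAIR_PREFACTOR = P * P * (math.pi / (2.0 * ALPHA)) ** 1.5


def probe_overlap(point, ref_atoms):
    """Exact pairwise overlap of a probe atom at `point` with every atom of A."""
    return sum(PAIR_PREFACTOR * math.exp(-0.5 * ALPHA * math.dist(point, a) ** 2)
               for a in ref_atoms)


def build_grid(ref_atoms, origin, spacing, shape):
    """Precompute probe_overlap at every grid point (done once per reference A)."""
    nx, ny, nz = shape
    return [[[probe_overlap((origin[0] + i * spacing,
                             origin[1] + j * spacing,
                             origin[2] + k * spacing), ref_atoms)
              for k in range(nz)] for j in range(ny)] for i in range(nx)]


def lookup(grid, origin, spacing, point):
    """Trilinear interpolation; points outside the grid contribute zero."""
    n = (len(grid), len(grid[0]), len(grid[0][0]))
    f = [(point[d] - origin[d]) / spacing for d in range(3)]
    if any(f[d] < 0 or f[d] > n[d] - 1 for d in range(3)):
        return 0.0
    i0 = [min(int(f[d]), n[d] - 2) for d in range(3)]
    t = [f[d] - i0[d] for d in range(3)]
    value = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((t[0] if dx else 1 - t[0]) *
                     (t[1] if dy else 1 - t[1]) *
                     (t[2] if dz else 1 - t[2]))
                value += w * grid[i0[0] + dx][i0[1] + dy][i0[2] + dz]
    return value


def grid_overlap(grid, origin, spacing, mol_b):
    """Second-order shape overlap of B (already posed) with A, by lookup only."""
    return sum(lookup(grid, origin, spacing, atom) for atom in mol_b)
```

An optimiser would repeatedly rotate and translate B's coordinates and call grid_overlap on the result, so no exponentials are evaluated in the inner loop. On a GPU the lookup function would be replaced by a hardware-interpolated texture read, with one thread or block per database molecule.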

Having said all of that, our GPU-accelerated shape similarity function just 
brute forces through the overlap series to sixth order, as (a) my happy place 
is on the accuracy side of the speed/accuracy tradeoff, and (b) our 
electrostatic similarity calculations are sufficiently complex that making the 
shape function faster wouldn't be that much of a net win. As a result, take all 
of the above with a grain of salt.

Regards,
Mark

--
Mark Mackey
Chief Scientific Officer
Cresset
New Cambridge House, Bassingbourn Road, Litlington, Cambridgeshire, SG8 0SS, UK
tel: +44 (0)1223 858890    mobile: +44 (0)7595 099165    fax: +44 (0)1223 853667
email: m...@cresset-group.com    web: www.cresset-group.com    skype: mark_cresset

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
