On Wed, 4 Nov 2020 at 18:27, Mark Mackey <m...@cresset-group.com> wrote:
> I did look at Shape-It way back when Silicos open-sourced it: as far as
> I can remember the code looked clean enough but it was slow. Unfortunately
> from the RDKit point of view it’s LGPL so can’t be used as the basis of an
> RDKit shape algorithm.
>

Hans actually gave permission to relicense an RDKit shape-it port. There is a very old PR that made some progress on this but we never finished it since the performance was just no good. I can’t find that PR/fork at the moment, but if someone thinks that may be a starting point (I’m skeptical) I can look.

> Regards,
>
> Mark
>
> *From:* Chris Swain <sw...@mac.com>
> *Sent:* 04 November 2020 15:56
> *To:* rdkit-discuss@lists.sourceforge.net; Mark Mackey <m...@cresset-group.com>
> *Subject:* Re: Rdkit-discuss Digest, Vol 157, Issue 2
>
> Hi Mark,
>
> Have you ever looked at Optipharm for shape comparison?
>
> https://www.nature.com/articles/s41598-018-37908-6
>
> Or Shape-it
>
> http://silicos-it.be.s3-websiteu-west-1.amazonaws.com/software/shape-it/1.0.1/shape-it.html
>
> Cheers
>
> Chris
>
> On 4 Nov 2020, at 14:28, rdkit-discuss-requ...@lists.sourceforge.net wrote:
>
> From: Mark Mackey <m...@cresset-group.com>
> To: Lewis Martin <lewis.marti...@gmail.com>, RDKit Discuss
> <rdkit-discuss@lists.sourceforge.net>
> Subject: Re: [Rdkit-discuss] GPU Implementation of shape-based 3D
> overlap on rdkit?
> Message-ID:
> <dbbpr08mb4235128b45e0f546acfc5adb97...@dbbpr08mb4235.eurprd08.prod.outlook.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Lewis,
>
> The standard shape alignment algorithm that everyone uses is from Grant &
> Pickup 1996 (
> https://onlinelibrary.wiley.com/doi/abs/10.1002/%28SICI%291096-987X%2819961115%2917%3A14%3C1653%3A%3AAID-JCC7%3E3.0.CO%3B2-K
> ).
>
> It's a Taylor-series-like expansion using spherical Gaussians as stand-ins
> for hard spheres - you take the atomic volumes, subtract off the pairwise
> overlaps, add back in the three-way overlaps, subtract off the four-way
> overlaps, and so on. I did a fair few tests some years back and you really
> need to go to 6 terms to get decent accuracy. However, all of the
> commercial algorithms (ROCS, Phase Shape, etc) seem to truncate at 2, so go
> figure. OTOH the "high throughput" versions all seem to be operated with
> a ludicrously low number of conformations, so the error from incomplete
> coverage of conformer space dwarfs the 5% noise that you get from
> truncating at 2 terms rather than 6.
>
> If you want something slightly more accurate at the same computational
> cost, look at WEGA (
> https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.23603 and references
> therein) which heuristically corrects for some flaws in the truncated
> Grant&Pickup calculations.
>
> If you want a fast GPU-accelerated version, then forget about actually
> applying the algorithm directly[*]. Instead, to compare a reference
> molecule A to a database molecule B, precompute a grid over A containing
> the pairwise overlap value of an atom at each point in the grid with A. You
> can then compute the shape overlap for a given orientation of B by a simple
> 3D texture lookup rather than faffing around trying to compute exponential
> functions. This is simplified by assuming that all atoms have the same
> atomic radius and neglecting hydrogens (we're going for speed over accuracy
> here, remember?). You can get a similar lookup texture for gradients, I
> think. One thing GPUs are really good at is texture lookups and
> interpolation. They're less good at evaluating exponential functions. Your
> GPU algorithm is then a massively parallel CG or NR optimiser with the
> objective function computing shape overlap values for as many molecules as
> you can cram into GPU memory all in parallel.
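A minimal CPU sketch of the precomputed-grid idea described above, written in pure Python for readability rather than speed. It tabulates, at each grid point, the pairwise Gaussian overlap of one probe atom with all of reference molecule A, then scores molecule B by interpolated lookups instead of evaluating exponentials. Everything specific here is an illustrative assumption: the Gaussian amplitude of 2.7, the single 1.7 Å radius, the 0.4 Å spacing, the toy coordinates, and all function names. None of it is taken from ROCS, WEGA, fastROCS, or the RDKit API. Only the pairwise (second-order) term is covered, and the trilinear interpolation stands in for what a GPU texture unit would do in hardware.

```python
import math

P = 2.7        # Gaussian amplitude (assumed value)
RADIUS = 1.7   # one radius for every heavy atom, in angstroms (assumed)
# Exponent chosen so the Gaussian's integrated volume equals the hard-sphere volume.
ALPHA = math.pi * (3.0 * P / (4.0 * math.pi)) ** (2.0 / 3.0) / RADIUS ** 2
PREFACTOR = P * P * (math.pi / (2.0 * ALPHA)) ** 1.5


def pair_overlap(d2):
    """Overlap volume of two identical spherical Gaussians at squared distance d2."""
    return PREFACTOR * math.exp(-0.5 * ALPHA * d2)


def direct_overlap(mol_a, mol_b):
    """Pairwise-term overlap of A and B, evaluating every exponential explicitly."""
    total = 0.0
    for ax, ay, az in mol_a:
        for bx, by, bz in mol_b:
            total += pair_overlap((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2)
    return total


def build_grid(mol_a, origin, spacing, n):
    """Tabulate the overlap of one probe atom with all of A on an n*n*n grid."""
    ox, oy, oz = origin
    return [[[direct_overlap(mol_a, [(ox + i * spacing, oy + j * spacing, oz + k * spacing)])
              for k in range(n)] for j in range(n)] for i in range(n)]


def lookup(grid, origin, spacing, n, pos):
    """Trilinear interpolation, standing in for a GPU 3D texture fetch."""
    f = [(pos[d] - origin[d]) / spacing for d in range(3)]
    if any(v < 0.0 or v >= n - 1 for v in f):
        return 0.0  # outside the tabulated box: treat the overlap as zero
    i = [int(v) for v in f]
    t = [v - c for v, c in zip(f, i)]
    value = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((t[0] if dx else 1.0 - t[0])
                     * (t[1] if dy else 1.0 - t[1])
                     * (t[2] if dz else 1.0 - t[2]))
                value += w * grid[i[0] + dx][i[1] + dy][i[2] + dz]
    return value


def grid_overlap(grid, origin, spacing, n, mol_b):
    """Overlap of B with A for B's current pose: one lookup per atom of B."""
    return sum(lookup(grid, origin, spacing, n, atom) for atom in mol_b)


# Toy heavy-atom coordinates (hypothetical, hydrogens neglected).
mol_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.2, 1.3, 0.0)]
mol_b = [(0.3, 0.2, 0.1), (1.7, 0.4, -0.2)]
origin, spacing, n = (-4.0, -4.0, -4.0), 0.4, 26
grid = build_grid(mol_a, origin, spacing, n)
print("direct:", direct_overlap(mol_a, mol_b))
print("grid:  ", grid_overlap(grid, origin, spacing, n, mol_b))
```

The grid is built once per reference molecule, so its cost is amortised over every database molecule and every pose tried by the optimiser. A coarser spacing makes the table smaller but increases the interpolation error, which is the same speed/accuracy trade the message describes.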
>
> [*] gWEGA (I believe) is a GPU-accelerated version of the standard WEGA
> algorithm and based on the published timings is an order of magnitude or
> more slower than fastROCS
>
> Having said all of that, our GPU-accelerated shape similarity function
> just brute forces through the overlap series to sixth order, as (a) my
> happy place is on the accuracy side of the speed/accuracy tradeoff, and (b)
> our electrostatic similarity calculations are sufficiently complex that
> making the shape function faster wouldn't be that much of a net win. As a
> result, take all of the above with a grain of salt.
>
> Regards,
> Mark
>
> --
> Mark Mackey
> Chief Scientific Officer
> Cresset
> New Cambridge House, Bassingbourn Road, Litlington, Cambridgeshire, SG8
> 0SS, UK
> tel: +44 (0)1223 858890 mobile: +44 (0)7595 099165 fax: +44 (0)1223
> 853667
> email: m...@cresset-group.com web: www.cresset-group.com
> skype: mark_cresset
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss