Dear all,
James' question about boost.thread dependencies reminded me that I seem to
have neglected to post anything about one of the useful additions to the
most recent RDKit release; I'll remedy that now.
I added multi-threading support for a few of the more time-consuming
functions that are embarrassingly parallel: conformation generation,
force-field minimization of multiple-conformer molecules, and Open3D
alignment of multiple-conformer molecules.
Here's a quick demonstration of how this works.
Start by creating a basic molecule:
In [1]: from rdkit import Chem
In [2]: from rdkit.Chem import AllChem
In [3]: m = Chem.MolFromSmiles('Cc1cc(C(=O)NCC(=O)N)c(C)n1c2ccc(F)cc2') # CHEMBL2113931
In [4]: mh = Chem.AddHs(m)
Time how long it takes to generate 50 conformers using one thread:
In [5]: %timeit AllChem.EmbedMultipleConfs(mh,50)
1 loops, best of 3: 260 ms per loop
And using 4 threads:
In [6]: %timeit AllChem.EmbedMultipleConfs(mh,50,numThreads=4)
10 loops, best of 3: 77.6 ms per loop
Nice speed up there.
Do a force-field minimization of all of those conformations with one thread:
In [7]: %timeit tm = Chem.Mol(mh);AllChem.UFFOptimizeMoleculeConfs(tm)
1 loops, best of 3: 1.11 s per loop
And using 4 threads:
In [8]: %timeit tm = Chem.Mol(mh);AllChem.UFFOptimizeMoleculeConfs(tm,numThreads=4)
1 loops, best of 3: 288 ms per loop
Another good improvement.
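The %timeit lines above throw away the return values, so here is a minimal sketch of the same two calls as a plain script that keeps them. It assumes an RDKit build with multi-threading support; the conformer count of 10, the randomSeed value, and the variable names are illustrative choices, not part of the session above.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

m = Chem.MolFromSmiles('Cc1cc(C(=O)NCC(=O)N)c(C)n1c2ccc(F)cc2')  # CHEMBL2113931
mh = Chem.AddHs(m)

# EmbedMultipleConfs returns the ids of the conformers it generated.
# randomSeed is an illustrative addition to make the run repeatable.
conf_ids = list(AllChem.EmbedMultipleConfs(mh, 10, numThreads=4, randomSeed=42))

# UFFOptimizeMoleculeConfs returns one (not_converged, energy) pair per
# conformer; a not_converged value of 0 means the minimization converged.
results = AllChem.UFFOptimizeMoleculeConfs(mh, numThreads=4)

for cid, (not_converged, energy) in zip(conf_ids, results):
    status = "converged" if not_converged == 0 else "not converged"
    print(f"conformer {cid}: energy {energy:.2f} ({status})")
```

Conformers that have not converged can be minimized again with a larger maxIters value.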
Use the O3A code to align all the conformers to another molecule:
In [16]: m2 = Chem.MolFromSmiles(r'Cc1cc(\C=C\2/SC(=Nc3ccccc3)NC2=O)c(C)n1c4ccccc4') # CHEMBL599702
In [17]: m2h = Chem.AddHs(m2)
In [18]: AllChem.EmbedMolecule(m2h)
Out[18]: 0
In [19]: AllChem.UFFOptimizeMolecule(m2h)
Out[19]: 1
In [20]: refParams = AllChem.MMFFGetMoleculeProperties(m2h)
In [21]: prbParams = AllChem.MMFFGetMoleculeProperties(mh)
In [23]: %timeit tm=Chem.Mol(mh);AllChem.GetO3AForProbeConfs(tm,m2h,prbPyMMFFMolProperties=prbParams,refPyMMFFMolProperties=refParams)
1 loops, best of 3: 1.13 s per loop
Do the same alignment with 4 threads:
In [24]: %timeit tm=Chem.Mol(mh);AllChem.GetO3AForProbeConfs(tm,m2h,prbPyMMFFMolProperties=prbParams,refPyMMFFMolProperties=refParams,numThreads=4)
1 loops, best of 3: 304 ms per loop
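To put numbers on those improvements, here is a small sketch that turns the timings quoted above into speedups and parallel efficiencies (speedup divided by thread count). The values are copied from the %timeit output; the helper name and dictionary layout are just illustrative.

```python
# Timings copied from the %timeit output above, in milliseconds:
# (1 thread, 4 threads)
timings_ms = {
    "EmbedMultipleConfs": (260.0, 77.6),
    "UFFOptimizeMoleculeConfs": (1110.0, 288.0),
    "GetO3AForProbeConfs": (1130.0, 304.0),
}
N_THREADS = 4


def speedup_and_efficiency(serial_ms, parallel_ms, n_threads):
    """Return (speedup, efficiency) for one serial/parallel timing pair."""
    speedup = serial_ms / parallel_ms
    return speedup, speedup / n_threads


for name, (serial_ms, parallel_ms) in timings_ms.items():
    s, e = speedup_and_efficiency(serial_ms, parallel_ms, N_THREADS)
    print(f"{name}: {s:.2f}x speedup, {e:.0%} parallel efficiency")
```

That works out to roughly 3.35x, 3.85x, and 3.72x on 4 threads for these single runs on one machine.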
I think this is a convenient (and very easy) way to take advantage of one
of the nice features of modern compute hardware: multiple cores.
Best,
-greg
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss