Thanks Greg
One silly question. Should the parallelization work "right off the bat",
or do I need to take some special steps when building RDKit? I'm betting
the latter, but I can't find what I'm missing (numThreads=4 does nothing
for me).
Cheers,
Adam
On 03-Jul-15 5:11, Greg Landrum wrote:
Dear all,
James' question about boost.thread dependencies reminded me that I
seem to have neglected to post anything about one of the useful
additions to the most recent RDKit release; I'll remedy that now.
I added multi-threading support for a few more time-consuming
functions that are embarrassingly parallel: conformation generation,
force field minimization of multiple-conformer molecules, and
Open3DAlign (O3A) alignment of multiple-conformer molecules.
Here's a quick demonstration of how this works.
Start by creating a basic molecule:
In [1]: from rdkit import Chem
In [2]: from rdkit.Chem import AllChem
In [3]: m = Chem.MolFromSmiles('Cc1cc(C(=O)NCC(=O)N)c(C)n1c2ccc(F)cc2')  # CHEMBL2113931
In [4]: mh = Chem.AddHs(m)
Time how long it takes to generate 50 conformers using one thread:
In [5]: %timeit AllChem.EmbedMultipleConfs(mh,50)
1 loops, best of 3: 260 ms per loop
And using 4 threads:
In [6]: %timeit AllChem.EmbedMultipleConfs(mh,50,numThreads=4)
10 loops, best of 3: 77.6 ms per loop
Nice speed up there.
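Outside of IPython, the same comparison can be scripted. The following is only a rough sketch: the helper names `best_time` and `compare_embedding` are made up for illustration and are not part of the RDKit. As far as I know, `numThreads` only has an effect if the RDKit was built with multi-threading support, so near-identical timings for the two runs would suggest a build without it.

```python
import timeit


def best_time(func, repeat=3, number=1):
    """Return the best time in seconds per call over `repeat` runs (similar to %timeit)."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


def compare_embedding(smiles, n_confs=50, num_threads=4):
    """Time conformer generation with one thread and with `num_threads` threads.

    Hypothetical helper for illustration. The RDKit is imported here so that
    best_time() above can be used without it.
    """
    from rdkit import Chem
    from rdkit.Chem import AllChem

    mh = Chem.AddHs(Chem.MolFromSmiles(smiles))
    # Embed into a fresh copy each time so the runs do not affect each other.
    serial = best_time(
        lambda: AllChem.EmbedMultipleConfs(Chem.Mol(mh), n_confs))
    threaded = best_time(
        lambda: AllChem.EmbedMultipleConfs(Chem.Mol(mh), n_confs,
                                           numThreads=num_threads))
    return serial, threaded
```

Calling `compare_embedding('Cc1cc(C(=O)NCC(=O)N)c(C)n1c2ccc(F)cc2')` returns the two timings in seconds; the actual numbers will depend on the machine and the build.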
Do a force-field minimization of all of those conformations with one
thread:
In [7]: %timeit tm = Chem.Mol(mh);AllChem.UFFOptimizeMoleculeConfs(tm)
1 loops, best of 3: 1.11 s per loop
And using 4 threads:
In [8]: %timeit tm = Chem.Mol(mh);AllChem.UFFOptimizeMoleculeConfs(tm,numThreads=4)
1 loops, best of 3: 288 ms per loop
Another good improvement.
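The timings above discard the return value, but `UFFOptimizeMoleculeConfs` returns one `(not_converged, energy)` pair per conformer, where a first element of 0 means the minimization converged. Here is a small sketch of how one might use that; `summarize_optimization` and `optimize_conformers` are hypothetical names chosen for this example, not RDKit functions.

```python
def summarize_optimization(results):
    """Summarize a list of (not_converged, energy) pairs, one per conformer.

    Returns the number of converged conformers and the position in the list
    of the lowest-energy one.
    """
    n_converged = sum(1 for not_converged, _ in results if not_converged == 0)
    best_idx = min(range(len(results)), key=lambda i: results[i][1])
    return n_converged, best_idx


def optimize_conformers(mol_with_confs, num_threads=4, max_iters=200):
    """Minimize all conformers of a copy of the molecule with UFF.

    Hypothetical helper for illustration; the RDKit is imported here so that
    summarize_optimization() can be used without it.
    """
    from rdkit import Chem
    from rdkit.Chem import AllChem

    tm = Chem.Mol(mol_with_confs)  # work on a copy, as in the timings above
    results = AllChem.UFFOptimizeMoleculeConfs(
        tm, numThreads=num_threads, maxIters=max_iters)
    return tm, results
```

With the conformers embedded into `mh`, something like `tm, res = optimize_conformers(mh)` followed by `summarize_optimization(res)` gives a quick check on convergence. Note that the returned index is a position in the result list, which follows the order of the conformers and is not necessarily the conformer ID.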
Use the O3A code to align all the conformers to another molecule:
In [16]: m2 = Chem.MolFromSmiles(r'Cc1cc(\C=C\2/SC(=Nc3ccccc3)NC2=O)c(C)n1c4ccccc4')  # CHEMBL599702
In [17]: m2h = Chem.AddHs(m2)
In [18]: AllChem.EmbedMolecule(m2h)
Out[18]: 0
In [19]: AllChem.UFFOptimizeMolecule(m2h)
Out[19]: 1
In [20]: refParams = AllChem.MMFFGetMoleculeProperties(m2h)
In [21]: prbParams = AllChem.MMFFGetMoleculeProperties(mh)
In [23]: %timeit tm=Chem.Mol(mh);AllChem.GetO3AForProbeConfs(tm,m2h,prbPyMMFFMolProperties=prbParams,refPyMMFFMolProperties=refParams)
1 loops, best of 3: 1.13 s per loop
Do the same alignment with 4 threads:
In [24]: %timeit tm=Chem.Mol(mh);AllChem.GetO3AForProbeConfs(tm,m2h,prbPyMMFFMolProperties=prbParams,refPyMMFFMolProperties=refParams,numThreads=4)
1 loops, best of 3: 304 ms per loop
I think this is a convenient (and very easy) way to take advantage of
the multiple cores available on modern compute hardware.
Best,
-greg
------------------------------------------------------------------------------
Don't Limit Your Business. Reach for the Cloud.
GigeNET's Cloud Solutions provide you with the tools and support that
you need to offload your IT needs and focus on growing your business.
Configured For All Businesses. Start Your Cloud Today.
https://www.gigenetcloud.com/
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss