[Rdkit-discuss] new parallel execution features in the 2015.03 RDKit release

Greg Landrum Thu, 02 Jul 2015 20:12:57 -0700

Dear all,

James' question about boost.thread dependencies reminded me that I seem to
have neglected to post anything about one of the useful additions to the
most recent RDKit release; I'll remedy that now.


I added multi-threading support for a few more time-consuming functions
that are embarrassingly parallel: conformation generation, force field
minimization of multiple-conformer molecules, and Open3DAlign (O3A)
alignment of multiple-conformer molecules. Each of these functions takes a
numThreads argument; without it, they run on a single thread.

Here's a quick demonstration of how this works.

Start by creating a basic molecule:

In [1]: from rdkit import Chem

In [2]: from rdkit.Chem import AllChem

In [3]: m = Chem.MolFromSmiles('Cc1cc(C(=O)NCC(=O)N)c(C)n1c2ccc(F)cc2')  # CHEMBL2113931

In [4]: mh = Chem.AddHs(m)


Time how long it takes to generate 50 conformers using one thread:

In [5]: %timeit AllChem.EmbedMultipleConfs(mh,50)
1 loops, best of 3: 260 ms per loop


And using 4 threads:

In [6]: %timeit AllChem.EmbedMultipleConfs(mh,50,numThreads=4)
10 loops, best of 3: 77.6 ms per loop


Nice speedup there: a bit more than 3x with 4 threads.
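For anyone who wants to try this outside of IPython, here is a minimal sketch of the same comparison as a plain script. The helper name time_embedding and the fixed randomSeed are my own choices for illustration, not part of the session above, and the numThreads argument only has an effect if the RDKit was built with thread support.

```python
import time

from rdkit import Chem
from rdkit.Chem import AllChem


def time_embedding(mol, n_confs, num_threads):
    """Embed n_confs conformers on a copy of mol.

    Returns (elapsed seconds, number of conformers actually generated).
    This helper is hypothetical; it just wraps AllChem.EmbedMultipleConfs.
    """
    work = Chem.Mol(mol)  # work on a copy so the input molecule is untouched
    start = time.perf_counter()
    conf_ids = AllChem.EmbedMultipleConfs(
        work, n_confs, numThreads=num_threads, randomSeed=42
    )
    return time.perf_counter() - start, len(conf_ids)


# Same molecule as in the session above (CHEMBL2113931), with hydrogens added.
mh = Chem.AddHs(Chem.MolFromSmiles('Cc1cc(C(=O)NCC(=O)N)c(C)n1c2ccc(F)cc2'))

for n_threads in (1, 4):
    elapsed, n_generated = time_embedding(mh, 50, n_threads)
    print(f"{n_threads} thread(s): {n_generated} conformers in {elapsed * 1000:.0f} ms")
```

A single wall-clock measurement like this is noisier than %timeit, which repeats the call and reports the best run, so expect the numbers to vary between runs.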

Do a force-field minimization of all of those conformations with one thread:

In [7]: %timeit tm = Chem.Mol(mh);AllChem.UFFOptimizeMoleculeConfs(tm)
1 loops, best of 3: 1.11 s per loop


And using 4 threads:

In [8]: %timeit tm = Chem.Mol(mh);AllChem.UFFOptimizeMoleculeConfs(tm,numThreads=4)
1 loops, best of 3: 288 ms per loop


Another good improvement: close to 4x this time.

Use the O3A code to align all the conformers to another molecule:

In [16]: m2 = Chem.MolFromSmiles(r'Cc1cc(\C=C\2/SC(=Nc3ccccc3)NC2=O)c(C)n1c4ccccc4')  # CHEMBL599702

In [17]: m2h = Chem.AddHs(m2)

In [18]: AllChem.EmbedMolecule(m2h)
Out[18]: 0

In [19]: AllChem.UFFOptimizeMolecule(m2h)
Out[19]: 1

In [20]: refParams = AllChem.MMFFGetMoleculeProperties(m2h)

In [21]: prbParams = AllChem.MMFFGetMoleculeProperties(mh)

In [23]: %timeit tm=Chem.Mol(mh);AllChem.GetO3AForProbeConfs(tm,m2h,prbPyMMFFMolProperties=prbParams,refPyMMFFMolProperties=refParams)
1 loops, best of 3: 1.13 s per loop


Do the same alignment with 4 threads:

In [24]: %timeit tm=Chem.Mol(mh);AllChem.GetO3AForProbeConfs(tm,m2h,prbPyMMFFMolProperties=prbParams,refPyMMFFMolProperties=refParams,numThreads=4)
1 loops, best of 3: 304 ms per loop
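
To put the six timings above side by side, here is a small sketch that turns them into speedups and parallel efficiencies (speedup divided by thread count). The timings are copied from the session above; the function and variable names are only for illustration.

```python
# Timings copied from the session above, in seconds: (1 thread, 4 threads).
timings = {
    "EmbedMultipleConfs": (0.260, 0.0776),
    "UFFOptimizeMoleculeConfs": (1.11, 0.288),
    "GetO3AForProbeConfs": (1.13, 0.304),
}
N_THREADS = 4


def speedup_and_efficiency(serial, parallel, n_threads):
    """Return (speedup, efficiency) for one serial/parallel timing pair."""
    speedup = serial / parallel
    return speedup, speedup / n_threads


results = {
    name: speedup_and_efficiency(t1, t4, N_THREADS)
    for name, (t1, t4) in timings.items()
}

for name, (speedup, efficiency) in results.items():
    print(f"{name:26s} speedup {speedup:.2f}x  efficiency {efficiency:.0%}")
```

All three functions land between 3x and 4x on 4 threads, which is about what you would hope for from work that splits cleanly per conformer.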


I think this is a very easy way to take advantage of the multiple cores
available on modern compute hardware.

Best,
-greg
