Thanks Greg

One silly question. Should the parallelization work "right off the bat", or do I need some special steps when building RDKit? I'm betting the latter, but I can't find what I'm missing (numThreads=4 does nothing for me).

Cheers,
Adam

On 03-Jul-15 5:11, Greg Landrum wrote:
Dear all,

James' question about boost.thread dependencies reminded me that I seem to have neglected to post anything about one of the useful additions to the most recent RDKit release; I'll remedy that now.

I added multi-threading support for a few more time-consuming functions that are embarrassingly parallel: conformation generation, force field minimization of multiple-conformer molecules, and Open3DAlign (O3A) alignment of multiple-conformer molecules.

Here's a quick demonstration of how this works.

Start by creating a basic molecule:

    In [1]: from rdkit import Chem

    In [2]: from rdkit.Chem import AllChem

    In [3]: m = Chem.MolFromSmiles('Cc1cc(C(=O)NCC(=O)N)c(C)n1c2ccc(F)cc2')  # CHEMBL2113931

    In [4]: mh = Chem.AddHs(m)


Time how long it takes to generate 50 conformers using one thread:

    In [5]: %timeit AllChem.EmbedMultipleConfs(mh,50)
    1 loops, best of 3: 260 ms per loop


And using 4 threads:

    In [6]: %timeit AllChem.EmbedMultipleConfs(mh,50,numThreads=4)
    10 loops, best of 3: 77.6 ms per loop


Nice speed up there.
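
For anyone not working in IPython, the same comparison can be sketched as a plain script. This is only an illustration, not part of the original demonstration: `best_time` and `compare_embedding` are hypothetical helper names, and the only RDKit calls used are the ones shown above. As far as I recall, the `numThreads` docstring notes that the argument only has an effect if the RDKit was built with multi-thread support, so treat the timings as a diagnostic rather than a guarantee.

```python
import time


def best_time(fn, repeats=3):
    """Return the fastest wall-clock time (in seconds) over `repeats` calls of fn()."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def compare_embedding(smiles, n_confs=50, n_threads=4):
    """Time EmbedMultipleConfs with one thread and with n_threads threads.

    Hypothetical helper: returns (single_thread_seconds, multi_thread_seconds).
    """
    # Imported here so that best_time() above can be used without RDKit installed.
    from rdkit import Chem
    from rdkit.Chem import AllChem

    mh = Chem.AddHs(Chem.MolFromSmiles(smiles))

    # Each call embeds into a fresh copy so the two runs start from the same state.
    single = best_time(
        lambda: AllChem.EmbedMultipleConfs(Chem.Mol(mh), n_confs, numThreads=1))
    multi = best_time(
        lambda: AllChem.EmbedMultipleConfs(Chem.Mol(mh), n_confs, numThreads=n_threads))
    return single, multi
```

Calling `compare_embedding('Cc1cc(C(=O)NCC(=O)N)c(C)n1c2ccc(F)cc2')` returns the two timings. If they come out roughly equal, one plausible explanation is that the installed build lacks thread support, but that is a guess to check against your own build configuration.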

Do a force-field minimization of all of those conformations with one thread:

    In [7]: %timeit tm = Chem.Mol(mh);AllChem.UFFOptimizeMoleculeConfs(tm)
    1 loops, best of 3: 1.11 s per loop


And using 4 threads:

    In [8]: %timeit tm = Chem.Mol(mh);AllChem.UFFOptimizeMoleculeConfs(tm,numThreads=4)
    1 loops, best of 3: 288 ms per loop


Another good improvement.

Use the O3A code to align all the conformers to another molecule:

    In [16]: m2 = Chem.MolFromSmiles(r'Cc1cc(\C=C\2/SC(=Nc3ccccc3)NC2=O)c(C)n1c4ccccc4')  # CHEMBL599702

    In [17]: m2h = Chem.AddHs(m2)

    In [18]: AllChem.EmbedMolecule(m2h)
    Out[18]: 0

    In [19]: AllChem.UFFOptimizeMolecule(m2h)
    Out[19]: 1

    In [20]: refParams = AllChem.MMFFGetMoleculeProperties(m2h)

    In [21]: prbParams = AllChem.MMFFGetMoleculeProperties(mh)

    In [23]: %timeit tm=Chem.Mol(mh);AllChem.GetO3AForProbeConfs(tm,m2h,prbPyMMFFMolProperties=prbParams,refPyMMFFMolProperties=refParams)
    1 loops, best of 3: 1.13 s per loop


Do the same alignment with 4 threads:

    In [24]: %timeit tm=Chem.Mol(mh);AllChem.GetO3AForProbeConfs(tm,m2h,prbPyMMFFMolProperties=prbParams,refPyMMFFMolProperties=refParams,numThreads=4)
    1 loops, best of 3: 304 ms per loop


I think this is a very easy way to take advantage of one of the convenient features of modern compute hardware.
Best,
-greg



------------------------------------------------------------------------------
Don't Limit Your Business. Reach for the Cloud.
GigeNET's Cloud Solutions provide you with the tools and support that
you need to offload your IT needs and focus on growing your business.
Configured For All Businesses. Start Your Cloud Today.
https://www.gigenetcloud.com/


_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
