I have run into a problem with using the RDKit to generate conformers of 
molecules. I am using the following code:

from rdkit import Chem
from rdkit.Chem import AllChem

from timeit import default_timer as timer

def GenerateDGConfs(m,num_confs,rms):
    start_time = timer()
    ids = AllChem.EmbedMultipleConfs(m, numConfs=num_confs, pruneRmsThresh=rms, 
    for id in ids:
        AllChem.MMFFOptimizeMolecule(m, confId=id)

    end_time = timer()
    time_diff = end_time - start_time
#    print ("Normal DG = %0.2f" % time_diff)

    return m, list(ids), time_diff

w = Chem.SDWriter("%s/%s" % (rootdir,"My_conformers.sdf"))

suppl = Chem.SDMolSupplier("%s/%s" % (rootdir,"My_molecules.sdf"))
num_confs = 200
rmsd = 0.5

for mol in suppl:

    if mol is None: continue

    mol1 = Chem.AddHs(mol)

    conf_mol, id_list, time_diff = GenerateDGConfs(mol1,num_confs,rmsd)
    num_confs = conf_mol.GetNumConformers()
    for id in id_list:
        w.write(conf_mol, confId=id)

# Close the writer so that all buffered records are flushed to disk.
w.close()


What I see is that, as I work through the molecules in the input file, the 
number of conformers returned declines monotonically: it starts close to the 
200 I set as a maximum and falls to around 10 after a few thousand molecules 
have been processed. This happens whether I use 'normal' DG or the ETKDG 
method. As I am a new user of RDKit I am sure I have missed something obvious, 
but I cannot see it.
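
One thing worth checking in the loop above: num_confs is reassigned from 
conf_mol.GetNumConformers() on every iteration, so the cap passed on for the 
next molecule can stay the same or shrink but never grow. The sketch below 
reproduces that pattern without RDKit. The function fake_embed and the name 
MAX_CONFS are hypothetical stand-ins, not RDKit API, and the 90% survival 
rate is made up purely for illustration.

```python
def fake_embed(requested):
    """Hypothetical stand-in for an embedding call with RMS pruning.

    Like a pruned embedding, it never returns more ids than requested;
    here it arbitrarily keeps 90% of them (integer arithmetic).
    """
    kept = max(1, requested * 9 // 10)
    return list(range(kept))


MAX_CONFS = 200

# Pattern from the loop above: the cap is overwritten by the last result.
num_confs = MAX_CONFS
leaky_counts = []
for _ in range(5):
    ids = fake_embed(num_confs)
    num_confs = len(ids)  # the cap for the next molecule is now smaller
    leaky_counts.append(len(ids))

# Storing the result under a different name leaves the cap untouched.
fixed_counts = []
for _ in range(5):
    ids = fake_embed(MAX_CONFS)
    n_found = len(ids)
    fixed_counts.append(n_found)

print(leaky_counts)
print(fixed_counts)
```

If this is the cause, storing the conformer count under a separate name 
(n_found in the sketch) should keep the requested maximum at 200 for every 
molecule.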

Also, once I have generated the conformers, what is the best way to cluster 
them by RMSD so that each conformer has a minimum RMSD to all the others in 
the set?
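
To make the requirement concrete, here is a minimal sketch of one possible 
approach: a greedy pass that keeps a conformer only if it lies at least a 
threshold away from every conformer already kept. It works on a flat 
lower-triangle list of pairwise RMSD values, which is the layout I believe 
AllChem.GetConformerRMSMatrix returns, so treat that as an assumption. The 
function names and the RMSD values are made up for illustration. Butina 
clustering (rdkit.ML.Cluster.Butina.ClusterData) would be an alternative 
that takes the same kind of input.

```python
def tri_index(i, j):
    """Position of the pair (i, j), i != j, in a flat lower-triangle list
    ordered as (1,0), (2,0), (2,1), (3,0), (3,1), (3,2), ..."""
    if i < j:
        i, j = j, i
    return i * (i - 1) // 2 + j


def select_diverse(rms_lower_triangle, n_confs, threshold):
    """Greedy selection: keep a conformer only if its RMSD to every
    conformer kept so far is at least `threshold`."""
    kept = []
    for i in range(n_confs):
        if all(rms_lower_triangle[tri_index(i, k)] >= threshold for k in kept):
            kept.append(i)
    return kept


# Made-up RMSD values for four conformers, in the order
# (1,0), (2,0), (2,1), (3,0), (3,1), (3,2).
rms = [0.2, 1.1, 1.0, 0.9, 0.8, 0.3]
selected = select_diverse(rms, n_confs=4, threshold=0.5)
print(selected)
```

The returned values are positions in the conformer list rather than conformer 
ids, so they would need to be mapped back to ids before writing anything out. 
The result also depends on the order in which the conformers are visited.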

Any help would be gratefully received.
