Hi RDKitters,

Whilst looking at generating some conformations of molecules using the
ETKDG method with EmbedMultipleConfs I've come across some strange (to me)
behavior.

When I generate conformations of some molecules with the randomSeed as -1
the result is a variable number of conformations. That's not the strangest
aspect though - some of the conformations are quite bizarre based upon any
geometry rules I can think of. However, when the randomSeed is set to a
fixed number the odd behavior goes away and I get only reasonable
conformations.

To illustrate, here is some code (please no criticism of my terrible style!):

### CODE ###
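from rdkit import Chem
from rdkit.Chem import AllChem
import sys

acamide = Chem.MolFromSmiles('O=C(NC=C)c1ccccc1')
_seed = -1  # -1 = no fixed seed; set to e.g. 1 for reproducible runs
m = Chem.AddHs(acamide)
n = 3  # maximum number of conformations requested per embedding
ps = AllChem.ETKDG()
ps.pruneRmsThresh = 0.5  # RMSD threshold for pruning near-duplicate conformers
ps.numThreads = 0  # 0 = use all available threads
ps.randomSeed = _seed
fixIt = 0  # set to 1 to UFF-minimize each conformer after embedding
for i in range(0, 100):
    ids = AllChem.EmbedMultipleConfs(m, n, ps)
    if fixIt:
        for _id in ids:
            AllChem.UFFOptimizeMolecule(m, confId=_id)
    sys.stderr.write('%d,' % len(ids))
    if len(ids) > 2:
        # Three conformers survived pruning: save them and halt for inspection.
        outStream = Chem.SDWriter('test.sdf')
        for _id in ids:
            outStream.write(m, confId=_id)
        outStream.close()
        break
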
### END CODE ###

This takes the smiles string for a simple acrylamide and generates a max of
3 conformations for the molecule. The loop runs 100 times and halts when 3
conformations are found - which is the sign of a bad conformation being
generated. When I run this the number of conformations generated each time
varies between 1 and 3, and the sequence differs from run to run.

For instance:
run #1:
run #2: 2,1,2,2,2,1,1,3,
run #3: 2,2,2,1,2,2,2,2,1,2,2,1,2,1,2,2,3,
and so on

When I visually inspect test.sdf that results from a generation of 3
conformers I find that one of the conformations has a very odd amide
nitrogen geometry - almost linear between the heavy atoms.
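
One way to flag this geometry without opening the file, as a rough sketch
only: measure the C-N-C angle at the amide nitrogen in every conformer. The
helper names angle_deg and amide_angles and the SMARTS pattern are
illustrative assumptions, not part of the script above. RDKit is imported
inside amide_angles, so the angle arithmetic can be checked on its own.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees from three (x, y, z) points, with b at the vertex."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.sqrt(sum(p * p for p in v1)) * math.sqrt(sum(q * q for q in v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def amide_angles(mol):
    """Return {conformer id: C-N-C angle at the amide nitrogen} for an embedded mol."""
    from rdkit import Chem  # imported lazily; requires RDKit

    # Assumed pattern: carbonyl carbon, its oxygen, the amide N, the carbon on N.
    patt = Chem.MolFromSmarts('[CX3](=O)[NX3][CX3]')
    c1, _o, n, c2 = mol.GetSubstructMatch(patt)  # raises if there is no match
    angles = {}
    for conf in mol.GetConformers():
        pts = [conf.GetAtomPosition(i) for i in (c1, n, c2)]
        angles[conf.GetId()] = angle_deg(*[(p.x, p.y, p.z) for p in pts])
    return angles
```

Calling amide_angles(m) after EmbedMultipleConfs would give one angle per
conformer. A planar amide nitrogen should sit near 120 degrees, so a value
approaching 180 degrees would correspond to the near-linear geometry
described above.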

If I change _seed to a number such as '1' I get a single conformation for
every run.
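
Since randomSeed = -1 makes a bad embedding hard to reproduce, a possible
middle ground is to draw a different but recorded seed for each trial, so
that any trial giving 3 conformations can be replayed later. The following
is a minimal sketch under that assumption; run_trials and embed_acrylamide
are hypothetical helper names, and RDKit is imported inside the function
that needs it.

```python
import random


def run_trials(embed, n_trials, master_seed=0):
    """Call embed(seed) once per trial with a fresh, recorded seed.

    Returns a list of (seed, result) pairs so any trial can be replayed.
    """
    rng = random.Random(master_seed)
    results = []
    for _ in range(n_trials):
        seed = rng.randrange(1, 2**31 - 1)  # positive seed that fits in a 32-bit int
        results.append((seed, embed(seed)))
    return results


def embed_acrylamide(seed, n_confs=3):
    """Embed the acrylamide with a fixed seed; return the number of conformers kept."""
    from rdkit import Chem  # imported lazily; requires RDKit
    from rdkit.Chem import AllChem

    mol = Chem.AddHs(Chem.MolFromSmiles('O=C(NC=C)c1ccccc1'))
    ps = AllChem.ETKDG()
    ps.pruneRmsThresh = 0.5
    ps.randomSeed = seed
    return len(AllChem.EmbedMultipleConfs(mol, n_confs, ps))
```

For example, run_trials(embed_acrylamide, 100) would return 100 pairs of
(seed, conformer count), and a seed that produced 3 conformations could
then be passed back to embed_acrylamide to regenerate that case.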

If I implement the UFF optimization (with fixIt = 1) then I'll still get
multiple conformations but they all look reasonable.

So, I'm not sure if there is some systematic problem here or I'm just
failing to understand the appropriate way to implement this form of
conformational search. Any insights are welcome.

Rdkit-discuss mailing list
