Hi Andy,

If -1 is used for the random number seed, the RDKit will use the current date (including seconds) as the seed (Greg, please correct me if I'm wrong). Therefore, you get a different seed every time you run the script. If you use a fixed seed, you will generate the same conformations every time you run it.

Note that if pruneRmsThresh > 0, the generated conformers will be pruned, i.e. conformers with an RMS < cutoff to any previous conformer will be discarded. As this happens at the very end of the conformer generation routine, no additional conformers will be generated to replace the discarded ones. This is why you get a varying number of conformers.
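To make the two effects above concrete, here is a minimal toy sketch in plain Python. It is not the RDKit code: `toy_embed` is a hypothetical helper, each "conformer" is a single random number, the absolute difference stands in for the RMS, and seeding from the clock when the seed is -1 is an assumption made only for illustration.

```python
import random
import time


def toy_embed(num_confs, seed, prune_thresh):
    """Toy stand-in for conformer embedding (NOT the RDKit implementation).

    Each 'conformer' is one float; abs difference stands in for the RMS.
    """
    # Assumption for illustration: a seed of -1 means "seed from the clock".
    rng = random.Random(time.time_ns() if seed == -1 else seed)

    # Step 1: generate exactly num_confs candidates.
    candidates = [rng.uniform(0.0, 2.0) for _ in range(num_confs)]

    # Step 2: prune at the very end. A candidate is kept only if it is at
    # least prune_thresh away from every conformer kept so far.
    kept = []
    for c in candidates:
        if all(abs(c - k) >= prune_thresh for k in kept):
            kept.append(c)

    # Nothing is generated to replace the discarded candidates,
    # so len(kept) can be smaller than num_confs.
    return kept


# Fixed seed: the same conformers on every call.
print(toy_embed(3, seed=1, prune_thresh=0.5) == toy_embed(3, seed=1, prune_thresh=0.5))

# Seed of -1: the number of surviving conformers can differ between calls.
print([len(toy_embed(3, seed=-1, prune_thresh=0.5)) for _ in range(10)])
```

With a fixed seed the first comparison is always True. With -1 the list of counts typically changes from run to run, which is the same pattern as the varying number of conformers in your output.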
I have run your script and I get the same weird third conformation. This should certainly not happen. I will look into it.

Best,
Sereina

> On 12 Jan 2018, at 19:17, Andy Jennings <andy.j.jenni...@gmail.com> wrote:
>
> Hi RDKitters,
>
> Whilst looking at generating some conformations of molecules using the ETKDG
> method with EmbedMultipleConfs I've come across some strange (to me) behavior.
>
> When I generate conformations of some molecules with the randomSeed as -1 the
> result is a variable number of conformations. That's not the strangest aspect
> though - some of the conformations are quite bizarre based upon any geometry
> rules I can think of. However, when the randomSeed is set to a fixed number
> the odd behavior goes away and I get only reasonable conformations.
>
> To illustrate here is some code (please no criticism of my terrible style!):
>
> ### CODE ###
> from rdkit import Chem
> from rdkit.Chem import AllChem
> import sys
>
> acamide = Chem.MolFromSmiles('O=C(NC=C)c1ccccc1')
> ETKDG = 1
> _seed = -1
> m = Chem.AddHs(acamide)
> n = 3
> ps = AllChem.ETKDG()
> ps.pruneRmsThresh = 0.5
> ps.numThreads = 0
> ps.randomSeed = _seed
> fixIt = 0
> for i in range(0,100):
>     ids = AllChem.EmbedMultipleConfs(m, n, ps)
>     if fixIt:
>         for _id in ids: AllChem.UFFOptimizeMolecule(m, confId = _id)
>     sys.stderr.write('%d,' % len(ids))
>     if len(ids) > 2:
>         outStream = Chem.SDWriter('test.sdf')
>         for _id in ids:
>             outStream.write(m,confId = _id)
>         outStream.flush()
>         outStream.close()
>         sys.stderr.write('\n')
>         break
>
> ### END CODE ###
>
>
> This takes the smiles string for a simple acrylamide and generates a max of 3
> conformations for the molecule. The loop runs 100 times and halts when 3
> conformations are found - which is the sign of a bad conformation being
> generated. When I run this the number of conformations generated each time
> varies between 1-3 and it does so differently from run to run.
>
> For instance:
> run #1:
> 2,2,1,1,2,2,2,2,2,2,1,2,2,1,2,1,2,1,2,2,1,2,1,1,1,2,2,2,2,2,1,2,2,2,2,2,2,2,1,2,2,1,2,2,2,2,1,1,2,2,3,
> run #2: 2,1,2,2,2,1,1,3,
> run #3: 2,2,2,1,2,2,2,2,1,2,2,1,2,1,2,2,3,
> and so on
>
> When I visually inspect test.sdf that results from a generation of 3
> conformers I find that one of the conformations has a very odd amide nitrogen
> geometry - almost linear between the heavy atoms.
>
> If I change _seed to a number such as '1' I get a single conformation for
> every run.
>
> If I implement the UFF optimization (with fixIt = 1) then I'll still get
> multiple conformations but they all look reasonable.
>
> So, I'm not sure if there is some systematic problem here or I'm just failing
> to understand the appropriate way to implement this form of conformational
> search. Any insights are welcome.
>
> Best,
> Andy
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org!
> http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss