Re: [Rdkit-discuss] Conformational search not "converging" to low energy conformation

2017-06-12 Thread Brian Kelley
There shouldn't be any expectation that the MMFF energy should converge to
1kcal/mol.

There *may* be a case to be made for the MMFF_energy - Global_minimum be <
1kcal/mol, however, in general, we don't know what the global minimum is.
I suggest looking at this paper:

https://www.ncbi.nlm.nih.gov/pubmed/15115393

This paper describes a basic technique for analyzing conformational
strain.  It is a bit out of date, however their rule of thumb for MMFF is
that most molecules in their bioactive form will be < 5kcal/mol but many
may be > 9kcal when compared against the sampled global minimum.

Two issues: (1) RDKIT doesn't make a regular sampling of conformations, it
relies on random embeddings.  I have no idea how this may effect finding
the global minim.  (2) Another issue is that the RDKIT molecule may have
slight deviations from MMFF angles and bond lengths that may artificially
make the results look worse.  We could solve this by optimizing in a
harmonic well of, say .5 angstroms to shrug off these high energies prior
to computing the actual MMFF energy.

I hope this helps,

-Brian

On Mon, Jun 12, 2017 at 2:23 PM, Jan Halborg Jensen 
wrote:

> The code below shows the lowest energy found for 6 different protomers
> defined by the smiles strings below as a function of number of conformers.
> Even with 2000 conformers I am not getting convergence to within 1 kcal/mol
> for comp109_1=2.
>
> Is this expected? Any advice or tips appreciated
>
> 200 800 1000 2500
> comp109_0=0 0.16 0.16 0.16 0.16
> comp109_1=1 -25.18 -24.43 -25.08 -25.18
> comp109_1=2 -16.42 -16.24 -16.21 -15.15
> comp109_0=3 -24.05 -24.16 -24.09 -23.96
> comp109_1=5 -38.4 -37.38 -38.28 -38.32
> comp109_2=8 0.18 1.38 -0.24 0.08
>
> Code
>
> import sys
> from rdkit import Chem
> from rdkit.Chem import AllChem
>
> confs = 2500
> e_cut = 20.0
> decimals_in_energies = 2
>
> filename = sys.argv[1]
> file = open(filename, "r")
>
> for line in file:
> words = line.split()
> name = words[0]
> smiles = words[1]
>
> m = Chem.AddHs(Chem.MolFromSmiles(smiles))
>
> AllChem.EmbedMultipleConfs(m,numConfs=confs)
> AllChem.MMFFOptimizeMoleculeConfs(m,numThreads=8,maxIters=1000,
> mmffVariant="MMFF94")
>
> energies = []
> for conf in m.GetConformers():
> tm = Chem.Mol(m,False,conf.GetId())
> prop = AllChem.MMFFGetMoleculeProperties(tm, mmffVariant="MMFF94")
> ff =AllChem.MMFFGetMoleculeForceField(tm,prop)
> energies.append(round(ff.CalcEnergy(),decimals_in_energies))
>
> e_min = min(energies)
> print e_min
>
>
> Smiles file
> comp109_0=0 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)C[C@H](C(F)(F)F)OC(
> N)=N1
> comp109_1=1 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)C[C@H](C(F)(F)F)OC(=
> [NH2+])N1
> comp109_1=2 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)c[nH+]3)ccc2F)C[C@H](C(F)(F)F)
> OC(N)=N1
> comp109_0=3 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)C[C@H](C(F)(F)F)OC(=
> N)N1
> comp109_1=5 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)c[nH+]3)ccc2F)C[C@H](C(F)(F)F)
> OC(=N)N1
> comp109_2=8 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)c[nH+]3)ccc2F)C[C@H](C(F)(F)F)
> OC(=[NH2+])N1
>
> 
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] Conformational search not "converging" to low energy conformation

2017-06-12 Thread Jan Halborg Jensen
The code below shows the lowest energy found for 6 different protomers defined 
by the smiles strings below as a function of number of conformers. Even with 
2000 conformers I am not getting convergence to within 1 kcal/mol for 
comp109_1=2.

Is this expected? Any advice or tips appreciated

200 800 10002500
comp109_0=0 0.160.160.160.16
comp109_1=1 -25.18  -24.43  -25.08  -25.18
comp109_1=2 -16.42  -16.24  -16.21  -15.15
comp109_0=3 -24.05  -24.16  -24.09  -23.96
comp109_1=5 -38.4   -37.38  -38.28  -38.32
comp109_2=8 0.181.38-0.24   0.08

Code

import sys
from rdkit import Chem
from rdkit.Chem import AllChem

confs = 2500
e_cut = 20.0
decimals_in_energies = 2

filename = sys.argv[1]
file = open(filename, "r")

for line in file:
words = line.split()
name = words[0]
smiles = words[1]

m = Chem.AddHs(Chem.MolFromSmiles(smiles))

AllChem.EmbedMultipleConfs(m,numConfs=confs)

AllChem.MMFFOptimizeMoleculeConfs(m,numThreads=8,maxIters=1000,mmffVariant="MMFF94")

energies = []
for conf in m.GetConformers():
tm = Chem.Mol(m,False,conf.GetId())
prop = AllChem.MMFFGetMoleculeProperties(tm, mmffVariant="MMFF94")
ff =AllChem.MMFFGetMoleculeForceField(tm,prop)
energies.append(round(ff.CalcEnergy(),decimals_in_energies))

e_min = min(energies)
print e_min


Smiles file
comp109_0=0 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)C[C@H](C(F)(F)F)OC(N)=N1
comp109_1=1 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)C[C@H](C(F)(F)F)OC(=[NH2+])N1
comp109_1=2 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)c[nH+]3)ccc2F)C[C@H](C(F)(F)F)OC(N)=N1
comp109_0=3 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)cn3)ccc2F)C[C@H](C(F)(F)F)OC(=N)N1
comp109_1=5 C[C@@]1(c2cc(NC(=O)c3ccc(C#N)c[nH+]3)ccc2F)C[C@H](C(F)(F)F)OC(=N)N1
comp109_2=8 
C[C@@]1(c2cc(NC(=O)c3ccc(C#N)c[nH+]3)ccc2F)C[C@H](C(F)(F)F)OC(=[NH2+])N1
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss