Re: [Rdkit-discuss] Conformer generation does not sample well?
These papers may be relevant: - http://pubs.acs.org/doi/full/10.1021/ci2004658 - https://jcheminf.springeropen.com/articles/10.1186/1758-2946-3-4 On Wed, Jun 29, 2016 at 1:29 AM, Sereinawrote: > Hi Tim, > > I had a look at the 1DWD example and I detail in the following the > analysis I did as it may be useful to other users. > > The conformer generation function has the option to print the experimental > torsion preferences that were used in the generation: > printExpTorsionAngles=True > > For 1DWD, it gives the following output: > > > ref = Chem.MolFromSmiles('NC(=[NH2+])c1ccc(C[C@ > @H](NC(=O)CNS(=O)(=O)c2ccc3c3c2)C(=O)N2C2)cc1’) > > mol1 = > Chem.MolFromPDBFile(RDConfig.RDBaseDir+'/rdkit/Chem/test_data/1DWD_ligand.pdb') > > mol1 = AllChem.AssignBondOrdersFromTemplate(ref, mol1) > > mol1 = Chem.AddHs(mol1) > > numConfs = 100 > > cids = AllChem.EmbedMultipleConfs(mol1, numConfs=numConfs, > useExpTorsionAnglePrefs=True, useBasicKnowledge=True, > printExpTorsionAngles=True) > O=[S:1](=O)[NX3H1:2]!@;-[CX4H2:3][!#1:4]: 0 13 14 15, (17.9, -13.3, 9.2, > 4.7, -2.3, -0.9) > O=[C:1][NX3H1:2]!@;-[CX4H1:3][H:4]: 15 17 18 48, (5, 0, 0, 0, 0, 0) > [O:1]=[CX3:2]!@;-[NX3H0:3][!#1:4]: 20 19 31 32, (0, 8, 0, 0, 0, 0) > [O:1]=[CX3:2]!@;-[NX3H1:3][!#1:4]: 16 15 17 18, (100, 0, 0, 0, 0, 0) > [c:1][S:2](=O)(=O)!@;-[NX3H1:3][C:4]: 4 0 13 14, (0, 16, 5, 7, 0, 0) > [aH1:1][c:2]([aH1])!@;-[SX4:3][!#1:4]: 3 4 0 13, (0, 1.5, 0, 0, 0, 0) > N[C:2](=[O:1])!@;-[CH2:3][N:4]: 16 15 14 13, (0, 10, 0, 0, 0, 0) > [!#1:1][CX4:2]!@;-[CX4:3][!#1:4]: 17 18 21 22, (0, 0, 7, 0, 0, 0) > [NH1:1][CX4:2]!@;-[CX3:3]=[O:4]: 17 18 19 20, (0, 2.1, -0.1, -1, -0.4, > -0.1) > [cH1:1][c:2]([cH1])!@;-[CX4H2:3][CX4:4]: 23 22 21 18, (0, 3.6, 0, 0, 0, 0) > [a:1][c:2]!@;-[C:3](=[N:4]): 25 27 28 30, (0, 5, 0, 0, 0, 0) > [*:1][X3,X2:2]=[X3,X2:3][*:4]: 27 28 30 57, (0, 100, 0, 0, 0, 0) > > The output is structured as follows: First comes the SMARTS pattern, then > the indices of the four atoms involved and finally the six force constants > K_1, K_2, K_3, K_4, K_5, K_6 for the torsion potential (see Eq. (2) in > JCIM, 55, 2562, 2015). > The position of the non-zero force constant tells you the multiplicity i > of the cosine. E.g. for the torsion between atoms 18 and 21 the third force > constant is 7.0 and all others are zero, thus the torsion potential for > this bond has a multiplicity i = 3 (i.e. three maxima). The information > about the phase shift is not exposed, but we can change that if people find > it useful. > Currently to get the full information about the torsion potential, you > need to search for the SMARTS pattern in > Code/GraphMol/ForceFieldHelpers/CrystalFF/TorsionPreferences.cpp > There you find the line: "[!#1:1][CX4:2]!@;-[CX4:3][!#1:4] 1 0.0 1 0.0 1 > 7.0 1 0.0 1 0.0 1 0.0\n" > Here we have twelve parameters, always two for a given multiplicity: first > the phase shift (can be 1 or -1) and second the force constant. > For this particular torsion pattern we have a multiplicity of 3, a phase > shift of cos(d) = 1 and a force constant K_3 = 7.0, which corresponds to > three peaks at 60°, 180° and 300° in the range [0°, 360°] (or -60°, 60° and > 180° in the range [-180°, 180°]). > > The two bonds connecting the benzamidine ring to the rest of the molecule > are between atoms 22/21 and 21/18. > I calculated the signed dihedral angle for these two bonds, here is the > code for the second one: > > > angles = [] > > for cid in cids: > >conf = mol1.GetConformer(id=cid) > ># torsion of atoms 17 18 21 22 > >p1 = conf.GetAtomPosition(17) > >p2 = conf.GetAtomPosition(18) > >p3 = conf.GetAtomPosition(21) > >p4 = conf.GetAtomPosition(22) > >a = Geometry.ComputeSignedDihedralAngle(p1, p2, p3, p4)/math.pi * > 180.0 > >angles.append(a) > > For 100 confs, this gives the following distribution: > So, the torsion potential is sampled as it is supposed to do. > > > For the other bond between atmos 22/21, we have the following line in > TorsionPreferences.cpp: > "[cH1:1][c:2]([cH1])!@;-[CX4H2:3][CX4:4] 1 0.0 1 3.6 1 0.0 1 0.0 1 0.0 1 > 0.0\n” > This means, we have a multiplicity of 2, a phase shift cos(d) = 1 and a > force constant of 3.6, which corresponds to two peaks at -120° and 120° in > the range [-180°, 180°]. > The histogram looks like: > Again, the potential is largely sampled as it’s supposed to be. > > Now, let’s have a look at the two bonds connecting atom 18 to the other > two branches. > For the bond between atoms 18/17, we should have a single peak at 0°, > which we get most of the time > > The bond between atoms 18/19 has a multi-term potential, which is a bit > more difficult to interpret from the numbers alone. > "[NH1:1][CX4:2]!@;-[CX3:3]=[O:4] 1 0.0 -1 2.1 -1 -0.1 -1 -1.0 1 -0.4 1 > -0.1\n" > It has three peaks at -150°, 0° and 150° in the range [-180°, 180°]. The > peak at 0° is broad. > The histogram looks like: > For the 100 conformers, the peaks at -150° and 150°
Re: [Rdkit-discuss] Conformer generation does not sample well?
Hi Tim, I had a look at the 1DWD example and I detail in the following the analysis I did as it may be useful to other users. The conformer generation function has the option to print the experimental torsion preferences that were used in the generation: printExpTorsionAngles=True For 1DWD, it gives the following output: > ref = > Chem.MolFromSmiles('NC(=[NH2+])c1ccc(C[C@@H](NC(=O)CNS(=O)(=O)c2ccc3c3c2)C(=O)N2C2)cc1’) > mol1 = > Chem.MolFromPDBFile(RDConfig.RDBaseDir+'/rdkit/Chem/test_data/1DWD_ligand.pdb') > mol1 = AllChem.AssignBondOrdersFromTemplate(ref, mol1) > mol1 = Chem.AddHs(mol1) > numConfs = 100 > cids = AllChem.EmbedMultipleConfs(mol1, numConfs=numConfs, > useExpTorsionAnglePrefs=True, useBasicKnowledge=True, > printExpTorsionAngles=True) O=[S:1](=O)[NX3H1:2]!@;-[CX4H2:3][!#1:4]: 0 13 14 15, (17.9, -13.3, 9.2, 4.7, -2.3, -0.9) O=[C:1][NX3H1:2]!@;-[CX4H1:3][H:4]: 15 17 18 48, (5, 0, 0, 0, 0, 0) [O:1]=[CX3:2]!@;-[NX3H0:3][!#1:4]: 20 19 31 32, (0, 8, 0, 0, 0, 0) [O:1]=[CX3:2]!@;-[NX3H1:3][!#1:4]: 16 15 17 18, (100, 0, 0, 0, 0, 0) [c:1][S:2](=O)(=O)!@;-[NX3H1:3][C:4]: 4 0 13 14, (0, 16, 5, 7, 0, 0) [aH1:1][c:2]([aH1])!@;-[SX4:3][!#1:4]: 3 4 0 13, (0, 1.5, 0, 0, 0, 0) N[C:2](=[O:1])!@;-[CH2:3][N:4]: 16 15 14 13, (0, 10, 0, 0, 0, 0) [!#1:1][CX4:2]!@;-[CX4:3][!#1:4]: 17 18 21 22, (0, 0, 7, 0, 0, 0) [NH1:1][CX4:2]!@;-[CX3:3]=[O:4]: 17 18 19 20, (0, 2.1, -0.1, -1, -0.4, -0.1) [cH1:1][c:2]([cH1])!@;-[CX4H2:3][CX4:4]: 23 22 21 18, (0, 3.6, 0, 0, 0, 0) [a:1][c:2]!@;-[C:3](=[N:4]): 25 27 28 30, (0, 5, 0, 0, 0, 0) [*:1][X3,X2:2]=[X3,X2:3][*:4]: 27 28 30 57, (0, 100, 0, 0, 0, 0) The output is structured as follows: First comes the SMARTS pattern, then the indices of the four atoms involved and finally the six force constants K_1, K_2, K_3, K_4, K_5, K_6 for the torsion potential (see Eq. (2) in JCIM, 55, 2562, 2015). The position of the non-zero force constant tells you the multiplicity i of the cosine. E.g. for the torsion between atoms 18 and 21 the third force constant is 7.0 and all others are zero, thus the torsion potential for this bond has a multiplicity i = 3 (i.e. three maxima). The information about the phase shift is not exposed, but we can change that if people find it useful. Currently to get the full information about the torsion potential, you need to search for the SMARTS pattern in Code/GraphMol/ForceFieldHelpers/CrystalFF/TorsionPreferences.cpp There you find the line: "[!#1:1][CX4:2]!@;-[CX4:3][!#1:4] 1 0.0 1 0.0 1 7.0 1 0.0 1 0.0 1 0.0\n" Here we have twelve parameters, always two for a given multiplicity: first the phase shift (can be 1 or -1) and second the force constant. For this particular torsion pattern we have a multiplicity of 3, a phase shift of cos(d) = 1 and a force constant K_3 = 7.0, which corresponds to three peaks at 60°, 180° and 300° in the range [0°, 360°] (or -60°, 60° and 180° in the range [-180°, 180°]). The two bonds connecting the benzamidine ring to the rest of the molecule are between atoms 22/21 and 21/18. I calculated the signed dihedral angle for these two bonds, here is the code for the second one: > angles = [] > for cid in cids: >conf = mol1.GetConformer(id=cid) ># torsion of atoms 17 18 21 22 >p1 = conf.GetAtomPosition(17) >p2 = conf.GetAtomPosition(18) >p3 = conf.GetAtomPosition(21) >p4 = conf.GetAtomPosition(22) >a = Geometry.ComputeSignedDihedralAngle(p1, p2, p3, p4)/math.pi * 180.0 >angles.append(a) For 100 confs, this gives the following distribution: So, the torsion potential is sampled as it is supposed to do. For the other bond between atmos 22/21, we have the following line in TorsionPreferences.cpp: "[cH1:1][c:2]([cH1])!@;-[CX4H2:3][CX4:4] 1 0.0 1 3.6 1 0.0 1 0.0 1 0.0 1 0.0\n” This means, we have a multiplicity of 2, a phase shift cos(d) = 1 and a force constant of 3.6, which corresponds to two peaks at -120° and 120° in the range [-180°, 180°]. The histogram looks like: Again, the potential is largely sampled as it’s supposed to be. Now, let’s have a look at the two bonds connecting atom 18 to the other two branches. For the bond between atoms 18/17, we should have a single peak at 0°, which we get most of the time The bond between atoms 18/19 has a multi-term potential, which is a bit more difficult to interpret from the numbers alone. "[NH1:1][CX4:2]!@;-[CX3:3]=[O:4] 1 0.0 -1 2.1 -1 -0.1 -1 -1.0 1 -0.4 1 -0.1\n" It has three peaks at -150°, 0° and 150° in the range [-180°, 180°]. The peak at 0° is broad. The histogram looks like: For the 100 conformers, the peaks at -150° and 150° are not sampled. I therefore generated 1000 conformers, but the sampling does not really improve: The algorithm of ETKDG takes a randomly generated conformer from standard distance geometry and minimizes it with the ET and K terms. For each bond with an exp. torsion term, the torsional angle will be minimized to the closest minima. In other words, ETKDG relies
[Rdkit-discuss] Conformer generation does not sample well?
This topic (https://sourceforge.net/p/rdkit/mailman/message/35173301/) discussed using conformer generation as input into Open3DAlign. One thing I noticed is that the conformer generation (using the useExpTorsionAnglePrefs=True and useBasicKnowledge=True options) does not generate conformers that align well for this example. The input is based on the 1DWD_ligand.pdb structure in the RDKit distro. What I find is the the rotation of the benzamidine ring is never in the right place for alignment (the other two ring systems align well). This is when generating up to 10,000 conformers. Does this suggest that the conformer generation does not sample conformational space very effectively? Are there options for improving this? Thanks Tim -- Attend Shape: An AT Tech Expo July 15-16. Meet us at AT Park in San Francisco, CA to explore cutting-edge tech and listen to tech luminaries present their vision of the future. This family event has something for everyone, including kids. Get more information and register today. http://sdm.link/attshape ___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss