Re: [Rdkit-discuss] Implicit Hydrogens On Aromatic Hetereoatoms
Greg, That fixed it! Thanks so much, that makes a lot more sense now. -Chris On Fri, Oct 6, 2017 at 1:25 AM, Greg Landrum wrote: > Hi Chris, > > There's an additional step performed during sanitization that recognizes > that the implicit H needs to be on the N. The steps of a normal full > molecular sanitization operation are documented here: > http://www.rdkit.org/docs/RDKit_Book.html#molecular-sanitization > > The adjustHs() function is not exposed directly to Python, but you can > take care of aromaticity assignment and adjustHs in a single call with the > SanitizeMol() function: > > In [21]: m = Chem.MolFromMolBlock(mb,sanitize=False) > > In [22]: Chem.SanitizeMol(m,sanitizeOps=Chem.SANITIZE_SETAROMATICITY| > Chem.SANITIZE_ADJUSTHS) > Out[22]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE > > In [23]: Chem.MolToSmiles(m) > Out[23]: 'CC(C)c12c(Cl)cc(-c3ccc(-c4ccc(C(=O)O)cc4)[nH]3)nc12' > > I highlighted the N in the heterocyle with the implicit H. > > Hopefully this helps. > > Best, > -greg > p.s. Note: while testing parts of this answer I uncovered a bug in > AdjustHs() that causes it to fail for molecules that include atoms with > "bad valences": https://github.com/rdkit/rdkit/issues/1605 This looks > like it should be easy to fix for the upcoming release. > > > On Thu, Oct 5, 2017 at 11:06 PM, Chris Murphy < > chris.mur...@schrodinger.com> wrote: > >> Hi! >> >> I'm running at an issue with implicit hydrogens on aromatic heteroatoms. >> I am feeding the following sdf into a mol object: >> >> >> Mrv16c5 10021719092D >> >> 28 31 0 0 0 0999 V2000 >>-2.90000.20760. C 0 0 0 0 0 0 0 0 0 0 0 0 >>-2.18710.62280. N 0 0 0 0 0 0 0 0 0 0 0 0 >> 0.00630.29570. N 0 0 0 0 0 0 0 0 0 0 0 0 >>-0.74860.63540. C 0 0 0 0 0 0 0 0 0 0 0 0 >>-1.46570.21180. C 0 0 0 0 0 0 0 0 0 0 0 0 >> 0.55990.91640. C 0 0 0 0 0 0 0 0 0 0 0 0 >>-2.9000 -0.62280. C 0 0 0 0 0 0 0 0 0 0 0 0 >>-2.1829 -1.03800. C 0 0 0 0 0 0 0 0 0 0 0 0 >>-1.4657 -0.62280. C 0 0 0 0 0 0 0 0 0 0 0 0 >>-3.62140.62280. C 0 0 0 0 0 0 0 0 0 0 0 0 >>-0.67311.45740. C 0 0 0 0 0 0 0 0 0 0 0 0 >> 0.14051.63350. C 0 0 0 0 0 0 0 0 0 0 0 0 >> 3.86460.59760. C 0 0 0 0 0 0 0 0 0 0 0 0 >> 1.38600.83670. C 0 0 0 0 0 0 0 0 0 0 0 0 >> 3.03840.67310. C 0 0 0 0 0 0 0 0 0 0 0 0 >> 4.34691.26870. O 0 0 0 0 0 0 0 0 0 0 0 0 >> 1.72570.08180. C 0 0 0 0 0 0 0 0 0 0 0 0 >> 1.86411.51610. C 0 0 0 0 0 0 0 0 0 0 0 0 >> 2.68611.43220. C 0 0 0 0 0 0 0 0 0 0 0 0 >> 2.5477 -0.00210. C 0 0 0 0 0 0 0 0 0 0 0 0 >>-2.1829 -1.86830. Cl 0 0 0 0 0 0 0 0 0 0 0 0 >>-3.62971.45320. C 0 0 0 0 0 0 0 0 0 0 0 0 >> 4.2085 -0.16560. O 0 0 0 0 0 0 0 0 0 0 0 0 >>-3.6214 -1.04630. C 0 0 0 0 0 0 0 0 0 0 0 0 >>-4.33850.19920. C 0 0 0 0 0 0 0 0 0 0 0 0 >>-4.3385 -0.63110. C 0 0 0 0 0 0 0 0 0 0 0 0 >>-4.34691.86000. C 0 0 0 0 0 0 0 0 0 0 0 0 >>-2.90421.86830. C 0 0 0 0 0 0 0 0 0 0 0 0 >> 2 1 1 0 0 0 0 >> 3 4 1 0 0 0 0 >> 4 5 1 0 0 0 0 >> 5 2 2 0 0 0 0 >> 6 3 1 0 0 0 0 >> 7 1 1 0 0 0 0 >> 8 7 1 0 0 0 0 >> 9 8 2 0 0 0 0 >> 10 1 2 0 0 0 0 >> 11 4 2 0 0 0 0 >> 12 11 1 0 0 0 0 >> 13 15 1 0 0 0 0 >> 14 6 1 0 0 0 0 >> 15 19 1 0 0 0 0 >> 16 13 2 0 0 0 0 >> 17 14 2 0 0 0 0 >> 18 14 1 0 0 0 0 >> 19 18 2 0 0 0 0 >> 20 17 1 0 0 0 0 >> 21 8 1 0 0 0 0 >> 22 10 1 0 0 0 0 >> 23 13 1 0 0 0 0 >> 24 7 2 0 0 0 0 >> 25 10 1 0 0 0 0 >> 26 25 2 0 0 0 0 >> 27 22 1 0 0 0 0 >> 28 22 1 0 0 0 0 >> 24 26 1 0 0 0 0 >> 9 5 1 0 0 0 0 >> 6 12 2 0 0 0 0 >> 20 15 2 0 0 0 0 >> M END >> >> >> The nitrogen in the heterocycle should have 1 implicit hydrogen on it, >> and when I look at it after initially creating the mol object, it does. I >> want to convert it to aromatic form, so I am calling >> rdmolops.SetAromatize(mol). Once I do this however, it seems that the >> implicit hydrogen on the nitrogen is removed, which then causes an error to >> be thrown if I ever try to convert it back to kekule form or do any kind of >> sanitization. Maybe my understanding of aromaticity is wrong, but shouldn't >> the hydrogen be on the nitrogen re
Re: [Rdkit-discuss] Implicit Hydrogens On Aromatic Hetereoatoms
Hi Chris, There's an additional step performed during sanitization that recognizes that the implicit H needs to be on the N. The steps of a normal full molecular sanitization operation are documented here: http://www.rdkit.org/docs/RDKit_Book.html#molecular-sanitization The adjustHs() function is not exposed directly to Python, but you can take care of aromaticity assignment and adjustHs in a single call with the SanitizeMol() function: In [21]: m = Chem.MolFromMolBlock(mb,sanitize=False) In [22]: Chem.SanitizeMol(m,sanitizeOps=Chem.SANITIZE_ SETAROMATICITY|Chem.SANITIZE_ADJUSTHS) Out[22]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE In [23]: Chem.MolToSmiles(m) Out[23]: 'CC(C)c12c(Cl)cc(-c3ccc(-c4ccc(C(=O)O)cc4)[nH]3)nc12' I highlighted the N in the heterocyle with the implicit H. Hopefully this helps. Best, -greg p.s. Note: while testing parts of this answer I uncovered a bug in AdjustHs() that causes it to fail for molecules that include atoms with "bad valences": https://github.com/rdkit/rdkit/issues/1605 This looks like it should be easy to fix for the upcoming release. On Thu, Oct 5, 2017 at 11:06 PM, Chris Murphy wrote: > Hi! > > I'm running at an issue with implicit hydrogens on aromatic heteroatoms. I > am feeding the following sdf into a mol object: > > > Mrv16c5 10021719092D > > 28 31 0 0 0 0999 V2000 >-2.90000.20760. C 0 0 0 0 0 0 0 0 0 0 0 0 >-2.18710.62280. N 0 0 0 0 0 0 0 0 0 0 0 0 > 0.00630.29570. N 0 0 0 0 0 0 0 0 0 0 0 0 >-0.74860.63540. C 0 0 0 0 0 0 0 0 0 0 0 0 >-1.46570.21180. C 0 0 0 0 0 0 0 0 0 0 0 0 > 0.55990.91640. C 0 0 0 0 0 0 0 0 0 0 0 0 >-2.9000 -0.62280. C 0 0 0 0 0 0 0 0 0 0 0 0 >-2.1829 -1.03800. C 0 0 0 0 0 0 0 0 0 0 0 0 >-1.4657 -0.62280. C 0 0 0 0 0 0 0 0 0 0 0 0 >-3.62140.62280. C 0 0 0 0 0 0 0 0 0 0 0 0 >-0.67311.45740. C 0 0 0 0 0 0 0 0 0 0 0 0 > 0.14051.63350. C 0 0 0 0 0 0 0 0 0 0 0 0 > 3.86460.59760. C 0 0 0 0 0 0 0 0 0 0 0 0 > 1.38600.83670. C 0 0 0 0 0 0 0 0 0 0 0 0 > 3.03840.67310. C 0 0 0 0 0 0 0 0 0 0 0 0 > 4.34691.26870. O 0 0 0 0 0 0 0 0 0 0 0 0 > 1.72570.08180. C 0 0 0 0 0 0 0 0 0 0 0 0 > 1.86411.51610. C 0 0 0 0 0 0 0 0 0 0 0 0 > 2.68611.43220. C 0 0 0 0 0 0 0 0 0 0 0 0 > 2.5477 -0.00210. C 0 0 0 0 0 0 0 0 0 0 0 0 >-2.1829 -1.86830. Cl 0 0 0 0 0 0 0 0 0 0 0 0 >-3.62971.45320. C 0 0 0 0 0 0 0 0 0 0 0 0 > 4.2085 -0.16560. O 0 0 0 0 0 0 0 0 0 0 0 0 >-3.6214 -1.04630. C 0 0 0 0 0 0 0 0 0 0 0 0 >-4.33850.19920. C 0 0 0 0 0 0 0 0 0 0 0 0 >-4.3385 -0.63110. C 0 0 0 0 0 0 0 0 0 0 0 0 >-4.34691.86000. C 0 0 0 0 0 0 0 0 0 0 0 0 >-2.90421.86830. C 0 0 0 0 0 0 0 0 0 0 0 0 > 2 1 1 0 0 0 0 > 3 4 1 0 0 0 0 > 4 5 1 0 0 0 0 > 5 2 2 0 0 0 0 > 6 3 1 0 0 0 0 > 7 1 1 0 0 0 0 > 8 7 1 0 0 0 0 > 9 8 2 0 0 0 0 > 10 1 2 0 0 0 0 > 11 4 2 0 0 0 0 > 12 11 1 0 0 0 0 > 13 15 1 0 0 0 0 > 14 6 1 0 0 0 0 > 15 19 1 0 0 0 0 > 16 13 2 0 0 0 0 > 17 14 2 0 0 0 0 > 18 14 1 0 0 0 0 > 19 18 2 0 0 0 0 > 20 17 1 0 0 0 0 > 21 8 1 0 0 0 0 > 22 10 1 0 0 0 0 > 23 13 1 0 0 0 0 > 24 7 2 0 0 0 0 > 25 10 1 0 0 0 0 > 26 25 2 0 0 0 0 > 27 22 1 0 0 0 0 > 28 22 1 0 0 0 0 > 24 26 1 0 0 0 0 > 9 5 1 0 0 0 0 > 6 12 2 0 0 0 0 > 20 15 2 0 0 0 0 > M END > > > The nitrogen in the heterocycle should have 1 implicit hydrogen on it, and > when I look at it after initially creating the mol object, it does. I want > to convert it to aromatic form, so I am calling rdmolops.SetAromatize(mol). > Once I do this however, it seems that the implicit hydrogen on the nitrogen > is removed, which then causes an error to be thrown if I ever try to > convert it back to kekule form or do any kind of sanitization. Maybe my > understanding of aromaticity is wrong, but shouldn't the hydrogen be on the > nitrogen regardless of whether or not it is considered to be in an aromatic > form? > > I could be misunderstanding rdkit's aromaticity models, but for my > purposes, I want to be able to convert the mol to either kekule or aromatic > form depending on some configuration settings. Is there a way to manipulate > t
[Rdkit-discuss] Implicit Hydrogens On Aromatic Hetereoatoms
Hi! I'm running at an issue with implicit hydrogens on aromatic heteroatoms. I am feeding the following sdf into a mol object: Mrv16c5 10021719092D 28 31 0 0 0 0999 V2000 -2.90000.20760. C 0 0 0 0 0 0 0 0 0 0 0 0 -2.18710.62280. N 0 0 0 0 0 0 0 0 0 0 0 0 0.00630.29570. N 0 0 0 0 0 0 0 0 0 0 0 0 -0.74860.63540. C 0 0 0 0 0 0 0 0 0 0 0 0 -1.46570.21180. C 0 0 0 0 0 0 0 0 0 0 0 0 0.55990.91640. C 0 0 0 0 0 0 0 0 0 0 0 0 -2.9000 -0.62280. C 0 0 0 0 0 0 0 0 0 0 0 0 -2.1829 -1.03800. C 0 0 0 0 0 0 0 0 0 0 0 0 -1.4657 -0.62280. C 0 0 0 0 0 0 0 0 0 0 0 0 -3.62140.62280. C 0 0 0 0 0 0 0 0 0 0 0 0 -0.67311.45740. C 0 0 0 0 0 0 0 0 0 0 0 0 0.14051.63350. C 0 0 0 0 0 0 0 0 0 0 0 0 3.86460.59760. C 0 0 0 0 0 0 0 0 0 0 0 0 1.38600.83670. C 0 0 0 0 0 0 0 0 0 0 0 0 3.03840.67310. C 0 0 0 0 0 0 0 0 0 0 0 0 4.34691.26870. O 0 0 0 0 0 0 0 0 0 0 0 0 1.72570.08180. C 0 0 0 0 0 0 0 0 0 0 0 0 1.86411.51610. C 0 0 0 0 0 0 0 0 0 0 0 0 2.68611.43220. C 0 0 0 0 0 0 0 0 0 0 0 0 2.5477 -0.00210. C 0 0 0 0 0 0 0 0 0 0 0 0 -2.1829 -1.86830. Cl 0 0 0 0 0 0 0 0 0 0 0 0 -3.62971.45320. C 0 0 0 0 0 0 0 0 0 0 0 0 4.2085 -0.16560. O 0 0 0 0 0 0 0 0 0 0 0 0 -3.6214 -1.04630. C 0 0 0 0 0 0 0 0 0 0 0 0 -4.33850.19920. C 0 0 0 0 0 0 0 0 0 0 0 0 -4.3385 -0.63110. C 0 0 0 0 0 0 0 0 0 0 0 0 -4.34691.86000. C 0 0 0 0 0 0 0 0 0 0 0 0 -2.90421.86830. C 0 0 0 0 0 0 0 0 0 0 0 0 2 1 1 0 0 0 0 3 4 1 0 0 0 0 4 5 1 0 0 0 0 5 2 2 0 0 0 0 6 3 1 0 0 0 0 7 1 1 0 0 0 0 8 7 1 0 0 0 0 9 8 2 0 0 0 0 10 1 2 0 0 0 0 11 4 2 0 0 0 0 12 11 1 0 0 0 0 13 15 1 0 0 0 0 14 6 1 0 0 0 0 15 19 1 0 0 0 0 16 13 2 0 0 0 0 17 14 2 0 0 0 0 18 14 1 0 0 0 0 19 18 2 0 0 0 0 20 17 1 0 0 0 0 21 8 1 0 0 0 0 22 10 1 0 0 0 0 23 13 1 0 0 0 0 24 7 2 0 0 0 0 25 10 1 0 0 0 0 26 25 2 0 0 0 0 27 22 1 0 0 0 0 28 22 1 0 0 0 0 24 26 1 0 0 0 0 9 5 1 0 0 0 0 6 12 2 0 0 0 0 20 15 2 0 0 0 0 M END The nitrogen in the heterocycle should have 1 implicit hydrogen on it, and when I look at it after initially creating the mol object, it does. I want to convert it to aromatic form, so I am calling rdmolops.SetAromatize(mol). Once I do this however, it seems that the implicit hydrogen on the nitrogen is removed, which then causes an error to be thrown if I ever try to convert it back to kekule form or do any kind of sanitization. Maybe my understanding of aromaticity is wrong, but shouldn't the hydrogen be on the nitrogen regardless of whether or not it is considered to be in an aromatic form? I could be misunderstanding rdkit's aromaticity models, but for my purposes, I want to be able to convert the mol to either kekule or aromatic form depending on some configuration settings. Is there a way to manipulate the hydogrens on an atom in this case? Thanks! Chris -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot___ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss