Re: [Rdkit-discuss] Inchi/smiles conversion issue
Yes, this is a well-known problem. First of all, if more than one chemist is present, you can always have a long discussion about what the most stable tautomeric form of a given compound (under certain conditions) might be. With InChI, however, if you ask the algorithm for the tautomer-invariant representation of a compound, i.e. the canonical tautomer (and the Standard InChI does this inherently), everybody agrees that in quite a few cases the tautomer the InChI algorithm chooses as the canonical one is rather odd :-)

Markus

-
| Markus Sitzmann
| markus.sitzm...@gmail.com

> On 18. Jun 2019, at 18:41, Alexis Parenty wrote:
> [...]

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] Inchi/smiles conversion issue
Dear Jennifer,

Many thanks for your response. Very useful tutorial on InChI. I did not know about the FixedH option:

    inchi = Chem.MolToInchi(mol, options='/FixedH')

Best,
Alexis

On Tue, 18 Jun 2019 at 13:20, Jennifer Hemmerich wrote:
> [...]
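As a rough illustration of the FixedH option mentioned above, the round trip from the original post can be run once with the default (Standard) InChI and once with '/FixedH'. This is only a sketch: the helper name roundtrip_smiles is made up for this example, and which tautomer comes back may depend on the RDKit and InChI versions in use, so no particular output is claimed here.

```python
from rdkit import Chem


def roundtrip_smiles(smiles, options=""):
    """SMILES -> InChI -> molecule -> canonical SMILES (hypothetical helper)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    inchi = Chem.MolToInchi(mol, options=options)
    back = Chem.MolFromInchi(inchi)
    if back is None:
        raise ValueError(f"could not read InChI back: {inchi!r}")
    return Chem.MolToSmiles(back)


# Input tautomer from the original post.
original = "Cc1ccc([nH]nc2)c2c1"

canonical_input = Chem.MolToSmiles(Chem.MolFromSmiles(original))
via_standard = roundtrip_smiles(original)                    # Standard InChI
via_fixed_h = roundtrip_smiles(original, options="/FixedH")  # non-standard InChI

print("input (canonical):  ", canonical_input)
print("via Standard InChI: ", via_standard)
print("via FixedH InChI:   ", via_fixed_h)
```

Comparing the last two lines with the first shows whether the hydrogen position survived each route on a given installation.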
Re: [Rdkit-discuss] Inchi/smiles conversion issue
Dear Alexis,

If you calculate the Standard InChI, it is invariant to tautomers (see https://www.inchi-trust.org/technical-faq-2/#6.1). The information about which tautomer was converted is therefore lost in the InChI conversion. If you want to keep the tautomer information, you need to use the FixedH option when generating the InChI. But beware: this makes it a non-standard InChI, so it might not be comparable to other InChIs.

Hope this helps,

Jennifer

On 18.06.19 12:59, Alexis Parenty wrote:
> [...]
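The tautomer invariance described above can be checked directly. The sketch below takes the two SMILES from the original post (the input and the round-trip output), generates a Standard InChI and a FixedH InChI for each, and compares them. The helper name inchi_for is made up for this example; the '/FixedH' option string is the one used elsewhere in this thread.

```python
from rdkit import Chem

# Two tautomers of the same compound, both taken from the original post.
TAUTOMER_1H = "Cc1ccc([nH]nc2)c2c1"  # the input SMILES
TAUTOMER_2H = "Cc1ccc2n[nH]cc2c1"    # the SMILES returned by the round trip


def inchi_for(smiles, options=""):
    """Return the InChI for a SMILES string (hypothetical helper)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return Chem.MolToInchi(mol, options=options)


std_1h = inchi_for(TAUTOMER_1H)
std_2h = inchi_for(TAUTOMER_2H)
fixed_1h = inchi_for(TAUTOMER_1H, options="/FixedH")
fixed_2h = inchi_for(TAUTOMER_2H, options="/FixedH")

# The Standard InChI treats the N-H as a mobile hydrogen, so both tautomers
# collapse to one string; the FixedH layer records where the H actually sits.
print("Standard InChI identical:", std_1h == std_2h)
print("FixedH InChI identical:  ", fixed_1h == fixed_2h)
```

The FixedH strings start with "InChI=1/" rather than "InChI=1S/", which is the visible sign that they are non-standard and should not be mixed with Standard InChIs when comparing or deduplicating compounds.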
[Rdkit-discuss] Inchi/smiles conversion issue
Dear RDKiters,

Why is the stable tautomer of the following structure lost during InChI/SMILES conversion?

[image: image.png]

    from rdkit import Chem

    mol = Chem.MolFromSmiles("Cc1ccc([nH]nc2)c2c1")
    inchi = Chem.MolToInchi(mol)
    mol = Chem.MolFromInchi(inchi)
    smiles = Chem.MolToSmiles(mol)
    print(smiles)

    ==> Cc1ccc2n[nH]cc2c1

The H has shifted to the wrong nitrogen…

Interestingly, if you remove the methyl, the shift no longer happens:

    mol = Chem.MolFromSmiles("c1([nH]nc2)c21")
    inchi = Chem.MolToInchi(mol)
    mol = Chem.MolFromInchi(inchi)
    smiles = Chem.MolToSmiles(mol)
    print(smiles)

    ==> c1([nH]nc2)c21

The same issue occurs for any secondary amide: if you pass the SMILES of a secondary amide, you end up with the following unstable tautomer:

[image: image.png]

Thanks,

Alexis