I think I do vaguely remember that InChI gives precedence to 3D coordinates if present over anything else for the determination of stereochemistry. And I think that is what happens here: the Allchem embedding of the molecule adds 3D coordinates which are not present for the original molecule create straight from InChI. Probably the minimization of the structure during the embedding is “turning around” the stereochemistry (probably you could have a long discussion whether this is a bug or a feature),
Markus ------------------------------------- | Markus Sitzmann | markus.sitzm...@gmail.com > On 18. Dec 2018, at 19:43, Jason Biggs <jasondbi...@gmail.com> wrote: > > see https://github.com/rdkit/rdkit/issues/1852, and > https://sourceforge.net/p/rdkit/mailman/message/36309813/ > > You can see it in the smiles if you remove stereo after embedding, then > re-detect stereo from the conformation. > > inchi1 = > "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1" > m1 = Chem.MolFromInchi(inchi1) > m1 = Chem.AddHs(m1) > m2 = Chem.Mol(m1) > AllChem.EmbedMolecule(m2) > m3 = Chem.Mol(m2) > Chem.rdmolops.RemoveStereochemistry(m3) > Chem.rdmolops.AssignStereochemistryFrom3D(m3) > sm1 = Chem.MolToSmiles(m1) > sm2 = Chem.MolToSmiles(m2) > sm3 = Chem.MolToSmiles(m3) > print(sm1 == sm2) # returns true > print(sm2 == sm3) # returns false > > The difference between sm2 and sm3 is just swapping a \ for a /, confirming > what Christos was able to read from the InChI. > > Why does the inchi reflect the 3D bond stereo but the smiles doesn't until > you remove and re-detect the stereo? Does the InChI code go to the 3D > structure when present and ignore stereo information in the mol object? > > Jason Biggs > > >> On Tue, Dec 18, 2018 at 12:14 PM Christos Kannas <chriskan...@gmail.com> >> wrote: >> Hi Jean-Marc, >> >> There difference is due to bond orientation (if my inchi analysis skills are >> correct). >> See the bold bond layer below (14-7+ vs 14-7-). >> >> m1 -> >> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1 >> >> m2 -> >> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7-/t17-,19-/m1/s1 >> >> Not sure why it happens, but I've seen it multiple times... >> >> Best, >> Christos >> >> Christos Kannas >> >> Chem[o]informatics Researcher & Software Developer >> >> >> >> >> >>> On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD >>> <jm.nuzill...@univ-reims.fr> wrote: >>> Thank you for your answer but alatis might not be adapted to my current >>> problem. >>> >>> Attempting to understand what was changed by the embedding step I wrote: >>> >>> inchi1 = >>> "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1" >>> m1 = Chem.MolFromInchi(inchi1) >>> m1 = Chem.AddHs(m1) >>> m2 = Chem.Mol(m1) >>> AllChem.EmbedMolecule(m2) >>> sm1 = Chem.MolToSmiles(m1) >>> sm2 = Chem.MolToSmiles(m2) >>> print(sm1) >>> print(sm2) >>> print(sm1 == sm2) >>> inc1 = Chem.MolToInchi(m1) >>> inc2 = Chem.MolToInchi(m2) >>> print(inc1) >>> print(inc2) >>> print(inc1 == inc2) >>> >>> Molecules m1 and m2 have identical SMILES representations >>> but different InChI representations, which I find odd. >>> >>> All the best, >>> >>> Jean-Marc >>> >>> >>> >>> >>> Le 18/12/2018 00:40, Dimitri Maziuk via Rdkit-discuss a écrit : >>> > On 12/17/18 4:50 PM, JEAN-MARC NUZILLARD wrote: >>> >> Is there any more deterministic procedure than the one of trying until >>> >> success is obtained? >>> >> >>> >> How do I determine the InChI string of a conformer obtained after >>> >> multiple embedding? >>> > >>> > This representation keeps 3D config: http://alatis.nmrfam.wisc.edu/ >>> > >>> > Generally speaking the problem with InChI is that the only *required* >>> > layer is the formula. Therefore *an* InChI string cannot be used to >>> > differentiate conformers, you need the InChI string with all the >>> > relevant layers and all the protons. >>> > >>> > https://www.nature.com/articles/sdata201773 >>> > >>> > _______________________________________________ >>> > Rdkit-discuss mailing list >>> > Rdkit-discuss@lists.sourceforge.net >>> > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >>> >>> >>> _______________________________________________ >>> Rdkit-discuss mailing list >>> Rdkit-discuss@lists.sourceforge.net >>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss >> _______________________________________________ >> Rdkit-discuss mailing list >> Rdkit-discuss@lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss > _______________________________________________ > Rdkit-discuss mailing list > Rdkit-discuss@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
_______________________________________________ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss