see https://github.com/rdkit/rdkit/issues/1852, and
https://sourceforge.net/p/rdkit/mailman/message/36309813/

You can see it in the smiles if you remove stereo after embedding, then
re-detect stereo from the conformation.

inchi1 =
"InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
m1 = Chem.MolFromInchi(inchi1)
m1 = Chem.AddHs(m1)
m2 = Chem.Mol(m1)
AllChem.EmbedMolecule(m2)
m3 = Chem.Mol(m2)
Chem.rdmolops.RemoveStereochemistry(m3)
Chem.rdmolops.AssignStereochemistryFrom3D(m3)
sm1 = Chem.MolToSmiles(m1)
sm2 = Chem.MolToSmiles(m2)
sm3 = Chem.MolToSmiles(m3)
print(sm1 == sm2)  # returns true
print(sm2 == sm3) # returns false


The difference between sm2 and sm3 is just swapping a \ for a /, confirming
what Christos was able to read from the InChI.

Why does the inchi reflect the 3D bond stereo but the smiles doesn't until
you remove and re-detect the stereo?  Does the InChI code go to the 3D
structure when present and ignore stereo information in the mol object?

Jason Biggs


On Tue, Dec 18, 2018 at 12:14 PM Christos Kannas <chriskan...@gmail.com>
wrote:

> Hi Jean-Marc,
>
> There difference is due to bond orientation (if my inchi analysis skills
> are correct).
> See the bold bond layer below (14-7+ vs 14-7-).
>
>
> m1 -> 
> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7+*/t17-,19-/m1/s1
>
> m2 -> 
> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7-*/t17-,19-/m1/s1
>
>
> Not sure why it happens, but I've seen it multiple times...
>
>
> Best,
>
> Christos
>
> Christos Kannas
>
> Chem[o]informatics Researcher & Software Developer
>
> [image: View Christos Kannas's profile on LinkedIn]
> <http://cy.linkedin.com/in/christoskannas>
>
>
> On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD <
> jm.nuzill...@univ-reims.fr> wrote:
>
>> Thank you for your answer but alatis might not be adapted to my current
>> problem.
>>
>> Attempting to understand what was changed by the embedding step I wrote:
>>
>> inchi1 =
>>
>> "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
>> m1 = Chem.MolFromInchi(inchi1)
>> m1 = Chem.AddHs(m1)
>> m2 = Chem.Mol(m1)
>> AllChem.EmbedMolecule(m2)
>> sm1 = Chem.MolToSmiles(m1)
>> sm2 = Chem.MolToSmiles(m2)
>> print(sm1)
>> print(sm2)
>> print(sm1 == sm2)
>> inc1 = Chem.MolToInchi(m1)
>> inc2 = Chem.MolToInchi(m2)
>> print(inc1)
>> print(inc2)
>> print(inc1 == inc2)
>>
>> Molecules m1 and m2 have identical SMILES representations
>> but different InChI representations, which I find odd.
>>
>> All the best,
>>
>> Jean-Marc
>>
>>
>>
>>
>> Le 18/12/2018 00:40, Dimitri Maziuk via Rdkit-discuss a écrit :
>> > On 12/17/18 4:50 PM, JEAN-MARC NUZILLARD wrote:
>> >> Is there any more deterministic procedure than the one of trying until
>> >> success is obtained?
>> >>
>> >> How do I determine the InChI string of a conformer obtained after
>> >> multiple embedding?
>> >
>> > This representation keeps 3D config: http://alatis.nmrfam.wisc.edu/
>> >
>> > Generally speaking the problem with InChI is that the only *required*
>> > layer is the formula. Therefore *an* InChI string cannot be used to
>> > differentiate conformers, you need the InChI string with all the
>> > relevant layers and all the protons.
>> >
>> > https://www.nature.com/articles/sdata201773
>> >
>> > _______________________________________________
>> > Rdkit-discuss mailing list
>> > Rdkit-discuss@lists.sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to