see https://github.com/rdkit/rdkit/issues/1852, and
https://sourceforge.net/p/rdkit/mailman/message/36309813/
You can see it in the smiles if you remove stereo after embedding,
then re-detect stereo from the conformation.
inchi1 =
"InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
m1 = Chem.MolFromInchi(inchi1)
m1 = Chem.AddHs(m1)
m2 = Chem.Mol(m1)
AllChem.EmbedMolecule(m2)
m3 = Chem.Mol(m2)
Chem.rdmolops.RemoveStereochemistry(m3)
Chem.rdmolops.AssignStereochemistryFrom3D(m3)
sm1 = Chem.MolToSmiles(m1)
sm2 = Chem.MolToSmiles(m2)
sm3 = Chem.MolToSmiles(m3)
print(sm1 == sm2) # returns true
print(sm2 == sm3) # returns false
The difference between sm2 and sm3 is just swapping a \ for a /,
confirming what Christos was able to read from the InChI.
Why does the inchi reflect the 3D bond stereo but the smiles doesn't
until you remove and re-detect the stereo? Does the InChI code go to
the 3D structure when present and ignore stereo information in the mol
object?
Jason Biggs
On Tue, Dec 18, 2018 at 12:14 PM Christos Kannas
<chriskan...@gmail.com> wrote:
Hi Jean-Marc,
There difference is due to bond orientation (if my inchi analysis
skills are correct).
See the bold bond layer below (14-7+ vs 14-7-).
m1 ->
InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/B13-6-,14-7+/t17-,19-/m1/s1
m2 ->
InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/B13-6-,14-7-/t17-,19-/m1/s1
Not sure why it happens, but I've seen it multiple times...
Best,
Christos
Christos Kannas
Chem[o]informatics Researcher & Software Developer
[1]
On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD
<jm.nuzill...@univ-reims.fr> wrote:
Thank you for your answer but alatis might not be adapted to my
current
problem.
Attempting to understand what was changed by the embedding step I
wrote:
inchi1 =
"InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
m1 = Chem.MolFromInchi(inchi1)
m1 = Chem.AddHs(m1)
m2 = Chem.Mol(m1)
AllChem.EmbedMolecule(m2)
sm1 = Chem.MolToSmiles(m1)
sm2 = Chem.MolToSmiles(m2)
print(sm1)
print(sm2)
print(sm1 == sm2)
inc1 = Chem.MolToInchi(m1)
inc2 = Chem.MolToInchi(m2)
print(inc1)
print(inc2)
print(inc1 == inc2)
Molecules m1 and m2 have identical SMILES representations
but different InChI representations, which I find odd.
All the best,
Jean-Marc
Le 18/12/2018 00:40, Dimitri Maziuk via Rdkit-discuss a écrit :
On 12/17/18 4:50 PM, JEAN-MARC NUZILLARD wrote:
Is there any more deterministic procedure than the one of
trying until
success is obtained?
How do I determine the InChI string of a conformer obtained
after
multiple embedding?
This representation keeps 3D config:
http://alatis.nmrfam.wisc.edu/
Generally speaking the problem with InChI is that the only
*required*
layer is the formula. Therefore *an* InChI string cannot be used
to
differentiate conformers, you need the InChI string with all the
relevant layers and all the protons.
https://www.nature.com/articles/sdata201773
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Links:
------
[1] http://cy.linkedin.com/in/christoskannas