Thank you, Christos, for confirming what I suspected about double bond geometry,
and thank you, Jason, for finding some logic in my observation.

Dimitri, how can ALATIS help me find a first-draft 3D structure
for a few tens of thousands of compounds from InChI strings?

All the best,

Jean-Marc


On 18/12/2018 19:43, Jason Biggs wrote:
See https://github.com/rdkit/rdkit/issues/1852 and
https://sourceforge.net/p/rdkit/mailman/message/36309813/

You can see it in the smiles if you remove stereo after embedding,
then re-detect stereo from the conformation.

from rdkit import Chem
from rdkit.Chem import AllChem

inchi1 = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"

m1 = Chem.MolFromInchi(inchi1)
m1 = Chem.AddHs(m1)

# Embed a copy, so that m1 itself never gets a conformer.
m2 = Chem.Mol(m1)
AllChem.EmbedMolecule(m2)

# Drop the stored stereo flags, then re-detect them from the 3D coordinates.
m3 = Chem.Mol(m2)
Chem.rdmolops.RemoveStereochemistry(m3)
Chem.rdmolops.AssignStereochemistryFrom3D(m3)

sm1 = Chem.MolToSmiles(m1)
sm2 = Chem.MolToSmiles(m2)
sm3 = Chem.MolToSmiles(m3)

print(sm1 == sm2)  # prints True
print(sm2 == sm3)  # prints False

The difference between sm2 and sm3 is just swapping a \ for a /,
confirming what Christos was able to read from the InChI.

Why does the InChI reflect the 3D bond stereo while the SMILES doesn't
until you remove and re-detect the stereo?  Does the InChI code use
the 3D coordinates when present and ignore the stereo information
stored in the mol object?

Jason Biggs

On Tue, Dec 18, 2018 at 12:14 PM Christos Kannas
<chriskan...@gmail.com> wrote:

Hi Jean-Marc,

The difference is due to bond orientation (if my InChI analysis
skills are correct).
See the double-bond stereo layer (/b) below (14-7+ vs 14-7-).

m1 ->

InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1

m2 ->

InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7-/t17-,19-/m1/s1

Not sure why it happens, but I've seen it multiple times...
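
The layer-by-layer reading above can also be done mechanically. The following is only a minimal sketch in plain Python, assuming simple standard InChI strings like the two in this thread, where each layer prefix occurs at most once. The helper names split_inchi_layers and differing_layers, and the keys "version" and "formula", are hypothetical and not part of RDKit or of the InChI software.

```python
def split_inchi_layers(inchi):
    """Map each layer prefix of a simple standard InChI to its content.

    Assumption: every prefix letter occurs at most once, which holds for
    the strings in this thread but not for every possible InChI.
    """
    fields = inchi.split("/")
    # The first two fields carry no prefix letter; the key names used
    # for them here are made up for this sketch.
    layers = {"version": fields[0], "formula": fields[1]}
    for field in fields[2:]:
        if field:
            layers[field[0]] = field[1:]
    return layers


def differing_layers(inchi_a, inchi_b):
    """Return {prefix: (value_in_a, value_in_b)} for layers that differ."""
    a = split_inchi_layers(inchi_a)
    b = split_inchi_layers(inchi_b)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# The two InChI strings quoted in this message (m1 before, m2 after embedding).
INCHI_M1 = (
    "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)"
    "9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3"
    "/b13-6-,14-7+/t17-,19-/m1/s1"
)
INCHI_M2 = (
    "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)"
    "9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3"
    "/b13-6-,14-7-/t17-,19-/m1/s1"
)

print(differing_layers(INCHI_M1, INCHI_M2))
# {'b': ('13-6-,14-7+', '13-6-,14-7-')}
```

Only the /b layer is reported, which matches the manual comparison: the 14-7 double bond changes from + to - while the 13-6 bond and the tetrahedral layers stay the same.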

Best,

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer

[1]

On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD
<jm.nuzill...@univ-reims.fr> wrote:

Thank you for your answer, but ALATIS might not be suited to my
current problem.

Attempting to understand what was changed by the embedding step, I
wrote:

from rdkit import Chem
from rdkit.Chem import AllChem

inchi1 = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
m1 = Chem.MolFromInchi(inchi1)
m1 = Chem.AddHs(m1)
# m2 is an embedded (3D) copy of m1; m1 itself has no conformer.
m2 = Chem.Mol(m1)
AllChem.EmbedMolecule(m2)
# Compare the SMILES of both molecules.
sm1 = Chem.MolToSmiles(m1)
sm2 = Chem.MolToSmiles(m2)
print(sm1)
print(sm2)
print(sm1 == sm2)
# Compare the InChI strings of both molecules.
inc1 = Chem.MolToInchi(m1)
inc2 = Chem.MolToInchi(m2)
print(inc1)
print(inc2)
print(inc1 == inc2)

Molecules m1 and m2 have identical SMILES representations
but different InChI representations, which I find odd.

All the best,

Jean-Marc

On 18/12/2018 00:40, Dimitri Maziuk via Rdkit-discuss wrote:
On 12/17/18 4:50 PM, JEAN-MARC NUZILLARD wrote:
Is there any more deterministic procedure than trying until
success is obtained?

How do I determine the InChI string of a conformer obtained after
multiple embedding?
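
For reference, the "try until success" procedure asked about here can be sketched as a small retry loop. This is only an illustration, not a toolkit API: first_matching_attempt, build and describe are hypothetical names, and the loop is kept independent of any chemistry library so that the retry logic is visible on its own.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def first_matching_attempt(
    build: Callable[[int], Optional[T]],
    describe: Callable[[T], str],
    target: str,
    max_tries: int = 50,
) -> Optional[Tuple[int, T]]:
    """Try seeds 0 .. max_tries - 1 and return the first (seed, candidate)
    whose description equals target, or None if no attempt matches.

    build(seed) should return a candidate, or None when the attempt failed;
    describe(candidate) should return the string to compare with target.
    """
    for seed in range(max_tries):
        candidate = build(seed)
        if candidate is None:
            # This attempt produced nothing usable; move on to the next seed.
            continue
        if describe(candidate) == target:
            return seed, candidate
    return None
```

With RDKit, build could copy the hydrogen-added molecule, call AllChem.EmbedMolecule(copy, randomSeed=seed), and return None when that call returns -1, while describe could be Chem.MolToInchi and target the input InChI. Fixing the seeds makes a successful attempt reproducible, but it does not make the search itself any more deterministic.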

This representation keeps 3D config:
http://alatis.nmrfam.wisc.edu/

Generally speaking, the problem with InChI is that the only *required*
layer is the formula. Therefore *an* InChI string cannot be used to
differentiate conformers; you need the InChI string with all the
relevant layers and all the protons.

https://www.nature.com/articles/sdata201773

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Links:
------
[1] http://cy.linkedin.com/in/christoskannas

