Re: [Rdkit-discuss] InChI to Mol to InChi

2018-12-18 Thread Christos Kannas
I would call it a "feature"...
I guess running conformer optimization (i.e. ETKDG, UFF, MMFF94) after the
embedding would be a good practice...

> I think I do vaguely remember that InChI gives precedence to 3D
coordinates if present over anything else for the determination of
stereochemistry.
I guess that's why there are inconsistencies [sometimes] when the molecule
has been generated from a SMILES instead from a MOL block with 2D or 3D
coordinates...

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer

[image: View Christos Kannas's profile on LinkedIn]



On Wed, 19 Dec 2018 at 01:12, Markus Sitzmann 
wrote:

> I think I do vaguely remember that InChI gives precedence to 3D
> coordinates if present over anything else for the determination of
> stereochemistry. And I think that is what happens here: the Allchem
> embedding of the molecule adds 3D coordinates which are not present for the
> original  molecule create straight from InChI. Probably the minimization of
> the structure during the embedding is “turning around” the stereochemistry
> (probably you could have a long discussion whether this is a bug or a
> feature),
>
> Markus
>
> -
> |  Markus Sitzmann
> |  markus.sitzm...@gmail.com
>
> On 18. Dec 2018, at 19:43, Jason Biggs  wrote:
>
> see https://github.com/rdkit/rdkit/issues/1852, and
> https://sourceforge.net/p/rdkit/mailman/message/36309813/
>
> You can see it in the smiles if you remove stereo after embedding, then
> re-detect stereo from the conformation.
>
> inchi1 =
> "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
> m1 = Chem.MolFromInchi(inchi1)
> m1 = Chem.AddHs(m1)
> m2 = Chem.Mol(m1)
> AllChem.EmbedMolecule(m2)
> m3 = Chem.Mol(m2)
> Chem.rdmolops.RemoveStereochemistry(m3)
> Chem.rdmolops.AssignStereochemistryFrom3D(m3)
> sm1 = Chem.MolToSmiles(m1)
> sm2 = Chem.MolToSmiles(m2)
> sm3 = Chem.MolToSmiles(m3)
> print(sm1 == sm2)  # returns true
> print(sm2 == sm3) # returns false
>
>
> The difference between sm2 and sm3 is just swapping a \ for a /,
> confirming what Christos was able to read from the InChI.
>
> Why does the inchi reflect the 3D bond stereo but the smiles doesn't until
> you remove and re-detect the stereo?  Does the InChI code go to the 3D
> structure when present and ignore stereo information in the mol object?
>
> Jason Biggs
>
>
> On Tue, Dec 18, 2018 at 12:14 PM Christos Kannas 
> wrote:
>
>> Hi Jean-Marc,
>>
>> There difference is due to bond orientation (if my inchi analysis skills
>> are correct).
>> See the bold bond layer below (14-7+ vs 14-7-).
>>
>>
>> m1 -> 
>> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7+*/t17-,19-/m1/s1
>>
>> m2 -> 
>> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7-*/t17-,19-/m1/s1
>>
>>
>> Not sure why it happens, but I've seen it multiple times...
>>
>>
>> Best,
>>
>> Christos
>>
>> Christos Kannas
>>
>> Chem[o]informatics Researcher & Software Developer
>>
>> [image: View Christos Kannas's profile on LinkedIn]
>> 
>>
>>
>> On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD <
>> jm.nuzill...@univ-reims.fr> wrote:
>>
>>> Thank you for your answer but alatis might not be adapted to my current
>>> problem.
>>>
>>> Attempting to understand what was changed by the embedding step I wrote:
>>>
>>> inchi1 =
>>>
>>> "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
>>> m1 = Chem.MolFromInchi(inchi1)
>>> m1 = Chem.AddHs(m1)
>>> m2 = Chem.Mol(m1)
>>> AllChem.EmbedMolecule(m2)
>>> sm1 = Chem.MolToSmiles(m1)
>>> sm2 = Chem.MolToSmiles(m2)
>>> print(sm1)
>>> print(sm2)
>>> print(sm1 == sm2)
>>> inc1 = Chem.MolToInchi(m1)
>>> inc2 = Chem.MolToInchi(m2)
>>> print(inc1)
>>> print(inc2)
>>> print(inc1 == inc2)
>>>
>>> Molecules m1 and m2 have identical SMILES representations
>>> but different InChI representations, which I find odd.
>>>
>>> All the best,
>>>
>>> Jean-Marc
>>>
>>>
>>>
>>>
>>> Le 18/12/2018 00:40, Dimitri Maziuk via Rdkit-discuss a écrit :
>>> > On 12/17/18 4:50 PM, JEAN-MARC NUZILLARD wrote:
>>> >> Is there any more deterministic procedure than the one of trying until
>>> >> success is obtained?
>>> >>
>>> >> How do I determine the InChI string of a conformer obtained after
>>> >> multiple embedding?
>>> >
>>> > This representation keeps 3D config: http://alatis.nmrfam.wisc.edu/
>>> >
>>> > Generally speaking the problem with InChI is that the only *required*
>>> > layer is the formula. Therefore *an* InChI string cannot be used to
>>> > differentiate conformers, you need the InChI string with all the
>>> > relevant 

Re: [Rdkit-discuss] InChI to Mol to InChi

2018-12-18 Thread Markus Sitzmann
I think I do vaguely remember that InChI gives precedence to 3D coordinates if 
present over anything else for the determination of stereochemistry. And I 
think that is what happens here: the Allchem embedding of the molecule adds 3D 
coordinates which are not present for the original  molecule create straight 
from InChI. Probably the minimization of the structure during the embedding is 
“turning around” the stereochemistry (probably you could have a long discussion 
whether this is a bug or a feature),

Markus

-
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 18. Dec 2018, at 19:43, Jason Biggs  wrote:
> 
> see https://github.com/rdkit/rdkit/issues/1852, and 
> https://sourceforge.net/p/rdkit/mailman/message/36309813/
> 
> You can see it in the smiles if you remove stereo after embedding, then 
> re-detect stereo from the conformation.
> 
> inchi1 = 
> "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
> m1 = Chem.MolFromInchi(inchi1)
> m1 = Chem.AddHs(m1)
> m2 = Chem.Mol(m1)
> AllChem.EmbedMolecule(m2)
> m3 = Chem.Mol(m2)
> Chem.rdmolops.RemoveStereochemistry(m3)
> Chem.rdmolops.AssignStereochemistryFrom3D(m3)
> sm1 = Chem.MolToSmiles(m1)
> sm2 = Chem.MolToSmiles(m2)
> sm3 = Chem.MolToSmiles(m3)
> print(sm1 == sm2)  # returns true
> print(sm2 == sm3) # returns false
> 
> The difference between sm2 and sm3 is just swapping a \ for a /, confirming 
> what Christos was able to read from the InChI.
> 
> Why does the inchi reflect the 3D bond stereo but the smiles doesn't until 
> you remove and re-detect the stereo?  Does the InChI code go to the 3D 
> structure when present and ignore stereo information in the mol object?
> 
> Jason Biggs
> 
> 
>> On Tue, Dec 18, 2018 at 12:14 PM Christos Kannas  
>> wrote:
>> Hi Jean-Marc,
>> 
>> There difference is due to bond orientation (if my inchi analysis skills are 
>> correct).
>> See the bold bond layer below (14-7+ vs 14-7-).
>> 
>> m1 -> 
>> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1
>> 
>> m2 -> 
>> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7-/t17-,19-/m1/s1
>> 
>> Not sure why it happens, but I've seen it multiple times...
>> 
>> Best,
>> Christos
>> 
>> Christos Kannas
>> 
>> Chem[o]informatics Researcher & Software Developer
>> 
>> 
>> 
>> 
>> 
>>> On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD 
>>>  wrote:
>>> Thank you for your answer but alatis might not be adapted to my current 
>>> problem.
>>> 
>>> Attempting to understand what was changed by the embedding step I wrote:
>>> 
>>> inchi1 = 
>>> "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
>>> m1 = Chem.MolFromInchi(inchi1)
>>> m1 = Chem.AddHs(m1)
>>> m2 = Chem.Mol(m1)
>>> AllChem.EmbedMolecule(m2)
>>> sm1 = Chem.MolToSmiles(m1)
>>> sm2 = Chem.MolToSmiles(m2)
>>> print(sm1)
>>> print(sm2)
>>> print(sm1 == sm2)
>>> inc1 = Chem.MolToInchi(m1)
>>> inc2 = Chem.MolToInchi(m2)
>>> print(inc1)
>>> print(inc2)
>>> print(inc1 == inc2)
>>> 
>>> Molecules m1 and m2 have identical SMILES representations
>>> but different InChI representations, which I find odd.
>>> 
>>> All the best,
>>> 
>>> Jean-Marc
>>> 
>>> 
>>> 
>>> 
>>> Le 18/12/2018 00:40, Dimitri Maziuk via Rdkit-discuss a écrit :
>>> > On 12/17/18 4:50 PM, JEAN-MARC NUZILLARD wrote:
>>> >> Is there any more deterministic procedure than the one of trying until
>>> >> success is obtained?
>>> >> 
>>> >> How do I determine the InChI string of a conformer obtained after
>>> >> multiple embedding?
>>> > 
>>> > This representation keeps 3D config: http://alatis.nmrfam.wisc.edu/
>>> > 
>>> > Generally speaking the problem with InChI is that the only *required*
>>> > layer is the formula. Therefore *an* InChI string cannot be used to
>>> > differentiate conformers, you need the InChI string with all the
>>> > relevant layers and all the protons.
>>> > 
>>> > https://www.nature.com/articles/sdata201773
>>> > 
>>> > ___
>>> > Rdkit-discuss mailing list
>>> > Rdkit-discuss@lists.sourceforge.net
>>> > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>> 
>>> 
>>> ___
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> 

Re: [Rdkit-discuss] InChI to Mol to InChi

2018-12-18 Thread Dimitri Maziuk via Rdkit-discuss
On 12/18/18 1:57 PM, JEAN-MARC NUZILLARD wrote:

> Dimitri, how can alatis help me to find a first draft of 3D structure
> for a few ten thousands of compounds from InChI strings?

It won't, you have to feed it a 3D structure. However its InChI string
and/or MOL block will give you the same 3D structure with the same atom
labels on round-trip, *as long as you don't removeH/addH/recalculate
conformers etc.*

(At least on all molecules they tried and I think that includes the
entire PubChem.)
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] InChI to Mol to InChi

2018-12-18 Thread JEAN-MARC NUZILLARD
Thank you Christos for confirming what I suspected about double bond 
geometry

and thank you Jason for finding some logics in my observation.

Dimitri, how can alatis help me to find a first draft of 3D structure
for a few ten thousands of compounds from InChI strings?

All the best,

Jean-Marc


Le 18/12/2018 19:43, Jason Biggs a écrit :

see https://github.com/rdkit/rdkit/issues/1852, and
https://sourceforge.net/p/rdkit/mailman/message/36309813/

You can see it in the smiles if you remove stereo after embedding,
then re-detect stereo from the conformation.


inchi1 =


"InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"


m1 = Chem.MolFromInchi(inchi1)

m1 = Chem.AddHs(m1)

m2 = Chem.Mol(m1)

AllChem.EmbedMolecule(m2)

m3 = Chem.Mol(m2)

Chem.rdmolops.RemoveStereochemistry(m3)

Chem.rdmolops.AssignStereochemistryFrom3D(m3)

sm1 = Chem.MolToSmiles(m1)

sm2 = Chem.MolToSmiles(m2)

sm3 = Chem.MolToSmiles(m3)

print(sm1 == sm2)  # returns true

print(sm2 == sm3) # returns false


The difference between sm2 and sm3 is just swapping a \ for a /,
confirming what Christos was able to read from the InChI.

Why does the inchi reflect the 3D bond stereo but the smiles doesn't
until you remove and re-detect the stereo?  Does the InChI code go to
the 3D structure when present and ignore stereo information in the mol
object?

Jason Biggs

On Tue, Dec 18, 2018 at 12:14 PM Christos Kannas
 wrote:


Hi Jean-Marc,

There difference is due to bond orientation (if my inchi analysis
skills are correct).
See the bold bond layer below (14-7+ vs 14-7-).

m1 ->


InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/B13-6-,14-7+/t17-,19-/m1/s1


m2 ->


InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/B13-6-,14-7-/t17-,19-/m1/s1


Not sure why it happens, but I've seen it multiple times...

Best,

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer

[1]

On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD
 wrote:


Thank you for your answer but alatis might not be adapted to my
current
problem.

Attempting to understand what was changed by the embedding step I
wrote:

inchi1 =




"InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"

m1 = Chem.MolFromInchi(inchi1)
m1 = Chem.AddHs(m1)
m2 = Chem.Mol(m1)
AllChem.EmbedMolecule(m2)
sm1 = Chem.MolToSmiles(m1)
sm2 = Chem.MolToSmiles(m2)
print(sm1)
print(sm2)
print(sm1 == sm2)
inc1 = Chem.MolToInchi(m1)
inc2 = Chem.MolToInchi(m2)
print(inc1)
print(inc2)
print(inc1 == inc2)

Molecules m1 and m2 have identical SMILES representations
but different InChI representations, which I find odd.

All the best,

Jean-Marc

Le 18/12/2018 00:40, Dimitri Maziuk via Rdkit-discuss a écrit :

On 12/17/18 4:50 PM, JEAN-MARC NUZILLARD wrote:

Is there any more deterministic procedure than the one of

trying until

success is obtained?

How do I determine the InChI string of a conformer obtained

after

multiple embedding?


This representation keeps 3D config:

http://alatis.nmrfam.wisc.edu/


Generally speaking the problem with InChI is that the only

*required*

layer is the formula. Therefore *an* InChI string cannot be used

to

differentiate conformers, you need the InChI string with all the
relevant layers and all the protons.

https://www.nature.com/articles/sdata201773

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



Links:
--
[1] http://cy.linkedin.com/in/christoskannas



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] InChI to Mol to InChi

2018-12-18 Thread Dimitri Maziuk via Rdkit-discuss
On 12/18/18 11:34 AM, JEAN-MARC NUZILLARD wrote:

> Molecules m1 and m2 have identical SMILES representations
> but different InChI representations, which I find odd.

*shrug* this is precisely why they came up with alatis: take a molecule
in any input format, round-trip it through any cheminformatics program,
there's 50% chance you'll get a different molecule out.

That's how chemistry works when it meets compsci.
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] InChI to Mol to InChi

2018-12-18 Thread Jason Biggs
see https://github.com/rdkit/rdkit/issues/1852, and
https://sourceforge.net/p/rdkit/mailman/message/36309813/

You can see it in the smiles if you remove stereo after embedding, then
re-detect stereo from the conformation.

inchi1 =
"InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
m1 = Chem.MolFromInchi(inchi1)
m1 = Chem.AddHs(m1)
m2 = Chem.Mol(m1)
AllChem.EmbedMolecule(m2)
m3 = Chem.Mol(m2)
Chem.rdmolops.RemoveStereochemistry(m3)
Chem.rdmolops.AssignStereochemistryFrom3D(m3)
sm1 = Chem.MolToSmiles(m1)
sm2 = Chem.MolToSmiles(m2)
sm3 = Chem.MolToSmiles(m3)
print(sm1 == sm2)  # returns true
print(sm2 == sm3) # returns false


The difference between sm2 and sm3 is just swapping a \ for a /, confirming
what Christos was able to read from the InChI.

Why does the inchi reflect the 3D bond stereo but the smiles doesn't until
you remove and re-detect the stereo?  Does the InChI code go to the 3D
structure when present and ignore stereo information in the mol object?

Jason Biggs


On Tue, Dec 18, 2018 at 12:14 PM Christos Kannas 
wrote:

> Hi Jean-Marc,
>
> There difference is due to bond orientation (if my inchi analysis skills
> are correct).
> See the bold bond layer below (14-7+ vs 14-7-).
>
>
> m1 -> 
> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7+*/t17-,19-/m1/s1
>
> m2 -> 
> InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7-*/t17-,19-/m1/s1
>
>
> Not sure why it happens, but I've seen it multiple times...
>
>
> Best,
>
> Christos
>
> Christos Kannas
>
> Chem[o]informatics Researcher & Software Developer
>
> [image: View Christos Kannas's profile on LinkedIn]
> 
>
>
> On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD <
> jm.nuzill...@univ-reims.fr> wrote:
>
>> Thank you for your answer but alatis might not be adapted to my current
>> problem.
>>
>> Attempting to understand what was changed by the embedding step I wrote:
>>
>> inchi1 =
>>
>> "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
>> m1 = Chem.MolFromInchi(inchi1)
>> m1 = Chem.AddHs(m1)
>> m2 = Chem.Mol(m1)
>> AllChem.EmbedMolecule(m2)
>> sm1 = Chem.MolToSmiles(m1)
>> sm2 = Chem.MolToSmiles(m2)
>> print(sm1)
>> print(sm2)
>> print(sm1 == sm2)
>> inc1 = Chem.MolToInchi(m1)
>> inc2 = Chem.MolToInchi(m2)
>> print(inc1)
>> print(inc2)
>> print(inc1 == inc2)
>>
>> Molecules m1 and m2 have identical SMILES representations
>> but different InChI representations, which I find odd.
>>
>> All the best,
>>
>> Jean-Marc
>>
>>
>>
>>
>> Le 18/12/2018 00:40, Dimitri Maziuk via Rdkit-discuss a écrit :
>> > On 12/17/18 4:50 PM, JEAN-MARC NUZILLARD wrote:
>> >> Is there any more deterministic procedure than the one of trying until
>> >> success is obtained?
>> >>
>> >> How do I determine the InChI string of a conformer obtained after
>> >> multiple embedding?
>> >
>> > This representation keeps 3D config: http://alatis.nmrfam.wisc.edu/
>> >
>> > Generally speaking the problem with InChI is that the only *required*
>> > layer is the formula. Therefore *an* InChI string cannot be used to
>> > differentiate conformers, you need the InChI string with all the
>> > relevant layers and all the protons.
>> >
>> > https://www.nature.com/articles/sdata201773
>> >
>> > ___
>> > Rdkit-discuss mailing list
>> > Rdkit-discuss@lists.sourceforge.net
>> > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] InChI to Mol to InChi

2018-12-18 Thread Christos Kannas
Hi Jean-Marc,

There difference is due to bond orientation (if my inchi analysis skills
are correct).
See the bold bond layer below (14-7+ vs 14-7-).


m1 -> 
InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7+*/t17-,19-/m1/s1

m2 -> 
InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/*b13-6-,14-7-*/t17-,19-/m1/s1


Not sure why it happens, but I've seen it multiple times...


Best,

Christos

Christos Kannas

Chem[o]informatics Researcher & Software Developer

[image: View Christos Kannas's profile on LinkedIn]



On Tue, 18 Dec 2018 at 17:36, JEAN-MARC NUZILLARD <
jm.nuzill...@univ-reims.fr> wrote:

> Thank you for your answer but alatis might not be adapted to my current
> problem.
>
> Attempting to understand what was changed by the embedding step I wrote:
>
> inchi1 =
>
> "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
> m1 = Chem.MolFromInchi(inchi1)
> m1 = Chem.AddHs(m1)
> m2 = Chem.Mol(m1)
> AllChem.EmbedMolecule(m2)
> sm1 = Chem.MolToSmiles(m1)
> sm2 = Chem.MolToSmiles(m2)
> print(sm1)
> print(sm2)
> print(sm1 == sm2)
> inc1 = Chem.MolToInchi(m1)
> inc2 = Chem.MolToInchi(m2)
> print(inc1)
> print(inc2)
> print(inc1 == inc2)
>
> Molecules m1 and m2 have identical SMILES representations
> but different InChI representations, which I find odd.
>
> All the best,
>
> Jean-Marc
>
>
>
>
> Le 18/12/2018 00:40, Dimitri Maziuk via Rdkit-discuss a écrit :
> > On 12/17/18 4:50 PM, JEAN-MARC NUZILLARD wrote:
> >> Is there any more deterministic procedure than the one of trying until
> >> success is obtained?
> >>
> >> How do I determine the InChI string of a conformer obtained after
> >> multiple embedding?
> >
> > This representation keeps 3D config: http://alatis.nmrfam.wisc.edu/
> >
> > Generally speaking the problem with InChI is that the only *required*
> > layer is the formula. Therefore *an* InChI string cannot be used to
> > differentiate conformers, you need the InChI string with all the
> > relevant layers and all the protons.
> >
> > https://www.nature.com/articles/sdata201773
> >
> > ___
> > Rdkit-discuss mailing list
> > Rdkit-discuss@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] InChI to Mol to InChi

2018-12-18 Thread JEAN-MARC NUZILLARD
Thank you for your answer but alatis might not be adapted to my current 
problem.


Attempting to understand what was changed by the embedding step I wrote:

inchi1 = 
"InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"

m1 = Chem.MolFromInchi(inchi1)
m1 = Chem.AddHs(m1)
m2 = Chem.Mol(m1)
AllChem.EmbedMolecule(m2)
sm1 = Chem.MolToSmiles(m1)
sm2 = Chem.MolToSmiles(m2)
print(sm1)
print(sm2)
print(sm1 == sm2)
inc1 = Chem.MolToInchi(m1)
inc2 = Chem.MolToInchi(m2)
print(inc1)
print(inc2)
print(inc1 == inc2)

Molecules m1 and m2 have identical SMILES representations
but different InChI representations, which I find odd.

All the best,

Jean-Marc




Le 18/12/2018 00:40, Dimitri Maziuk via Rdkit-discuss a écrit :

On 12/17/18 4:50 PM, JEAN-MARC NUZILLARD wrote:

Is there any more deterministic procedure than the one of trying until
success is obtained?

How do I determine the InChI string of a conformer obtained after
multiple embedding?


This representation keeps 3D config: http://alatis.nmrfam.wisc.edu/

Generally speaking the problem with InChI is that the only *required*
layer is the formula. Therefore *an* InChI string cannot be used to
differentiate conformers, you need the InChI string with all the
relevant layers and all the protons.

https://www.nature.com/articles/sdata201773

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss



___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss