Re: [Rdkit-discuss] InChI -> Mol(2D) -> InChI

2019-01-10 Thread Greg Landrum
Hi Jean-Marc,

The problem here is being caused by a bug/deficiency in the RDKit's 2D
coordinate generation algorithm.
The molecule described by the input InChI has a trans double bond in the
macrocycle, and Compute2DCoords() isn't handling that correctly.
Here are the coordinates you get for that molecule, with the bond in
question highlighted:
[image: image.png]
and here's what you get when using rdCoordGen.AddCoords (which uses the
CoordGen algorithm instead of the RDKit's algorithm):
[image: image.png]
The InChI generated for that second example matches the input:

from rdkit import Chem
from rdkit.Chem import rdCoordGen

# Re-parse the input InChI and generate coordinates with CoordGen
m3 = Chem.MolFromInchi(inchi_initial)
rdCoordGen.AddCoords(m3)
inchi_final2 = Chem.MolToInchi(m3)
print(inchi_initial == inchi_final2)  # prints True


Here's a gist showing a bit more detail:
https://gist.github.com/greglandrum/434ac2e7dc69bbe8c6a9c319ef284f69

The problems with the RDKit's 2D coordinate generation for macrocycles are
unfortunate, but unlikely to be resolved in the near future. In the
meantime, if you have macrocycles with trans double bonds and you need
coordinates with accurate stereochemistry, it's probably a better idea to
use the CoordGen algorithm.[1]

-greg
[1] CoordGen isn't the default, even when it's installed, because it can be
slow relative to the RDKit algorithm.
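
The advice above can be sketched as a small helper that only pays the cost
of CoordGen when the faster RDKit layout fails the InChI round trip. This is
a minimal sketch, not part of the original reply: the function name
add_2d_coords_checked is hypothetical, and it assumes the RDKit build
includes InChI and CoordGen support.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdCoordGen


def add_2d_coords_checked(inchi):
    """Return (mol, method) with 2D coordinates whose InChI matches the input.

    Hypothetical helper: tries the default RDKit layout first and falls
    back to CoordGen only if the round-tripped InChI differs.
    """
    mol = Chem.MolFromInchi(inchi)
    if mol is None:
        raise ValueError("could not parse InChI")

    # Fast path: the RDKit's own 2D coordinate generation.
    AllChem.Compute2DCoords(mol)
    if Chem.MolToInchi(mol) == inchi:
        return mol, "rdkit"

    # Fallback: start again from the InChI and use CoordGen instead.
    mol = Chem.MolFromInchi(inchi)
    rdCoordGen.AddCoords(mol)
    if Chem.MolToInchi(mol) == inchi:
        return mol, "coordgen"

    raise RuntimeError("neither layout reproduces the input InChI")
```

Calling add_2d_coords_checked(inchi_initial) returns the molecule together
with the name of the layout that passed the check, so the slower algorithm is
only used where it is needed.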


On Thu, Jan 10, 2019 at 4:21 PM Jean-Marc Nuzillard <
jm.nuzill...@univ-reims.fr> wrote:

> Dear all,
>
> I wrote some time ago about adding 3D coordinates to atoms in a molecule
> that was created from an InChI string.
> The conversion of the molecule to InChI did not produce the initial
> InChI due to the presence of an intracyclic double bond.
> I face the same problem with the generation of 2D coordinates:
>
> from rdkit import Chem
> from rdkit.Chem import AllChem
>
> inchi_initial = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
> m = Chem.MolFromInchi(inchi_initial)
> AllChem.Compute2DCoords(m)
> inchi_final = Chem.MolToInchi(m)
> print(inchi_initial)
> print(inchi_final)
> print(inchi_initial == inchi_final)  # prints False
>
> Is there something I can do to avoid this?
> The difference between the initial and final InChI strings is in the
> geometry of the double bonds.
> All the best,
>
> Jean-Marc
>
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


[Rdkit-discuss] InChI -> Mol(2D) -> InChI

2019-01-10 Thread Jean-Marc Nuzillard

Dear all,

I wrote some time ago about adding 3D coordinates to atoms in a molecule 
that was created from an InChI string.
The conversion of the molecule to InChI did not produce the initial
InChI due to the presence of an intracyclic double bond.

I face the same problem with the generation of 2D coordinates:

from rdkit import Chem
from rdkit.Chem import AllChem

inchi_initial = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"

m = Chem.MolFromInchi(inchi_initial)
AllChem.Compute2DCoords(m)
inchi_final = Chem.MolToInchi(m)
print(inchi_initial)
print(inchi_final)
print(inchi_initial == inchi_final)  # prints False

Is there something I can do to avoid this?
The difference between the initial and final InChI strings is in the 
geometry of the double bonds.
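
To see which part of two InChI strings differs, one option is to split them
into layers and compare layer by layer. The sketch below uses only the
Python standard library; the helper names inchi_layers and differing_layers
are hypothetical. It assumes each layer prefix occurs once, which does not
hold for InChIs that repeat stereo layers after an isotopic or fixed-H
layer. The second string is NOT the actual output of the code above: it is
a made-up variant with one double-bond sign flipped, used only for
illustration.

```python
def inchi_layers(inchi):
    """Split an InChI into a dict keyed by layer prefix.

    Simplification: assumes every prefix letter appears only once.
    """
    parts = inchi.split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        layers["formula"] = parts[1]  # the formula layer has no prefix letter
    for part in parts[2:]:
        layers[part[0]] = part[1:]
    return layers


def differing_layers(inchi_a, inchi_b):
    """Return the sorted layer keys whose contents differ."""
    a, b = inchi_layers(inchi_a), inchi_layers(inchi_b)
    return sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))


inchi_initial = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"

# Hypothetical variant for illustration only: the 14-7 bond sign is flipped.
inchi_hypothetical = inchi_initial.replace("/b13-6-,14-7+", "/b13-6-,14-7-")

print(differing_layers(inchi_initial, inchi_hypothetical))  # prints ['b']
```

The "b" layer holds the double-bond geometry, so a difference confined to
that layer points at the coordinates around a double bond rather than at
connectivity or tetrahedral centres.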

All the best,

Jean-Marc

--
Jean-Marc Nuzillard
Directeur de Recherches au CNRS

Institut de Chimie Moléculaire de Reims
CNRS UMR 7312
Moulin de la Housse
CPCBAI, Bâtiment 18
BP 1039
51687 REIMS Cedex 2
France

Tel : 03 26 91 82 10
Fax : 03 26 91 31 66
http://www.univ-reims.fr/ICMR
http://eos.univ-reims.fr/LSD/CSNteam.html

http://www.univ-reims.fr/LSD/
http://www.univ-reims.fr/LSD/JmnSoft/

