Hi Jean-Marc,

The problem here is being caused by a bug/deficiency in the RDKit's 2D
coordinate generation algorithm.
The molecule described by the input InChI has a trans double bond in the
macrocycle and Compute2DCoords() isn't handling that correctly.
Here are the coordinates you get for that molecule, with the bond in
question highlighted:
[image: image.png]
and here's what you get when using rdCoordGen.AddCoords (which uses the
CoordGen algorithm instead of the RDKit's algorithm):
[image: image.png]
The InChI generated for that second example matches the input:

from rdkit import Chem
from rdkit.Chem import rdCoordGen

m3 = Chem.MolFromInchi(inchi_initial)
rdCoordGen.AddCoords(m3)  # CoordGen instead of Compute2DCoords()
inchi_final2 = Chem.MolToInchi(m3)
print(inchi_initial == inchi_final2)  # prints "True"
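
When a comparison like this comes out False, it can help to see which InChI layer changed. Here is a rough sketch using plain string handling, no RDKit calls. The helper names inchi_layers and differing_layers are made up for illustration, and it assumes each layer prefix occurs only once, which holds for a standard InChI like yours but not for strings with isotopic or fixed-H layers. The 2-butene strings are just a small stand-in for the macrocycle.

```python
def inchi_layers(inchi):
    """Split an InChI into a dict keyed by layer prefix (hypothetical helper).

    Assumes each prefix letter appears once; a repeated prefix would
    overwrite the earlier entry.
    """
    head, _, rest = inchi.partition("/")  # head is e.g. "InChI=1S"
    parts = rest.split("/")
    layers = {"version": head, "formula": parts[0]}
    for part in parts[1:]:
        layers[part[0]] = part[1:]  # "b4-3+" -> key "b", value "4-3+"
    return layers


def differing_layers(inchi_a, inchi_b):
    """Return the sorted layer keys whose contents differ or are missing."""
    a, b = inchi_layers(inchi_a), inchi_layers(inchi_b)
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


# trans- and cis-2-butene differ only in the double bond ("b") layer
trans = "InChI=1S/C4H8/c1-3-4-2/h3-4H,1-2H3/b4-3+"
cis = "InChI=1S/C4H8/c1-3-4-2/h3-4H,1-2H3/b4-3-"
print(differing_layers(trans, cis))  # prints ['b']
```

For the macrocycle, differing_layers(inchi_initial, inchi_final) would be expected to report the "b" layer if only the double bond geometry changed.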


Here's a gist showing a bit more detail:
https://gist.github.com/greglandrum/434ac2e7dc69bbe8c6a9c319ef284f69

The problems with the RDKit's 2D coordinate generation for macrocycles are
unfortunate, but unlikely to be resolved in the near future. In the
meantime, if you have macrocycles with trans double bonds and you need
coordinates with accurate stereochemistry, it's probably a better idea to
use the CoordGen algorithm.[1]
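
Since CoordGen can be slow, one option is to try the RDKit's algorithm first and only fall back to CoordGen when the InChI round trip fails. Below is a minimal sketch of that idea, not something the RDKit provides: first_roundtrip_safe and rdkit_example are hypothetical names, and the parse/write/generator functions are passed in so the selection logic itself has no RDKit dependency. It does not handle a parser that returns None for a bad InChI.

```python
def first_roundtrip_safe(inchi, parse, write, generators):
    """Return (name, mol) for the first coordinate generator whose output
    reproduces `inchi`, or (None, None) if none of them does."""
    for name, add_coords in generators:
        mol = parse(inchi)      # fresh molecule for every attempt
        add_coords(mol)         # adds coordinates in place
        if write(mol) == inchi:
            return name, mol
    return None, None


def rdkit_example(inchi):
    """Wire the helper up to the RDKit (imports deferred to this function)."""
    from rdkit import Chem
    from rdkit.Chem import AllChem, rdCoordGen

    generators = [
        ("rdkit", AllChem.Compute2DCoords),  # fast default, tried first
        ("coordgen", rdCoordGen.AddCoords),  # slower fallback
    ]
    return first_roundtrip_safe(
        inchi, Chem.MolFromInchi, Chem.MolToInchi, generators
    )
```

Calling rdkit_example(inchi_initial) returns the name of the generator that preserved the InChI together with the molecule carrying those coordinates.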

-greg
[1] CoordGen isn't the default, even when it's installed, because it can be
slow relative to the RDKit algorithm.


On Thu, Jan 10, 2019 at 4:21 PM Jean-Marc Nuzillard <
jm.nuzill...@univ-reims.fr> wrote:

> Dear all,
>
> I wrote some time ago about adding 3D coordinates to atoms in a molecule
> that was created from an InChI string.
> The conversion of the molecule to InChI did not produce the initial
> InChI due to the presence of an intracyclic double bond.
> I face the same problem with the generation of 2D coordinates:
>
> inchi_initial = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
> m = Chem.MolFromInchi(inchi_initial)
> AllChem.Compute2DCoords(m)
> inchi_final = Chem.MolToInchi(m)
> print(inchi_initial)
> print(inchi_final)
> print(inchi_initial == inchi_final) # prints False
>
> Is there something I can do to avoid this?
> The difference between the initial and final InChI strings is in the
> geometry of the double bonds.
> All the best,
>
> Jean-Marc
>
> --
> Jean-Marc Nuzillard
> Directeur de Recherches au CNRS
>
> Institut de Chimie Moléculaire de Reims
> CNRS UMR 7312
> Moulin de la Housse
> CPCBAI, Bâtiment 18
> BP 1039
> 51687 REIMS Cedex 2
> France
>
> Tel : 03 26 91 82 10
> Fax : 03 26 91 31 66
> http://www.univ-reims.fr/ICMR
> http://eos.univ-reims.fr/LSD/CSNteam.html
>
> http://www.univ-reims.fr/LSD/
> http://www.univ-reims.fr/LSD/JmnSoft/
>
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>