Hi Jean-Marc,

The problem here is caused by a bug/deficiency in the RDKit's 2D coordinate generation algorithm. The molecule described by the input InChI has a trans double bond in the macrocycle, and Compute2DCoords() isn't handling that correctly. Here are the coordinates you get for that molecule, with the bond in question highlighted:

[image: image.png]

And here's what you get when using rdCoordGen.AddCoords (which uses the CoordGen algorithm instead of the RDKit's algorithm):

[image: image.png]

The InChI generated for that second example matches the input:

    from rdkit import Chem
    from rdkit.Chem import rdCoordGen

    m3 = Chem.MolFromInchi(inchi_initial)
    rdCoordGen.AddCoords(m3)  # CoordGen layout instead of the RDKit default
    inchi_final2 = Chem.MolToInchi(m3)
    print(inchi_initial == inchi_final2)  # prints "True"

Here's a gist showing a bit more detail:
https://gist.github.com/greglandrum/434ac2e7dc69bbe8c6a9c319ef284f69

The problems with the RDKit's 2D coordinate generation for macrocycles are unfortunate, but unlikely to be resolved in the near future. In the meantime, if you have macrocycles with trans double bonds and you need coordinates with accurate stereochemistry, it's probably a better idea to use the CoordGen algorithm.[1]

-greg

[1] CoordGen isn't the default, even when it's installed, because it can be slow relative to the RDKit algorithm.

On Thu, Jan 10, 2019 at 4:21 PM Jean-Marc Nuzillard <jm.nuzill...@univ-reims.fr> wrote:

> Dear all,
>
> I wrote some time ago about adding 3D coordinates to atoms in a molecule
> that was created from an InChI string.
> Converting the molecule back to InChI did not reproduce the initial
> InChI, due to the presence of an intracyclic double bond.
> I face the same problem with the generation of 2D coordinates:
>
> inchi_initial = "InChI=1S/C20H26O4/c1-12(2)17-11-18(22)14(4)7-5-6-13(3)8-16(21)9-15-10-19(17)24-20(15)23/h6-7,10,12,17,19H,5,8-9,11H2,1-4H3/b13-6-,14-7+/t17-,19-/m1/s1"
> m = Chem.MolFromInchi(inchi_initial)
> AllChem.Compute2DCoords(m)
> inchi_final = Chem.MolToInchi(m)
> print(inchi_initial)
> print(inchi_final)
> print(inchi_initial == inchi_final) # prints False
>
> Is there something I can do to avoid this?
> The difference between the initial and final InChI strings is in the
> geometry of the double bonds.
>
> All the best,
>
> Jean-Marc
>
> --
> Jean-Marc Nuzillard
> Research Director, CNRS
>
> Institut de Chimie Moléculaire de Reims
> CNRS UMR 7312
> Moulin de la Housse
> CPCBAI, Bâtiment 18
> BP 1039
> 51687 REIMS Cedex 2
> France
>
> Tel : 03 26 91 82 10
> Fax : 03 26 91 31 66
> http://www.univ-reims.fr/ICMR
> http://eos.univ-reims.fr/LSD/CSNteam.html
>
> http://www.univ-reims.fr/LSD/
> http://www.univ-reims.fr/LSD/JmnSoft/
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
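
Since CoordGen is slower than the RDKit's own algorithm, one way to act on the advice in this thread is to try the fast generator first and only fall back to CoordGen when the InChI does not survive the round trip. The following is a minimal sketch of that idea, not something taken from the gist: the helper names layout_preserving_inchi and rdkit_layouts are hypothetical, and it assumes, as in the examples in this thread, that Chem.MolToInchi picks up the double-bond geometry from the generated coordinates.

```python
def layout_preserving_inchi(inchi, parse, write, layouts):
    """Return (name, mol) for the first layout that round-trips the InChI.

    parse:   callable, InChI string -> molecule (e.g. Chem.MolFromInchi)
    write:   callable, molecule -> InChI string (e.g. Chem.MolToInchi)
    layouts: list of (name, callable adding 2D coordinates in place),
             ordered from cheapest to most expensive
    Returns (None, None) if no layout reproduces the input InChI.
    """
    for name, add_coords in layouts:
        mol = parse(inchi)      # fresh molecule for every attempt
        add_coords(mol)         # generate coordinates in place
        if write(mol) == inchi:
            return name, mol
    return None, None


def rdkit_layouts():
    """The two generators discussed in this thread, fastest first."""
    # Imported here so the helper above stays usable without RDKit.
    from rdkit.Chem import AllChem, rdCoordGen

    return [
        ("Compute2DCoords", AllChem.Compute2DCoords),
        ("CoordGen", rdCoordGen.AddCoords),
    ]


def demo(inchi_initial):
    """Example call with the RDKit functions used in this thread."""
    from rdkit import Chem

    name, mol = layout_preserving_inchi(
        inchi_initial, Chem.MolFromInchi, Chem.MolToInchi, rdkit_layouts()
    )
    print("layout used:", name)
    return mol
```

Passing the parse/write functions in explicitly keeps the fallback logic independent of the toolkit, so it can be checked without RDKit installed; demo(inchi_initial) shows how it would be wired up with the calls from the messages above.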