It really boils down to how you standardize molecules such that you end up with a canonical structure.
SMILES is not the issue here: if your standardizer does a proper job with aromaticity, tautomers, etc., then you can get a canonical SMILES. You can also use the InChI model to generate a canonical SMILES (https://jcheminf.springeropen.com/articles/10.1186/1758-2946-4-22).

This doesn't really answer your question, as I'm not familiar with RDKit functionality for standardization. (As an aside, internally we use https://github.com/ncats/lychi, which is conceptually similar to InChI.)

PS. I don't think this is a job for fingerprint-based similarity methods, though.

On Mon, Nov 28, 2016 at 11:25 AM, Stephen O'hagan <soha...@manchester.ac.uk> wrote:

> Has anyone come up with a fool-proof way of matching structurally equivalent
> molecules?
>
> Unique Smiles or InChI String comparisons don’t appear to work, presumably
> because there are different but equivalent structures, e.g. explicit vs
> non-explicit H’s, Kekule vs Aromatic, isomeric forms vs non-isomeric form,
> tautomers etc.
>
> I also expect that comparing InChI strings might need something more than
> just a simple string comparison, such as masking off stereo information
> when you don’t care about stereo isomers.
>
> I assume there are suitable tools within RDKit that can do this?
>
> N.B. I need to collate tables from several sources that have a mix of
> smiles / InChI / sdf molecular representations.
>
> I usually use RDKit via Python and/or Knime.
>
> Cheers,
>
> Steve.
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

--
Rajarshi Guha | http://blog.rguha.net
NIH Center for Advancing Translational Science
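
On the stereo-masking point raised in the quoted question: below is a minimal sketch, in plain Python, of comparing two InChI strings while ignoring stereochemistry. It assumes the strings are standard InChIs and simply drops the stereo layers (/b, /t, /m, /s) before comparing. The function names and the alanine strings are illustrative only, and this does nothing about tautomers or other standardization issues.

```python
# InChI layer prefixes that carry stereo information:
# /b double-bond, /t tetrahedral, /m inversion marker, /s stereo type.
STEREO_PREFIXES = ("b", "t", "m", "s")

# Illustrative inputs: standard InChIs for L- and D-alanine.
L_ALA = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
D_ALA = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"


def strip_stereo_layers(inchi: str) -> str:
    """Return the InChI with its stereo layers removed (hypothetical helper)."""
    parts = inchi.split("/")
    # parts[0] is the version prefix and parts[1] the formula; keep both as-is.
    head, layers = parts[:2], parts[2:]
    kept = [layer for layer in layers if not layer.startswith(STEREO_PREFIXES)]
    return "/".join(head + kept)


def same_ignoring_stereo(inchi_a: str, inchi_b: str) -> bool:
    """Compare two InChI strings after masking stereo layers."""
    return strip_stereo_layers(inchi_a) == strip_stereo_layers(inchi_b)


if __name__ == "__main__":
    print(strip_stereo_layers(L_ALA))
    print(same_ignoring_stereo(L_ALA, D_ALA))
```

If you have the structures rather than just the strings, regenerating the InChI with stereo perception switched off (the InChI software has an SNon option for this) is probably more robust than editing strings after the fact.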