It really boils down to how you standardize molecules such that you end up
with a canonical structure.

SMILES is not the issue here: if your standardizer does a proper job with
aromaticity, tautomers, etc., then you can get a canonical SMILES.

You can also use the InChI model to generate a canonical SMILES (
https://jcheminf.springeropen.com/articles/10.1186/1758-2946-4-22).

This doesn't really answer your question, as I'm not familiar with RDKit
functionality for standardization.

(As an aside, internally we use https://github.com/ncats/lychi which is
conceptually similar to InChI)

PS. I don't think this is a job for fingerprint-based similarity methods,
though.
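
On the point in your message about masking stereo information before
comparing InChI strings, here is a minimal sketch in plain Python (no RDKit
involved). It assumes standard InChI strings and simply drops every layer
whose prefix is /b, /t, /m or /s, which are the stereo layers. The function
name strip_stereo_layers and the two alanine strings are only illustrative,
and this does nothing about tautomers or charge states beyond what InChI
itself already normalizes.

```python
# Layer prefixes that carry stereo information in an InChI string:
# b = double-bond stereo, t = tetrahedral stereo, m = inversion flag,
# s = stereo type. They may also recur after the /i or /f layers.
STEREO_PREFIXES = ("b", "t", "m", "s")


def strip_stereo_layers(inchi: str) -> str:
    """Return the InChI string with all stereo layers removed.

    Hypothetical helper for illustration; assumes a well-formed InChI.
    """
    # The first piece is the version header (e.g. "InChI=1S"); the rest are
    # layers. The formula layer starts with an uppercase element symbol, so
    # it never matches the lowercase stereo prefixes.
    head, *layers = inchi.split("/")
    kept = [layer for layer in layers if not layer.startswith(STEREO_PREFIXES)]
    return "/".join([head, *kept])


# Illustrative input: two enantiomers that differ only in the /m layer.
l_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
d_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"

# With the stereo layers masked, the two strings compare equal.
print(strip_stereo_layers(l_alanine) == strip_stereo_layers(d_alanine))
```

The masked string could then serve as the key when collating tables from
several sources, provided every record has first been converted to InChI by
the same toolkit and settings.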

On Mon, Nov 28, 2016 at 11:25 AM, Stephen O'hagan <soha...@manchester.ac.uk>
wrote:

> Has anyone come up with a fool-proof way of matching structurally equivalent
> molecules?
>
>
>
> Unique SMILES or InChI string comparisons don’t appear to work, presumably
> because there are different but equivalent structures, e.g. explicit vs
> non-explicit H’s, Kekule vs aromatic, isomeric vs non-isomeric forms,
> tautomers etc.
>
>
>
> I also expect that comparing InChI strings might need something more than
> just a simple string comparison, such as masking off stereo information
> when you don’t care about stereo isomers.
>
>
>
> I assume there are suitable tools within RDKit that can do this?
>
>
>
> N.B. I need to collate tables from several sources that have a mix of
> SMILES / InChI / SDF molecular representations.
>
>
>
> I usually use RDKit via Python and/or Knime.
>
>
>
> Cheers,
>
> Steve.
>
>
>
> ------------------------------------------------------------
> ------------------
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>


-- 
Rajarshi Guha | http://blog.rguha.net
NIH Center for Advancing Translational Science