On Thu, Sep 14, 2017 at 8:09 PM, Jason Biggs <jasondbi...@gmail.com> wrote:

> Okay, all three of these smiles strings resolve to the same inchi,
> "O=[N+](C1=NC2=CC=CC=C2N=C1)[N-](=O)C1=NC2=CC=CC=C2N=C1"
> "C1=CC=C2C(=C1)N=CC(=N2)N(=N(=O)C3=NC4=CC=CC=C4N=C3)=O"
> "[O-][N+](c1cnc2ccccc2n1)=[N+]([O-])c3cnc4ccccc4n3"
> even though to me they seem like different structures due to the specified
> charges.  Is this a limitation of inchi, or do I need to rethink my ideas
> of what makes two chemical structures the same?
Well, I would regard at least the first two as erroneous or unlikely
(unstable) species, and that is exactly what John meant by saying that
InChI is an identifier, not a representation. InChI's main purpose
(particularly that of Standard InChI) is to identify them as the same
(corrected, normalized) molecule, not as three separate species (that would
be the purpose of a representation). Of course, in many cases one can
debate where sensible correction/normalization should end and separation
of structures should begin, but that is a long topic.
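
For what it is worth, the identifier-versus-representation point can be
sketched in a few lines of Python. This is only a rough illustration, not a
recommended workflow: the helper names smiles_to_inchi and
group_by_identifier are made up for this example, and the fallback to an
unsanitized parse is my assumption about how one might get RDKit to accept
SMILES with unusual valences. I have not listed any output, since it depends
on the RDKit and InChI versions in use.

```python
from collections import defaultdict


def smiles_to_inchi(smiles):
    """Return an InChI string for a SMILES, or None if it cannot be parsed.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so that group_by_identifier works without RDKit installed.
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)  # None if parsing or sanitization fails
    if mol is None:
        # Assumption: retry without sanitization for structures with odd
        # valences, then compute valence properties non-strictly.
        mol = Chem.MolFromSmiles(smiles, sanitize=False)
        if mol is None:
            return None
        mol.UpdatePropertyCache(strict=False)
    return Chem.MolToInchi(mol)


def group_by_identifier(smiles_list, to_identifier):
    """Group input strings by the identifier that to_identifier assigns them."""
    groups = defaultdict(list)
    for smi in smiles_list:
        groups[to_identifier(smi)].append(smi)
    return dict(groups)


# Example call (requires RDKit built with InChI support):
# three_smiles = [
#     "O=[N+](C1=NC2=CC=CC=C2N=C1)[N-](=O)C1=NC2=CC=CC=C2N=C1",
#     "C1=CC=C2C(=C1)N=CC(=N2)N(=N(=O)C3=NC4=CC=CC=C4N=C3)=O",
#     "[O-][N+](c1cnc2ccccc2n1)=[N+]([O-])c3cnc4ccccc4n3",
# ]
# print(group_by_identifier(three_smiles, smiles_to_inchi))
```

If all three strings end up under a single key, the identifier treats them
as one molecule, while the SMILES strings themselves remain three distinct
representations.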
Rdkit-discuss mailing list