Hi Igor,

On Wed, Jan 29, 2014 at 2:04 PM, Igor Filippov <[email protected]> wrote:

> Greg et al,
>
> Here is a little script that demonstrates a problem with fingerprints
> after the roundtrip through InChI.
> My input mol file is also attached.
> As you can see the similarity between "before" and "after" is not 1 in 45
> out of 100 cases.
> In one case it is as low as 0.29. Could someone take a look and tell me
> what I'm doing wrong?
>

Ah! Now I see what you're doing and understand the problem.

It's really important when using InChI to remember that InChI is designed
to be an identifier, not an interchange format. The InChI algorithm
modifies the molecule as part of its canonicalization step. This
modification includes standardizing tautomers.

Here's an example of the type of substructure modification that happens in
your molecules:
the input SMILES c1ccccc1C(=O)Nc1ccccc1, on being converted to InChI and back,
yields: OC(=Nc1ccccc1)c1ccccc1
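
A minimal sketch of how this can be checked for a single molecule, assuming
an RDKit build with InChI support. The helper name inchi_roundtrip is made up
for illustration, and RDKFingerprint is an arbitrary choice that may not be
the fingerprint used in the original script:

```python
# Sketch only: needs RDKit built with InChI support.
try:
    from rdkit import Chem, DataStructs
except ImportError:  # lets the file be imported where RDKit is missing
    Chem = None
    DataStructs = None


def inchi_roundtrip(smiles):
    """Send a molecule through InChI and back.

    Returns (SMILES before, SMILES after, Tanimoto similarity of the
    fingerprints before and after). The name and return shape are
    hypothetical, chosen for this example.
    """
    if Chem is None:
        raise RuntimeError("RDKit is required for this example")

    before = Chem.MolFromSmiles(smiles)
    if before is None:
        raise ValueError("could not parse SMILES: %r" % smiles)

    # InChI normalizes the structure (including mobile hydrogens), so the
    # molecule that comes back may be a different tautomer of the input.
    after = Chem.MolFromInchi(Chem.MolToInchi(before))
    if after is None:
        raise ValueError("InChI round trip failed for: %r" % smiles)

    # Assumption: RDKFingerprint stands in for whatever fingerprint is in use.
    fp_before = Chem.RDKFingerprint(before)
    fp_after = Chem.RDKFingerprint(after)
    similarity = DataStructs.TanimotoSimilarity(fp_before, fp_after)

    return Chem.MolToSmiles(before), Chem.MolToSmiles(after), similarity
```

Calling inchi_roundtrip("c1ccccc1C(=O)Nc1ccccc1") returns both canonical
SMILES and the similarity. A similarity below 1 indicates that the round trip
changed the structure; the exact values depend on the RDKit and InChI
versions and on the fingerprint chosen.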

Basically: If you think you know what your molecules are, you probably
should be building them from SMILES or CTAB, not InChI.

Apologies that I didn't think of this before; I was just focusing on the
stereochemistry.

-greg
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss