Hi Jeff,

That is because InChI is a structure identifier, not a structure 
representation. The difference is that a structure identifier normalizes 
the structure to a form it regards as the standard representation of the 
molecule, so that the molecule can be identified regardless of the state in 
which it arrives from an input resource (and hence always yields the same 
identifier).

For Standard InChI, the decision was made to make it insensitive to tautomers 
(within the limitations of the InChI algorithm). Unfortunately, this 
normalizes most amides to a form that chemists regard as the incorrect one. 
The second unfortunate thing is that you can convert the InChI back to a 
structure representation, which is then of course the normalized 
(standardized) form of the molecule.

So if you want to be sure of keeping the original representation of a molecule, 
don't use InChI as your representation format (calculate the InChI as an 
identifier field next to it). If your input resource only provides InChI or 
Standard InChI, then you are of course out of luck.
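
To illustrate keeping the representation and the identifier side by side, here 
is a minimal sketch in Python. The names MoleculeRecord, group_by_identifier, 
inchi_from_smiles_rdkit and demo_inchi are made up for this example and are not 
part of RDKit. The demo uses a hard-coded lookup table in place of a real InChI 
calculation so that it runs without RDKit installed; with RDKit you would pass 
inchi_from_smiles_rdkit instead, assuming your build includes InChI support.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class MoleculeRecord:
    """One row: the structure as supplied, plus an identifier computed from it."""
    original_smiles: str  # representation: stored verbatim, never regenerated
    inchi: str            # identifier: used only for lookup and de-duplication


def inchi_from_smiles_rdkit(smiles: str) -> str:
    """Compute an InChI with RDKit (requires a build with InChI support)."""
    from rdkit import Chem  # imported lazily so the rest runs without RDKit
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return Chem.MolToInchi(mol)


def group_by_identifier(
    smiles_list: Iterable[str],
    to_inchi: Callable[[str], str],
) -> Dict[str, List[MoleculeRecord]]:
    """Group records by identifier while keeping each input string untouched."""
    groups: Dict[str, List[MoleculeRecord]] = {}
    for smiles in smiles_list:
        record = MoleculeRecord(original_smiles=smiles, inchi=to_inchi(smiles))
        groups.setdefault(record.inchi, []).append(record)
    return groups


# Stand-in for a real InChI calculation (illustration only). Both tautomers
# map to the acetamide Standard InChI quoted in the original question.
ACETAMIDE_INCHI = "InChI=1S/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)"
_DEMO_TABLE = {"CC(=O)N": ACETAMIDE_INCHI, "CC(=N)O": ACETAMIDE_INCHI}


def demo_inchi(smiles: str) -> str:
    return _DEMO_TABLE[smiles]


groups = group_by_identifier(["CC(=O)N", "CC(=N)O"], demo_inchi)
for inchi, records in groups.items():
    print(inchi, [r.original_smiles for r in records])
```

Both inputs land under the same identifier, but each record still carries the 
SMILES it was loaded from, so the amide form is never replaced by whatever 
structure the InChI would regenerate.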

Best,
Markus

-------------------------------------
|  Markus Sitzmann
|  markus.sitzm...@gmail.com

> On 14. Jun 2018, at 23:33, Jeff van Santen <jeffrey_van_san...@sfu.ca> wrote:
> 
> Hi all,
> 
> 
> I have some questions about how rdkit handles amides. For context, I am 
> working with a large set of molecules, many of which contain peptides. I have 
> been running into a problem with rdkit: when I try to 
> load a molecule from the InChI, the wrong tautomer is loaded. As a simple 
> example, consider acetamide:
> 
> 
> """
> 
> from rdkit import Chem
> from rdkit.Chem import rdMolDescriptors
> 
> FromInchi = Chem.MolFromInchi('InChI=1S/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)')
> 
> print(rdMolDescriptors.CalcNumAmideBonds(FromInchi))
> 
> > 0
> 
> print(Chem.MolToSmiles(FromInchi))
> 
> > CC(=N)O
> 
> 
> 
> FromSmiles = Chem.MolFromSmiles('CC(=O)N')
> 
> print(rdMolDescriptors.CalcNumAmideBonds(FromSmiles))
> 
> > 1
> 
> print(Chem.MolToSmiles(FromSmiles))
> 
> > CC(N)=O
> 
> """
> 
> 
> I realize that Standard InChI does not have a mechanism for distinguishing 
> between the two tautomers, so I am wondering why rdkit considers the iminol 
> to be a better representation? Also, is there any way to get the amide 
> instead (without using MolVS)?
> 
> 
> Thanks,
> 
> Jeff
> 
> 
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss