Hi Rocco,

On Fri, Jun 15, 2018 at 3:29 PM Rocco Moretti <[email protected]> wrote:

>
> Is there an easy way from within RDKit to take an arbitrary amide tautomer
> and convert it to the "correct" (according to chemists) one?
>

I suspect it's tricky to define a transformation that handles an arbitrary
tautomer, but dealing with this specific one isn't too hard:

In [4]: ims = [Chem.MolFromSmiles(x) for x in ('C(O)=N','C(O)=NC')]

In [5]: tf = AllChem.ReactionFromSmarts('[C:1](-[OH:2])=[N:3]>>[C:1](=[O:2])-[N:3]')

In [7]: ps = [tf.RunReactants((x,))[0][0] for x in ims]

In [8]: _ = [Chem.SanitizeMol(x) for x in ps]

In [9]: [Chem.MolToSmiles(x) for x in ps]
Out[9]: ['NC=O', 'CNC=O']
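
For molecules with more than one such group (peptides, for instance), the same transform can be applied repeatedly until nothing matches. What follows is only a minimal sketch along those lines, not part of the session above: the helper name iminol_to_amide and the max_steps cap are made up for illustration, and it has only been reasoned through for molecules built from SMILES.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Same iminol -> amide transform as in the session above.
IMINOL_TO_AMIDE = AllChem.ReactionFromSmarts(
    '[C:1](-[OH:2])=[N:3]>>[C:1](=[O:2])-[N:3]')


def iminol_to_amide(mol, max_steps=100):
    """Hypothetical helper: convert iminol groups to amides, one per pass.

    RunReactants returns one product set per match, each with a single
    site converted, so we take the first product, sanitize it, and try
    again until the pattern no longer matches. max_steps is an arbitrary
    safety cap. If nothing matches, the input molecule is returned as-is.
    """
    for _ in range(max_steps):
        products = IMINOL_TO_AMIDE.RunReactants((mol,))
        if not products:
            break
        mol = products[0][0]
        Chem.SanitizeMol(mol)
    return mol


if __name__ == '__main__':
    # A made-up example with two iminol groups.
    m = Chem.MolFromSmiles('CC(O)=NCC(O)=NC')
    print(Chem.MolToSmiles(iminol_to_amide(m)))
```

The loop is there because a single RunReactants call only fixes one site per product; a molecule that already has the amide form passes through unchanged.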


-greg



>
> On Fri, Jun 15, 2018 at 12:26 AM, Markus Sitzmann <
> [email protected]> wrote:
>
>> Hi Jeff,
>>
>> That is because InChI is a structure identifier, not a structure
>> representation. The difference is that a structure identifier
>> normalizes the structure to a form it regards as the standard
>> representation of the molecule, so that the molecule is identifiable
>> regardless of the state in which it arrives from an input source
>> (and hence always yields the same identifier).
>>
>> For Standard InChI, the decision was made to make it insensitive to
>> tautomers (within the limitations of the InChI algorithm). Unfortunately,
>> this normalizes most amides to a form that chemists regard as the
>> incorrect one. Also unfortunately, you can convert the InChI back to a
>> structure representation, which is then of course the normalized or
>> standardized form of the molecule.
>>
>> So if you want to be sure of keeping the original representation of a
>> molecule, don’t use InChI as your representation format (calculate the
>> InChI as an identifier field next to it). If your input source only
>> provides InChI or Standard InChI, then you are of course out of luck.
>>
>> Best,
>> Markus
>>
>> -------------------------------------
>> |  Markus Sitzmann
>> |  [email protected]
>>
>> On 14. Jun 2018, at 23:33, Jeff van Santen <[email protected]>
>> wrote:
>>
>> Hi all,
>>
>>
>> I have some questions about how RDKit handles amides. For context, I am
>> working with a large set of molecules, many of which contain peptides. I
>> have been running into a problem with RDKit: when I try to load a
>> molecule from its InChI, the wrong tautomer is loaded. As a simple
>> example, consider acetamide:
>>
>>
>> """
>>
>> FromInchi = Chem.MolFromInchi('InChI=1S/C2H5NO/c1-2(3)4/h1H3,(H2,3,4)')
>>
>> print(rdMolDescriptors.CalcNumAmideBonds(FromInchi))
>>
>> > 0
>>
>> print(Chem.MolToSmiles(FromInchi))
>>
>> > CC(=N)O
>>
>>
>> FromSmiles = Chem.MolFromSmiles('CC(=O)N')
>>
>> print(rdMolDescriptors.CalcNumAmideBonds(FromSmiles))
>>
>> > 1
>>
>> print(Chem.MolToSmiles(FromSmiles))
>>
>> > CC(N)=O
>>
>> """
>>
>>
>> I realize that Standard InChI does not have a mechanism for
>> distinguishing between the two tautomers, so I am wondering why RDKit
>> considers the iminol to be a better representation. Also, is there any
>> way to get the amide instead (without using MolVS)?
>>
>>
>> Thanks,
>>
>> Jeff
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>>
>>
>>
>
>