Hi Ansgar,

This is still using the MolVS tautomer-handling code, since we didn't finish the canonicalization part during last year's Google Summer of Code.[1] That means it's not using the parameter file that you found. The rules that are used are here:
https://github.com/rdkit/rdkit/blob/master/rdkit/Chem/MolStandardize/tautomer.py

You can change those at runtime, but you need to be careful to properly re-import modules after doing so. Here's an example showing how to do that:
https://gist.github.com/greglandrum/4ac2b4e7f8c61e25836e106467aef150

I'm not going to claim that the SMARTS which I constructed to change the 1,3 (thio)keto/enol rule is the right one, but it does at least show how to make the changes and reload the standardize module so that it takes effect.

I hope this helps,
-greg

[1] and I haven't made it a priority because I dread the "no, that's not the right canonical tautomer" arguments that will ensue

On Tue, Jul 23, 2019 at 8:54 AM Schuffenhauer, Ansgar <ansgar.schuffenha...@novartis.com> wrote:

> Hi Greg
>
> Thanks for your quick answer. What I am doing is essentially the following:
>
> from rdkit.Chem import MolStandardize
> my_standardizer = MolStandardize.standardize.Standardizer()
> standard_tautomer = my_standardizer.tautomer_parent(input_mol)
>
> I assume that at the stage I construct my_standardizer there would be
> some opportunity slip in an alternative configuration info
>
> By the way, I think also that one of the two cases of vanishing
> stereo-chemistry reported in https://github.com/rdkit/rdkit/issues/2363
> is caused by an overly eager keto/enol tautomerizer.
>
> Best regards
>
> Ansgar
>
> *Ansgar Schuffenhauer*
> Senior Investigator I
> T +41 79 608 9063
> ansgar.schuffenha...@novartis.com
>
> *Novartis Pharma AG*
> NIBR
>
> *From:* Greg Landrum <greg.land...@gmail.com>
> *Sent:* Monday, 22 July 2019 17:42
> *To:* Schuffenhauer, Ansgar <ansgar.schuffenha...@novartis.com>
> *Cc:* rdkit-discuss@lists.sourceforge.net
> *Subject:* Re: [Rdkit-discuss] Rdkit-discuss Digest, Vol 141, Issue 16
>
> Hi Ansgar,
>
> It is possible to specify the tautomer parameter file that is used, but in
> order for me to explain how, I need to know how you are currently using the
> code to enumerate tautomers (i.e. which function you are calling).
>
> As for the format: it's tab-delimited and the first entry is the name. The
> "r/f" flag is an indicator of which direction the transform is going that
> is just there to make the name unique.
>
> In the SMARTS the first atom is the one with the mobile H and the last
> atom is where it should be moved to.
>
> -greg
>
> On Mon, Jul 22, 2019 at 3:08 PM Schuffenhauer, Ansgar <ansgar.schuffenha...@novartis.com> wrote:
>
> Dear all
>
> For the standardizer module (Chem.MolStandardize), what is the best way to
> change some of the tautomerizer rules?
> There is a data file in
> share/RDKit/Data/Molstandardize/tautomerTransforms.in which I assume to
> define the default.
>
> // Name                   SMARTS                                       Bonds  Charges
> 1,3 (thio)keto/enol f     [CX4!H0]-[C]=[O,S,Se,Te;X1]
> 1,3 (thio)keto/enol r     [O,S,Se,Te;X2!H0]-[C]=[C]
> 1,5 (thio)keto/enol f     [CX4,NX3;!H0]-[C]=[C][CH0]=[O,S,Se,Te;X1]
> 1,5 (thio)keto/enol r     [O,S,Se,Te;X2!H0]-[CH0]=[C]-[C]=[C,N]
> ...
>
> Now my questions are
> 1. What is the Syntax of this file? What does the "f" and the "r" stand
>    for? Do the smarts have to start with the atom carrying the mobile H?
> 2. How can I instruct rdkit not to use this default file, but the one
>    supplied by the user.
>
> The background for this question that the smarts for keto/enol seems to be
> a bit too generic, as it catches also the alpha C-atoms of carboxylic acids
> and amides. Generation of tautomers here leads to a epimerization of
> stereo-centers in alpha positions of carboxylic acids and amides. That
> appears odd to me, as such stereo-centers are quite stable (in contrast to
> those of "real" ketones and aldehydes).
>
> Best regards
>
> Ansgar
>
> Ansgar Schuffenhauer
> Senior Investigator I
> T +41 79 608 9063
> ansgar.schuffenha...@novartis.com
>
> Novartis Pharma AG
> NIBR
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

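The file format described in the quoted thread (tab-delimited, name first, then the SMARTS, with optional Bonds and Charges columns) can be read with a few lines of plain Python. This is only a sketch and does not use RDKit: the names `TransformRule` and `parse_transforms` are made up for illustration, and it assumes that lines starting with "//" are comments and that columns are separated by single tabs, which the excerpt suggests but does not guarantee.

```python
from typing import List, NamedTuple


class TransformRule(NamedTuple):
    """One row of a tautomer transform table (hypothetical container)."""
    name: str      # e.g. "1,3 (thio)keto/enol f"; the trailing f/r only makes the name unique
    smarts: str    # first atom carries the mobile H, last atom receives it
    bonds: str     # optional column, empty string if absent
    charges: str   # optional column, empty string if absent


def parse_transforms(text: str) -> List[TransformRule]:
    """Parse tab-delimited transform lines, skipping blanks and '//' comments."""
    rules = []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("//"):
            continue  # assumption: '//' marks a comment or header line
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) < 2:
            raise ValueError(f"expected at least name and SMARTS: {line!r}")
        # Pad the optional Bonds and Charges columns with empty strings.
        fields += [""] * (4 - len(fields))
        rules.append(TransformRule(*fields[:4]))
    return rules


# Two rows taken from the excerpt in the thread, with explicit tabs.
SAMPLE = (
    "// Name\tSMARTS\tBonds\tCharges\n"
    "1,3 (thio)keto/enol f\t[CX4!H0]-[C]=[O,S,Se,Te;X1]\n"
    "1,3 (thio)keto/enol r\t[O,S,Se,Te;X2!H0]-[C]=[C]\n"
)

rules = parse_transforms(SAMPLE)
for rule in rules:
    print(rule.name, "->", rule.smarts)
```

A parsed list like this makes it easy to inspect or filter rules before handing them to whatever consumes them; how the result is fed back into the standardizer is not covered here.
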
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
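
The point in the reply about re-importing modules after changing the rules can be illustrated without RDKit. The sketch below writes two throwaway modules, `toy_rules` and `toy_standardize` (both hypothetical stand-ins, not the RDKit source), where the second copies the rule list from the first at import time. It assumes the real modules behave similarly, which is consistent with the advice to reload the standardize module but is not verified here.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the module that defines the default rules.
    Path(tmp, "toy_rules.py").write_text(
        'TRANSFORMS = ("keto/enol f", "keto/enol r")\n'
    )
    # Stand-in for the module that consumes them. The default argument is
    # bound once, when this module is executed.
    Path(tmp, "toy_standardize.py").write_text(textwrap.dedent('''
        from toy_rules import TRANSFORMS

        class Standardizer:
            def __init__(self, transforms=TRANSFORMS):
                self.transforms = transforms
    '''))

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    import toy_rules
    import toy_standardize

    # Change the rules at runtime.
    toy_rules.TRANSFORMS = ("custom keto/enol f",)

    # Without a reload the consumer still holds the old tuple.
    before = toy_standardize.Standardizer().transforms

    # Reloading re-executes the consumer, so it picks up the new rules.
    importlib.reload(toy_standardize)
    after = toy_standardize.Standardizer().transforms

    sys.path.remove(tmp)

print("before reload:", before)
print("after reload: ", after)
```

Passing the rules explicitly to the constructor, where that is supported, avoids the reload entirely because nothing depends on when the default was captured.
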