Hello,

I think it's a bug, because the enumerated tautomers depend on how the input
SMILES is written. Both of these represent mol1:

Sc1ncc2c(c1)cccc2
Sc1cc2ccccc2cn1

However, the resulting tautomers differ depending on which one is used as input.

Best regards,
Diogo

On Mon, 5 Feb 2024 at 11:38, Lewis Martin <lewis.marti...@gmail.com> wrote:

> Thank you very much for the detective work, Wim! This is helpful.
>
> It looks like the _reverse_ transition is possible, though. If I start by
> generating tautomers of "mol2", then "mol1" is recovered, which indicates
> this is an allowed transform. Is it possible that one direction is allowed
> but not the reverse?
>
> Failing a solution there, does anyone know if it is possible to add SMIRKS
> to the allowed tautomers through the python interface?
> Thanks,
> Lewis
>
> On Mon, Feb 5, 2024 at 9:52 PM Wim Dehaen <wimdeh...@gmail.com> wrote:
>
>> hi lewis,
>> if i am not mistaken this is because the tautomer transform "1,3 aromatic
>> heteroatom H shift" does not account for other chalcogens than oxygen, so
>> no selenium, tellurium or sulfur.
>> you can find the list of transforms here:
>> https://github.com/rdkit/rdkit/blob/8dae48b7a17fd984c69d04549e6d9b53690f5c52/Code/GraphMol/MolStandardize/TautomerCatalog/tautomerTransforms.in#L46
>> (pointing to the line with the relevant transform).
>> best wishes
>> wim
>>
>> On Mon, Feb 5, 2024 at 3:26 AM Lewis Martin <lewis.marti...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>> I'm looking at scoring tautomers, and using the 'tautobase' dataset used
>>> by Weider et al* at:
>>>
>>> https://github.com/choderalab/neutromeratio/blob/master/data/b3lyp_tautobase_subset.txt
>>>
>>> This dataset has pairs of tautomers with experimental logK values to
>>> determine the preferred tautomer.
>>>
>>> In at least one case, depending on which tautomer you use as the 'entry'
>>> point, the tautomers enumerated by RDKit either do or don't include both
>>> of the pair of input molecules. *I'm hoping there's a way to uniquely
>>> recover the full set of possible tautomers from any input tautomer.*
>>>
>>> Here's a code example:
>>>
>>> from rdkit import Chem
>>> from rdkit.Chem import Draw
>>> from rdkit.Chem.Draw import IPythonConsole
>>> from rdkit.Chem.MolStandardize import rdMolStandardize
>>>
>>> IPythonConsole.drawOptions.addStereoAnnotation = True
>>>
>>> # Same result if you don't set any of these params.
>>> tautomer_params = rdMolStandardize.CleanupParameters()
>>> tautomer_params.tautomerRemoveSp3Stereo = False
>>> tautomer_params.tautomerRemoveBondStereo = False
>>> tautomer_params.tautomerRemoveIsotopicHs = False
>>> tautomer_params.tautomerReassignStereo = False
>>> tautomer_params.doCanonical = True
>>>
>>> enumerator = rdMolStandardize.TautomerEnumerator(tautomer_params)
>>>
>>> smi1 = 'Sc1cc2ccccc2cn1'
>>> smi2 = 'S=c1cc2ccccc2c[nH]1'
>>> mol1 = Chem.MolFromSmiles(smi1)
>>> mol2 = Chem.MolFromSmiles(smi2)
>>>
>>> # Choose mol1 or mol2 to be the source of tautomers.
>>> # Here we choose mol1 and look at the tautomers. Note that mol2 isn't present!
>>> tauts = [Chem.MolFromSmiles(Chem.MolToSmiles(m))
>>>          for m in enumerator.Enumerate(mol1)]
>>>
>>> Draw.MolsToGridImage(
>>>     [mol1, mol2] + tauts,
>>>     legends=['mol1', 'mol2 (not present in tauts!)']
>>>             + [f'taut{i}' for i in range(len(tauts))],
>>>     molsPerRow=4)
>>>
>>> And a picture of this in a notebook for an at-a-glance view:
>>> https://gist.github.com/ljmartin/4a9d9eb684df3e11e59fc6502a4b7b03
>>>
>>> Does anyone know a way to recover "mol2" within tautomers of "mol1"?
>>>
>>> Thank you!
>>> Lewis
>>>
>>> _______________________________________________
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
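On the quoted question of whether one direction of a transform can be allowed without the reverse: one way to picture it is to treat each tautomer as a node and each allowed transform as a directed edge, so that enumeration from a starting structure is just reachability. The sketch below is a toy model in plain Python and does not call RDKit or reproduce its algorithm. The ALLOWED_STEPS table and the helper names enumerate_reachable and symmetrize are hypothetical, chosen only to mirror the behaviour reported in this thread (mol2 leads to mol1, but not the other way round).

```python
from collections import deque

# Toy model only, not RDKit's implementation. Each key is a tautomer and each
# value lists the forms a single allowed transform can produce from it.
# The single edge is an assumed stand-in for "thione -> thiol is matched,
# thiol -> thione is not", as reported in the thread.
MOL1 = "Sc1cc2ccccc2cn1"        # thiol form
MOL2 = "S=c1cc2ccccc2c[nH]1"    # thione form
ALLOWED_STEPS = {
    MOL2: [MOL1],  # mol2 -> mol1 is allowed
    MOL1: [],      # no transform matches mol1
}


def enumerate_reachable(start, steps):
    """Return every form reachable from `start` by repeatedly applying steps."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in steps.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def symmetrize(steps):
    """Return a copy of `steps` in which every transform also runs in reverse."""
    both = {src: set(dsts) for src, dsts in steps.items()}
    for src, dsts in steps.items():
        for dst in dsts:
            both.setdefault(dst, set()).add(src)
    return both


# With one-directional steps the result depends on the entry point.
print(enumerate_reachable(MOL1, ALLOWED_STEPS) == enumerate_reachable(MOL2, ALLOWED_STEPS))

# With every step reversible, both entry points give the same set.
reversible = symmetrize(ALLOWED_STEPS)
print(enumerate_reachable(MOL1, reversible) == enumerate_reachable(MOL2, reversible))
```

In this picture, an entry-point-independent enumeration needs every transform to be applicable in both directions, which is the property the missing sulfur case appears to break.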