Hi all,
I'm looking at scoring tautomers using the 'tautobase' dataset used by
Wieder et al. at:
https://github.com/choderalab/neutromeratio/blob/master/data/b3lyp_tautobase_subset.txt

This dataset contains pairs of tautomers with experimental logK values that
indicate which tautomer of each pair is preferred.

In at least one case, the set of tautomers enumerated by RDKit depends on
which member of the pair you use as the 'entry' point: starting from one
tautomer, the set includes both input molecules; starting from the other, it
does not. *I'm hoping there's a way to recover the same full set of possible
tautomers from any input tautomer.*

Here's a code example:

from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem.Draw import IPythonConsole
from rdkit.Chem.MolStandardize import rdMolStandardize

IPythonConsole.drawOptions.addStereoAnnotation = True

# Same result if you don't set any of these params.
tautomer_params = rdMolStandardize.CleanupParameters()
tautomer_params.tautomerRemoveSp3Stereo = False
tautomer_params.tautomerRemoveBondStereo = False
tautomer_params.tautomerRemoveIsotopicHs = False
tautomer_params.tautomerReassignStereo = False
tautomer_params.doCanonical = True

enumerator = rdMolStandardize.TautomerEnumerator(tautomer_params)

smi1 = 'Sc1cc2ccccc2cn1'
smi2 = 'S=c1cc2ccccc2c[nH]1'
mol1 = Chem.MolFromSmiles(smi1)
mol2 = Chem.MolFromSmiles(smi2)

# Choose mol1 or mol2 to be the source of tautomers.
# Here: choose mol1 and look at the tautomers. Note that mol2 isn't present!
tauts = [Chem.MolFromSmiles(Chem.MolToSmiles(m))
         for m in enumerator.Enumerate(mol1)]

Draw.MolsToGridImage(
    [mol1, mol2] + tauts,
    legends=['mol1', 'mol2 (not present in tauts!)']
            + [f'taut{i}' for i in range(len(tauts))],
    molsPerRow=4,
)

And a picture of this in a notebook for an at-a-glance view:
https://gist.github.com/ljmartin/4a9d9eb684df3e11e59fc6502a4b7b03
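
The same comparison can be done without looking at the picture. This is only a
minimal sketch, assuming that matching canonical SMILES is a good enough test
for "the same tautomer"; tautomer_smiles and contains_tautomer are helper names
I made up for this example, not RDKit functions, and a default-constructed
TautomerEnumerator is used for brevity:

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize


def tautomer_smiles(enumerator, mol):
    """Return the canonical SMILES of every tautomer enumerated from mol."""
    return {Chem.MolToSmiles(t) for t in enumerator.Enumerate(mol)}


def contains_tautomer(enumerator, source, target):
    """True if target's canonical SMILES is among the tautomers of source.

    Assumption: canonical SMILES equality is an adequate identity check.
    """
    return Chem.MolToSmiles(target) in tautomer_smiles(enumerator, source)


enumerator = rdMolStandardize.TautomerEnumerator()
mol1 = Chem.MolFromSmiles('Sc1cc2ccccc2cn1')
mol2 = Chem.MolFromSmiles('S=c1cc2ccccc2c[nH]1')

# Check both directions; if enumeration were independent of the entry
# point, both lines would print the same answer.
print('mol2 among tautomers of mol1:', contains_tautomer(enumerator, mol1, mol2))
print('mol1 among tautomers of mol2:', contains_tautomer(enumerator, mol2, mol1))
```

Running the check in both directions over every pair in the dataset would show
how many pairs are affected, rather than just this one.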

Does anyone know a way to recover "mol2" within tautomers of "mol1"?

Thank you!
Lewis
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
