Hello,

I think it's a bug, because the enumerated tautomers depend on how the input
SMILES is written. Both of the following represent mol1:

Sc1ncc2c(c1)cccc2
Sc1cc2ccccc2cn1

However, the resulting tautomers differ depending on which one is used as input.
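
A minimal sketch of how this can be checked, assuming a default-constructed
TautomerEnumerator (no custom parameters). The helper name tautomer_smiles is
only illustrative and not part of RDKit. It confirms that both strings
canonicalize to the same molecule, then compares the sets of enumerated
tautomers as canonical SMILES:

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize


def tautomer_smiles(smiles):
    """Return the canonical SMILES of every tautomer RDKit enumerates.

    Illustrative helper (not an RDKit function); uses default enumerator settings.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    enumerator = rdMolStandardize.TautomerEnumerator()
    return {Chem.MolToSmiles(taut) for taut in enumerator.Enumerate(mol)}


smi_a = "Sc1ncc2c(c1)cccc2"
smi_b = "Sc1cc2ccccc2cn1"

# Both input strings should describe the same molecule.
can_a = Chem.MolToSmiles(Chem.MolFromSmiles(smi_a))
can_b = Chem.MolToSmiles(Chem.MolFromSmiles(smi_b))
same_molecule = can_a == can_b

# Enumerate from each spelling and compare the results as sets.
tauts_a = tautomer_smiles(smi_a)
tauts_b = tautomer_smiles(smi_b)

print("same molecule:", same_molecule)
print("only from first SMILES: ", sorted(tauts_a - tauts_b))
print("only from second SMILES:", sorted(tauts_b - tauts_a))
```

If either of the two difference lists is non-empty, the enumeration depends on
the atom ordering of the input. The exact sets may vary between RDKit versions.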

Best regards,
Diogo

On Mon, 5 Feb 2024 at 11:38, Lewis Martin <lewis.marti...@gmail.com> wrote:

> Thank you very much for the detective work, Wim! This is helpful.
>
> It looks like the _reverse_ transition is possible, though. If I start by
> generating tautomers of "mol2", then "mol1" is recovered, which indicates
> this is an allowed transform. Is it possible that one direction is allowed
> but not the reverse?
>
> Failing a solution there, does anyone know if it is possible to add SMIRKS
> to the allowed tautomers through the python interface?
> Thanks,
> Lewis
>
> On Mon, Feb 5, 2024 at 9:52 PM Wim Dehaen <wimdeh...@gmail.com> wrote:
>
>> hi lewis,
>> if i am not mistaken this is because the tautomer transform "1,3 aromatic
>> heteroatom H shift" does not account for chalcogens other than oxygen, so
>> no selenium, tellurium or sulfur.
>> you can find the list of transforms here:
>> https://github.com/rdkit/rdkit/blob/8dae48b7a17fd984c69d04549e6d9b53690f5c52/Code/GraphMol/MolStandardize/TautomerCatalog/tautomerTransforms.in#L46
>> (pointing to the line with the relevant transform).
>> best wishes
>> wim
>>
>> On Mon, Feb 5, 2024 at 3:26 AM Lewis Martin <lewis.marti...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>> I'm looking at scoring tautomers, and using the 'tautobase' dataset used
>>> by Wieder et al* at:
>>>
>>> https://github.com/choderalab/neutromeratio/blob/master/data/b3lyp_tautobase_subset.txt
>>>
>>> This dataset has pairs of tautomers with experimental logK values to
>>> determine the preferred tautomer.
>>>
>>> In at least one case, depending on which tautomer you use as the 'entry'
>>> point, the enumerated tautomers by RDKit either do or don't include both of
>>> the pair of input molecules. *I'm hoping there's a way to uniquely
>>> recover the full set of possible tautomers from using any input tautomer. *
>>>
>>> Here's a code example:
>>>
>>>> from rdkit import Chem
>>>> from rdkit.Chem import Draw
>>>> from rdkit.Chem.Draw import IPythonConsole
>>>> IPythonConsole.drawOptions.addStereoAnnotation = True
>>>> from rdkit.Chem.MolStandardize import rdMolStandardize
>>>>
>>>> # same result if you don't set any of these params.
>>>> tautomer_params = rdMolStandardize.CleanupParameters()
>>>> tautomer_params.tautomerRemoveSp3Stereo = False
>>>> tautomer_params.tautomerRemoveBondStereo = False
>>>> tautomer_params.tautomerRemoveIsotopicHs = False
>>>> tautomer_params.tautomerReassignStereo = False
>>>> tautomer_params.doCanonical = True
>>>>
>>>> enumerator = rdMolStandardize.TautomerEnumerator(tautomer_params)
>>>>
>>>> smi1 = 'Sc1cc2ccccc2cn1'
>>>> smi2 = 'S=c1cc2ccccc2c[nH]1'
>>>> mol1 = Chem.MolFromSmiles(smi1)
>>>> mol2 = Chem.MolFromSmiles(smi2)
>>>>
>>>> # choose mol1 or mol2 to be the source of tautomers:
>>>> # choose mol1, and look at the tautomers. Note that mol2 isn't present!
>>>> tauts = [Chem.MolFromSmiles(Chem.MolToSmiles(m))
>>>>          for m in enumerator.Enumerate(mol1)]
>>>>
>>>> legends = (['mol1', 'mol2 (not present in tauts!)']
>>>>            + [f'taut{i}' for i in range(len(tauts))])
>>>> Draw.MolsToGridImage([mol1, mol2] + tauts, legends=legends,
>>>>                      molsPerRow=4)
>>>>
>>>
>>> And a picture of this in a notebook for an at-a-glance view:
>>> https://gist.github.com/ljmartin/4a9d9eb684df3e11e59fc6502a4b7b03
>>>
>>> Does anyone know a way to recover "mol2" within tautomers of "mol1"?
>>>
>>> Thank you!
>>> Lewis
>>>
>>>
>>> _______________________________________________
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>