"IIRC, [Roger] gives an example of a large chemical supplier who offered
two tautomers of the same compound for sale at very different prices which
is at least embarrassing."
On the other hand, it provides a great opportunity for arbitrage. ;-)
-P.
On Tue, Apr 18, 2017 at 4:02 AM, David Cosgrove wrote:
> Hi JW et al.,
> One of the last things I worked on before leaving AZ was what we called a
> tautomer-independent molecular representation. What we meant by this was a
> way of spotting whether a new compound being registered into the corporate
> collection was a tautomer of one already in the database. As part of that,
> I looked at the InChI representation and the tautomer handling, which was at
> that point labelled experimental. In our view, it was very limited in the
> types of tautomers it represented and not adequate to our needs. As a
> result I developed a program called tt_tauts, which AZ "open-sourced" when
> they made me redundant, and is available at
> https://github.com/OpenEye-Contrib/TT_Tauts. It's another plug for OEChem,
> I'm afraid, which seems poor form on the RDKit website, but there you go. It is also a long way
> from being complete, and I am still working on it as a somewhat masochistic
> hobby. Internally at CozChemIx Towers it is known as 'The Mole Project' in
> honour of the game 'Whac-A-Mole' (https://en.wikipedia.org/wiki/Whac-A-Mole) -
> every time you squash an odd tautomer case, another
> one pops up, quite often one you've already dealt with. ChEMBL is a
> marvelous source of nasty test cases. I hope to have a better version on
> github soon and also a description of the algorithm on my website. It used
> the work of Thalheim et al. as a jumping-off point
> (http://onlinelibrary.wiley.com/doi/10.1002/minf.201400128/full).
> Note that this use of tautomer enumeration/representation is somewhat
> different from that of quacpac or taut_enum. Those two are concerned
> with predicting the tautomers likely to be present in water (well, blood,
> probably) at roughly neutral pH; tt_tauts is trying to deal with two
> chemists drawing the same compound as different tautomers, which may look
> quite different, with the hydrogen atoms shifted a long way. Both are
> difficult and unsolved problems. In one of Roger Sayle's papers on
> tautomers, IIRC, he gives an example of a large chemical supplier who
> offered two tautomers of the same compound for sale at very different
> prices which is at least embarrassing.
> Cheers,
> Dave
>
> On Tue, Apr 18, 2017 at 1:23 AM, JW Feng wrote:
>
>> Hi Maria,
>>
>> Looking at Roger's slides at
>> https://github.com/rdkit/UGM_2016/blob/master/Presentations/Sayle_RDKitTautomers.pdf,
>> is he arguing that InChI values are insufficient for generating a canonical
>> string for different tautomers? What if you perform a set of
>> standardization transformations prior to generating InChI values? You may
>> want to look at how Genentech normalizes molecules for compound
>> registration. The code is based on OEChem and is open-sourced on GitHub at
>> https://github.com/chemalot/chemalot. This package is actively being
>> developed and I am a contributor. Specifically, you'll want to look at the
>> extensive standardization transformations in
>> https://github.com/chemalot/chemalot/blob/master/src/com/genentech/struchk/oeStruchk/Struchk.xml
>>
>> The last step in Struchk.xml is creating a canonical tautomer using
>> OpenEye's QuacPac toolkit. QuacPac returns a canonical tautomer. Could
>> one replace this step by converting a standardized molecule to InChI and
>> then back? Another approach is using Dave Cosgrove's TautEnum package (
>> https://github.com/OpenEye-Contrib/TautEnum). Both QuacPac and TautEnum
>> enumerate tautomers. I believe that Roger is intimately familiar with
>> QuacPac.
>>
>> Best,
>>
>> JW
>>
>> ___
>> JW Feng, Ph.D.
>> Denali Therapeutics Inc.
>> 151 Oyster Point Blvd, 2nd Floor, South San Francisco, CA 94080 | (650)
>> 270-0628
>>
>> On Tue, Apr 11, 2017 at 6:52 AM, > ourceforge.net> wrote:
>>
>>> Today's Topics:
>>>
>>>    1. tautomers in rdkit (MARIA BRANDL)
>>>    2. Re: tautomers in rdkit (Peter S. Shenkin)
>>>    3. official Tripos MOL2 file format PDF document (Francois BERENGER)
>>>
>>>
>>>
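
The registration problem Dave describes above (spotting whether a new compound is a tautomer of one already in the database) comes down to looking structures up by a tautomer-independent key instead of by the structure as drawn. A minimal sketch of that lookup follows. It is not the tt_tauts algorithm or anything from Struchk.xml: the class name, the key function, and the compound IDs are all hypothetical, and the toy key function is a hand-written table covering one tautomer pair. In practice the key would come from a real canonicaliser such as a QuacPac canonical tautomer or an InChI string.

```python
from typing import Callable, Dict, Optional


class TautomerAwareRegistry:
    """Hypothetical registry that indexes compounds by a tautomer-independent key."""

    def __init__(self, key_fn: Callable[[str], str]) -> None:
        # key_fn maps a structure (here a SMILES string) to a key that should be
        # identical for all tautomers of the same compound.
        self._key_fn = key_fn
        self._by_key: Dict[str, str] = {}

    def find_existing(self, smiles: str) -> Optional[str]:
        """Return the ID of an already registered tautomer, or None."""
        return self._by_key.get(self._key_fn(smiles))

    def register(self, smiles: str, compound_id: str) -> str:
        """Register a structure; return the ID that ends up owning its key."""
        # setdefault keeps the first ID registered for a key, so a second
        # tautomer of the same compound gets the original ID back.
        return self._by_key.setdefault(self._key_fn(smiles), compound_id)


# Toy stand-in for a real canonicaliser (ASSUMPTION: illustration only).
# 2-hydroxypyridine and 2-pyridone are tautomers, so they share one key.
_TOY_KEYS = {
    "Oc1ccccn1": "2-pyridone-family",
    "O=c1cccc[nH]1": "2-pyridone-family",
}


def toy_key(smiles: str) -> str:
    # Structures missing from the toy table fall back to the raw string.
    return _TOY_KEYS.get(smiles, smiles)


registry = TautomerAwareRegistry(toy_key)
first = registry.register("Oc1ccccn1", "CPD-0001")       # new compound
second = registry.register("O=c1cccc[nH]1", "CPD-0002")  # tautomer of CPD-0001
other = registry.register("c1ccccc1", "CPD-0003")        # unrelated compound
```

Everything hard lives in the key function. The registry itself is trivial, which is why the thread is about how good the available canonicalisers are and not about the database lookup.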