Thinking about it a bit more, this "convert bonds to x/aromatic"
functionality might be a good addition to the AdjustQueryProperties()
function discussed here:
http://rdkit.blogspot.ch/2015/08/tuning-substructure-queries.html
and here:
https://github.com/rdkit/rdkit/issues/567

-greg
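
For reference, here is a minimal sketch of the "single/aromatic" and "double/aromatic" query-bond idea, written as SMARTS. It is hand-written for this one stub and is not the proposed AdjustQueryProperties() feature; the names quinone, stub and dione are only illustrative.

```python
from rdkit import Chem

# Target: p-benzoquinone, which the RDKit perceives as non-aromatic.
quinone = Chem.MolFromSmiles("O=C1C=CC(=O)C=C1")

# Stub query written as SMARTS. Each ring bond accepts either its Kekule
# order or an aromatic bond: "-,:" is single-or-aromatic and "=,:" is
# double-or-aromatic. The ring-closure bond is left implicit because an
# unspecified SMARTS bond already means single-or-aromatic.
stub = Chem.MolFromSmarts("*=[#6]1-,:[#6]=,:[#6]-,:[#6](=*)-,:[#6]=,:[#6]1")

print(quinone.HasSubstructMatch(stub))

# Control: cyclohexane-1,4-dione has only single bonds in its ring, so the
# "=,:" positions cannot be satisfied.
dione = Chem.MolFromSmiles("O=C1CCC(=O)CC1")
print(dione.HasSubstructMatch(stub))
```

This should print True for quinone and False for the dione. Because every ring bond also accepts an aromatic bond, the same query does not depend on whether the target ring ends up Kekulized or aromatic after sanitization.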


On Tue, Sep 8, 2015 at 4:48 AM, Greg Landrum <[email protected]> wrote:

> Hi Rocco,
>
> Apologies for the slow reply; the RDKit UGM last week consumed all my
> attention.
>
> What you are observing is a consequence of the RDKit's aromaticity model (
> http://rdkit.org/docs/RDKit_Book.html#aromaticity): the exocyclic double
> bonds to O in quinone cause the 6-ring there to be non-aromatic. The dummy
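> atoms in the query molecule, on the other hand, do not perturb the
> aromaticity of the ring, so the query ring is perceived as aromatic and
> its aromatic bonds cannot match the single and double bonds in quinone.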
>
> This is one of those edge cases that is currently not straightforward to
> solve without either using SMARTS or adding query bonds ("single/aromatic"
> and "double/aromatic") to the stub query.
>
> -greg
>
>
>
> On Thu, Sep 3, 2015 at 11:45 PM, Rocco Moretti <[email protected]>
> wrote:
>
>> Hello,
>>
>> I'm seeing unexpected results when trying to match a search query encoded
>> as an MDL Molfile. It looks like I'm not getting any matches when the
>> oxygens of a quinone are replaced with placeholder atoms in an otherwise
>> identical structure.
>>
>> That is, if I take the molfile for quinone, copy it and only change the
>> 'O' atoms to '*' atoms, the query doesn't work, possibly due to aromaticity
>> issues:
>>
>> >>> import rdkit
>> >>> from rdkit import Chem
>> >>> print(rdkit.__version__)
>> 2015.03.1
>> >>> m = Chem.MolFromMolFile("quinone_test.sdf")
>> >>> q = Chem.MolFromMolFile("quinone_stub.sdf")
>> >>> m.HasSubstructMatch(q)
>> False
>> >>> Chem.MolToSmiles(m)
>> 'O=C1C=CC(=O)C=C1'
>> >>> Chem.MolToSmiles(q)
>> '[*]=c1ccc(=[*])cc1'
>> >>> Chem.MolToSmarts(m)
>> '[#8]=[#6]1-[#6]=[#6]-[#6](-[#6]=[#6]-1)=[#8]'
>> >>> Chem.MolToSmarts(q)
>> '*=[#6]1:[#6]:[#6]:[#6](:[#6]:[#6]:1)=*'
>>
>> Note I still have issues even if I load the query as a SMILES string:
>>
>> >>> q2 = Chem.MolFromSmiles("[*]=C1-C=C-C(=[*])-C=C1")
>> >>> m.HasSubstructMatch(q2)
>> False
>> >>> Chem.MolToSmiles(q2)
>> '[*]=c1ccc(=[*])cc1'
>>
>> But not when I load it as a SMARTS string:
>>
>> >>> q3 = Chem.MolFromSmarts("[*]=C1-C=C-C(=[*])-C=C1")
>> >>> m.HasSubstructMatch(q3)
>> True
>> >>> Chem.MolToSmiles(q3)
>> '[*]=C1C=CC(=[*])C=C1'
>>
>> As using SMARTS strings is not really feasible for what I'm doing, is
>> there something I'm doing wrong with respect to loading query molecules
>> from Molfiles? The structure is already single/double Kekulized in the
>> molfile, so is there some flag or other loading function I should be using
>> to avoid spurious aromatization? (Hopefully, one that's general enough that
>> I won't have issues when loading and matching truly aromatic molecules.)
>>
>> Thanks,
>> -Rocco
>>
>> P.S. My end usage will actually be using the C++ API, if that makes a
>> difference for recommendations.
>>
>> ~~~~
>>
>> ## quinone_test.sdf, for completeness (quinone_stub.sdf is identical,
>> except for "*" instead of the two "O"):
>>
>> quinone
>> comment 1
>> comment 2
>>  12 12  0  0  0  0  0  0  0  0999 V2000
>>     1.0263   -0.0278   -0.3487 O   0  0  0  0  0  0  0  0  0  0  0  0
>>     2.2087   -0.0217   -0.0369 C   0  0  0  0  0  0  0  0  0  0  0  0
>>     2.9446    1.2428    0.1576 C   0  0  0  0  0  0  0  0  0  0  0  0
>>     4.2373    1.2490    0.4999 C   0  0  0  0  0  0  0  0  0  0  0  0
>>     4.9841   -0.0093    0.6981 C   0  0  0  0  0  0  0  0  0  0  0  0
>>     6.1658   -0.0035    1.0123 O   0  0  0  0  0  0  0  0  0  0  0  0
>>     4.2483   -1.2741    0.5019 C   0  0  0  0  0  0  0  0  0  0  0  0
>>     2.9564   -1.2801    0.1598 C   0  0  0  0  0  0  0  0  0  0  0  0
>>     2.3826    2.1566    0.0087 H   0  0  0  0  0  0  0  0  0  0  0  0
>>     4.7914    2.1678    0.6465 H   0  0  0  0  0  0  0  0  0  0  0  0
>>     4.8110   -2.1878    0.6502 H   0  0  0  0  0  0  0  0  0  0  0  0
>>     2.4019   -2.1992    0.0122 H   0  0  0  0  0  0  0  0  0  0  0  0
>>   1  2  2  0  0  0  0
>>   2  8  1  0  0  0  0
>>   2  3  1  0  0  0  0
>>   3  4  2  0  0  0  0
>>   3  9  1  0  0  0  0
>>   4  5  1  0  0  0  0
>>   4 10  1  0  0  0  0
>>   5  7  1  0  0  0  0
>>   5  6  2  0  0  0  0
>>   7  8  2  0  0  0  0
>>   7 11  1  0  0  0  0
>>   8 12  1  0  0  0  0
>> M  END
>> $$$$
>>
>> _______________________________________________
>> Rdkit-discuss mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>