Thinking about it a bit more, this "convert bonds to x/aromatic" functionality might be a good addition to the AdjustQueryProperties() function discussed here: http://rdkit.blogspot.ch/2015/08/tuning-substructure-queries.html and here: https://github.com/rdkit/rdkit/issues/567
-greg

On Tue, Sep 8, 2015 at 4:48 AM, Greg Landrum <[email protected]> wrote:

> Hi Rocco,
>
> Apologies for the slow reply; the RDKit UGM last week consumed all my
> attention.
>
> What you are observing is a consequence of the RDKit's aromaticity model (
> http://rdkit.org/docs/RDKit_Book.html#aromaticity): the exocyclic double
> bonds to O in quinone cause the 6-ring there to be non-aromatic. The dummy
> atoms in the query molecule, on the other hand, do not perturb the
> aromaticity of the ring.
>
> This is one of those edge cases that is currently not straightforward to
> solve without either using SMARTS or adding query bonds ("single/aromatic"
> and "double/aromatic") to the stub query.
>
> -greg
>
>
>
> On Thu, Sep 3, 2015 at 11:45 PM, Rocco Moretti <[email protected]>
> wrote:
>
>> Hello,
>>
>> I'm seeing unexpected results when trying to match a search query encoded
>> as an MDL Molfile. It looks like I'm not getting any matches when the
>> oxygens of a quinone are replaced with placeholder atoms in an otherwise
>> identical structure.
>>
>> That is, if I take the molfile for quinone, copy it and only change the
>> 'O' atoms to '*' atoms, the query doesn't work, possibly due to aromaticity
>> issues:
>>
>> >>> from rdkit import Chem
>> >>> print rdkit.__version__
>> 2015.03.1
>> >>> m = Chem.MolFromMolFile("quinone_test.sdf")
>> >>> q = Chem.MolFromMolFile("quinone_stub.sdf")
>> >>> m.HasSubstructMatch(q)
>> False
>> >>> Chem.MolToSmiles(m)
>> 'O=C1C=CC(=O)C=C1'
>> >>> Chem.MolToSmiles(q)
>> '[*]=c1ccc(=[*])cc1'
>> >>> Chem.MolToSmarts(m)
>> '[#8]=[#6]1-[#6]=[#6]-[#6](-[#6]=[#6]-1)=[#8]'
>> >>> Chem.MolToSmarts(q)
>> '*=[#6]1:[#6]:[#6]:[#6](:[#6]:[#6]:1)=*'
>>
>> Note I still have issues even if I load the query as a SMILES string:
>>
>> >>> q2 = Chem.MolFromSmiles("[*]=C1-C=C-C(=[*])-C=C1")
>> >>> m.HasSubstructMatch(q2)
>> False
>> >>> Chem.MolToSmiles(q2)
>> '[*]=c1ccc(=[*])cc1'
>>
>> But not when I load it as a SMARTS string:
>>
>> >>> q3 = Chem.MolFromSmarts("[*]=C1-C=C-C(=[*])-C=C1")
>> >>> m.HasSubstructMatch(q3)
>> True
>> >>> Chem.MolToSmiles(q3)
>> '[*]=C1C=CC(=[*])C=C1'
>>
>> As using SMARTS strings is not really feasible for what I'm doing, is
>> there something I'm doing wrong with respect to loading query molecules
>> from Molfiles? The structure is already single/double Kekulized in the
>> molfile, so is there some flag or other loading function I should be using
>> to avoid spurious aromatization? (Hopefully, one that's general enough that
>> I won't have issues when loading and matching truly aromatic molecules.)
>>
>> Thanks,
>> -Rocco
>>
>> P.S. My end usage will actually be using the C++ API, if that makes a
>> difference for recommendations.
>>
>> ~~~~
>>
>> ## quinone_test.sdf, for completeness (quinone_stub.sdf is identical,
>> except for "*" instead of the two "O"):
>>
>> quinone
>> comment 1
>> comment 2
>>  12 12  0  0  0  0  0  0  0  0999 V2000
>>     1.0263   -0.0278   -0.3487 O   0  0  0  0  0  0  0  0  0  0  0  0
>>     2.2087   -0.0217   -0.0369 C   0  0  0  0  0  0  0  0  0  0  0  0
>>     2.9446    1.2428    0.1576 C   0  0  0  0  0  0  0  0  0  0  0  0
>>     4.2373    1.2490    0.4999 C   0  0  0  0  0  0  0  0  0  0  0  0
>>     4.9841   -0.0093    0.6981 C   0  0  0  0  0  0  0  0  0  0  0  0
>>     6.1658   -0.0035    1.0123 O   0  0  0  0  0  0  0  0  0  0  0  0
>>     4.2483   -1.2741    0.5019 C   0  0  0  0  0  0  0  0  0  0  0  0
>>     2.9564   -1.2801    0.1598 C   0  0  0  0  0  0  0  0  0  0  0  0
>>     2.3826    2.1566    0.0087 H   0  0  0  0  0  0  0  0  0  0  0  0
>>     4.7914    2.1678    0.6465 H   0  0  0  0  0  0  0  0  0  0  0  0
>>     4.8110   -2.1878    0.6502 H   0  0  0  0  0  0  0  0  0  0  0  0
>>     2.4019   -2.1992    0.0122 H   0  0  0  0  0  0  0  0  0  0  0  0
>>   1  2  2  0  0  0  0
>>   2  8  1  0  0  0  0
>>   2  3  1  0  0  0  0
>>   3  4  2  0  0  0  0
>>   3  9  1  0  0  0  0
>>   4  5  1  0  0  0  0
>>   4 10  1  0  0  0  0
>>   5  7  1  0  0  0  0
>>   5  6  2  0  0  0  0
>>   7  8  2  0  0  0  0
>>   7 11  1  0  0  0  0
>>   8 12  1  0  0  0  0
>> M  END
>> $$$$
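
For readers following the thread, the SMARTS route mentioned in the quoted reply can be sketched roughly as below. This is only an illustration, not the approach RDKit takes internally: the SMARTS string is hand-written for this quinone stub, and the names RELAXED_QUINONE_STUB and matches_stub are hypothetical. Each ring bond is written as "single or aromatic" (-,:) or "double or aromatic" (=,:), and the ring atoms as [#6] rather than C, so the query should not depend on whether the target ring ends up perceived as aromatic.

```python
from rdkit import Chem

# Hand-written for this thread (an assumption, not taken from RDKit):
# the Kekule pattern of the quinone stub, with every bond relaxed to
# "single or aromatic" (-,:) or "double or aromatic" (=,:).
# [#6] is used instead of C because C in SMARTS means aliphatic carbon only.
RELAXED_QUINONE_STUB = (
    "*=,:[#6]1-,:[#6]=,:[#6]-,:[#6](=,:*)-,:[#6]=,:[#6]-,:1"
)


def matches_stub(smiles, smarts=RELAXED_QUINONE_STUB):
    """Return True if the molecule given as SMILES contains the SMARTS query."""
    mol = Chem.MolFromSmiles(smiles)
    query = Chem.MolFromSmarts(smarts)
    if mol is None or query is None:
        raise ValueError("could not parse the SMILES or the SMARTS")
    return mol.HasSubstructMatch(query)


if __name__ == "__main__":
    # Quinone from the thread, then hydroquinone (exocyclic single bonds).
    for smi in ("O=C1C=CC(=O)C=C1", "Oc1ccc(O)cc1"):
        print(smi, matches_stub(smi))
```

The trade-off is the one raised in the original question: the query has to be written (or generated) as SMARTS rather than loaded directly from a Molfile, so this is a workaround rather than a fix for the Molfile loading path.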
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

