This isn't a really straightforward one.

One solution (and I think the best one) is to change the aromaticity model
used to match whatever is generating the hits (in your case it's the Symyx
cartridge).
The RDKit has functionality to do this already when you call the
SetAromaticity() function:

In [29]: m2 = Chem.MolFromMolFile('./CHEMBL25.mol',sanitize=False)

In [30]: Chem.SanitizeMol(m2,Chem.SANITIZE_ALL^Chem.SANITIZE_SETAROMATICITY)
Out[30]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

In [31]: Chem.SetAromaticity(m2,Chem.AROMATICITY_SIMPLE)


The problem here is that there isn't an aromaticity model there for
MDL/Symyx. This would be a useful thing to have and can be done quickly. If
someone can describe the aromaticity model to me, or point me to a
description of it, I can add it for the next release (which happens soon).

Another solution that I think would work is to read the query molecule in
without doing aromaticity perception (see input line 30 above) and then to
convert all the bonds to either single-or-aromatic or double-or-aromatic
queries using the approaches described here:
http://rdkit.blogspot.ch/2015/08/tuning-substructure-queries.html
and here:
http://rdkit.blogspot.ch/2016/07/tuning-substructure-queries-ii.html

Unfortunately the AdjustQueryParameters function doesn't have anything that
helps with the kind of bond queries you need, so you'd need to make the
bond changes in your code. If you want to go down this road and it's not
clear how to do so, let me know and I can post some sample code. I'm afraid
it's not completely trivial with bond queries

-greg



On Wed, Sep 13, 2017 at 4:42 PM, Michał Nowotka <mmm...@gmail.com> wrote:

> Is there any flag in RDkit to match both 'normal' aspirin and embedded
> aromatic analogues?
> The problem is that I can't modify user queries by hand in real time :)
>
> On Wed, Sep 13, 2017 at 2:12 PM, Chris Earnshaw <cgearns...@gmail.com>
> wrote:
> > Hi
> >
> > The problem is due to RDkit perceiving the embedded pyranone in
> > CHEMBL1999443 as an aromatic system, which is probably correct. However,
> in
> > the structure of aspirin the carboxyl carbon and singly bonded oxygen are
> > non-aromatic, so if you just use the SMILES of aspirin as a query it
> won't
> > match CHEMBL1999443
> >
> > You'll need to use a slightly more generic aspirin-like query to allow
> the
> > possibility of matching both 'normal' aspirin and embedded aromatic
> > analogues. CC(=O)Oc1ccccc1[#6](=O)[#8] should work OK.
> >
> > Regards,
> > Chris
> >
> > On 13 September 2017 at 13:40, Michał Nowotka <mmm...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> This problem is probably due to my lack of chemistry knowledge but
> >> plese have a look:
> >>
> >> If I do a substructure search in ChEMBL using aspirin (CHEMBL25) as a
> >> query (ChEMBL API uses the Symix catridge):
> >>
> >>     from chembl_webresource_client.new_client import new_client
> >>     res = new_client.substructure.filter(chembl_id='CHEMBL25')
> >>
> >> One of them will be CHEMBL1999443:
> >>
> >>     'CHEMBL1999443' in (r['molecule_chembl_id'] for r in res)
> >>     >>> True
> >>
> >> Now I take the molfile:
> >>
> >>     new_client.molecule.set_format('mol')
> >>     mol = new_client.molecule.get('CHEMBL1999443')
> >>
> >> and load it with aspirin into rdkit:
> >>
> >>     from rdkit import Chem
> >>     m = Chem.MolFromMolBlock(mol)
> >>     pattern = Chem.MolFromMolBlock(new_client.molecule.get('CHEMBL25'))
> >>
> >> If I check if it has an aspirin as a substructure using rdkit, I'm
> >> getting false...
> >>
> >>     m.HasSubstructMatch(pattern)
> >>     >>> False
> >>
> >> Looking at this blog post:
> >>
> >> https://github.com/rdkit/rdkit-tutorials/blob/master/
> notebooks/002_SMARTS_SubstructureMatching.ipynb
> >> I tried to initialize rings and retry:
> >>
> >>      Chem.GetSymmSSSR(m)
> >>      m.HasSubstructMatch(pattern)
> >>      >>>False
> >>
> >>     Chem.GetSymmSSSR(pattern)
> >>     m.HasSubstructMatch(pattern)
> >>     >>>False
> >>
> >> But as you can see without any luck. Is there anything else I can do
> >> to get the match anyway?
> >> Without having a match I can't aligh and higlight asprin substructure
> >> in CHEMBL1999443 image using GenerateDepictionMatching2DStructure and
> >> DrawMolecule functions.
> >>
> >> Kind regards,
> >>
> >> Michał Nowotka
> >>
> >>
> >> ------------------------------------------------------------
> ------------------
> >> Check out the vibrant tech community on one of the world's most
> >> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> >> _______________________________________________
> >> Rdkit-discuss mailing list
> >> Rdkit-discuss@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
> >
> >
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

Reply via email to