Your course of action depends upon just what you are really trying to do.
If it's only aspirin, then why wouldn't you just do it manually? If it goes
beyond aspirin, you have to start by defining in general terms exactly what
you want to match to what.
For example, given a query molecule (aspirin in this case), if you want all
its non-aromatic atoms to match aromatic as well as non-aromatic atoms in
the database, you could write a string-alteration routine to munge the
SMILES of a query molecule into a SMARTS that would do just that, and then
use that SMARTS to match your database molecules. Repeat for each query
molecule.
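
As a rough illustration of that string-alteration idea, here is a minimal sketch in plain Python. Everything in it is an assumption rather than an RDKit feature: the function name smiles_to_generic_smarts is made up, only unbracketed organic-subset atoms (B, C, N, O, F, P, S, Cl, Br, I) are rewritten, and bracket atoms, aromatic lowercase atoms and bond symbols are copied through untouched. It leans on two SMARTS conventions: [#6] matches carbon whether aromatic or aliphatic, and an unwritten bond matches single or aromatic.

```python
# Atomic numbers for unbracketed aliphatic atoms of the SMILES organic subset.
ORGANIC_SUBSET = {
    "B": 5, "C": 6, "N": 7, "O": 8, "F": 9,
    "P": 15, "S": 16, "Cl": 17, "Br": 35, "I": 53,
}


def smiles_to_generic_smarts(smiles):
    """Rewrite aliphatic organic-subset atoms as [#n] so they also match aromatic atoms."""
    out = []
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        two = smiles[i:i + 2]
        if ch == "[":
            # Bracket atoms are copied verbatim (assumption: no special handling).
            end = smiles.index("]", i)
            out.append(smiles[i:end + 1])
            i = end + 1
        elif two in ("Cl", "Br"):
            # Two-letter halogens must be checked before single-letter C and B.
            out.append("[#%d]" % ORGANIC_SUBSET[two])
            i += 2
        elif ch in ORGANIC_SUBSET:
            out.append("[#%d]" % ORGANIC_SUBSET[ch])
            i += 1
        else:
            # Aromatic atoms, bonds, branches and ring-closure digits pass through.
            out.append(ch)
            i += 1
    return "".join(out)


print(smiles_to_generic_smarts("CC(=O)Oc1ccccc1C(=O)O"))
```

For aspirin this prints [#6][#6](=[#8])[#8]c1ccccc1[#6](=[#8])[#8], which could then be handed to Chem.MolFromSmarts and used with HasSubstructMatch. Note that it is broader than a hand-tuned query, since the acetyl atoms are generalized as well, and explicit double bonds still only match double bonds.
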
But you have to start with a precise definition of just what kind of
matching you wish to do. For instance, maybe you don't really want
non-aromatic ring atoms in your query to match aromatic rings and vice
versa (e.g., a cyclohexyl matching a phenyl); maybe you only want non-ring
atoms in the query to match aliphatic as well as aromatic substructures.
And so on.
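
To make the ring versus non-ring distinction concrete, here is a hedged sketch in plain Python that generalizes only the chain atoms of a query SMILES and leaves every ring atom as written, so a cyclohexyl stays aliphatic-only. The names (generalize_chain_atoms, ring_atoms) are hypothetical, and the tokenizer is deliberately naive: ring membership is worked out from the SMILES string itself by checking which bonds are not bridges, bracket atoms are never rewritten, and anything exotic should go through a real toolkit instead.

```python
import re

# Atomic numbers for unbracketed aliphatic atoms of the SMILES organic subset.
ATOMIC_NUM = {"B": 5, "C": 6, "N": 7, "O": 8, "F": 9,
              "P": 15, "S": 16, "Cl": 17, "Br": 35, "I": 53}
AROMATIC = {"b", "c", "n", "o", "p", "s"}

# One token per match: bracket atom, %nn ring closure, Cl/Br, or a single character.
TOKEN = re.compile(r"\[[^\]]*\]|%\d\d|Cl|Br|.")


def is_atom(tok):
    return tok[0] == "[" or tok in ATOMIC_NUM or tok in AROMATIC or tok == "*"


def ring_atoms(tokens):
    """Return token positions of atoms that sit on at least one ring."""
    edges, open_rings, branch_stack, prev = [], {}, [], None
    for pos, tok in enumerate(tokens):
        if is_atom(tok):
            if prev is not None:
                edges.append((prev, pos))
            prev = pos
        elif tok == "(":
            branch_stack.append(prev)
        elif tok == ")":
            prev = branch_stack.pop()
        elif tok == ".":
            prev = None
        elif tok.isdigit() or tok[0] == "%":
            label = int(tok.lstrip("%"))
            if label in open_rings:
                edges.append((open_rings.pop(label), prev))
            else:
                open_rings[label] = prev

    def still_connected(a, b, skipped):
        # Depth-first search from a to b that ignores the edge at index `skipped`.
        seen, todo = {a}, [a]
        while todo:
            u = todo.pop()
            for k, (x, y) in enumerate(edges):
                if k != skipped and u in (x, y):
                    v = y if u == x else x
                    if v not in seen:
                        seen.add(v)
                        todo.append(v)
        return b in seen

    in_ring = set()
    for k, (a, b) in enumerate(edges):
        if still_connected(a, b, k):  # not a bridge, so the bond is in a ring
            in_ring.update((a, b))
    return in_ring


def generalize_chain_atoms(smiles):
    """Rewrite non-ring aliphatic atoms as [#n]; copy everything else as-is."""
    tokens = TOKEN.findall(smiles)
    rings = ring_atoms(tokens)
    return "".join(
        "[#%d]" % ATOMIC_NUM[tok] if tok in ATOMIC_NUM and pos not in rings else tok
        for pos, tok in enumerate(tokens)
    )


print(generalize_chain_atoms("OC(=O)C1CCCCC1"))
```

For cyclohexanecarboxylic acid this gives [#8][#6](=[#8])C1CCCCC1: the carboxyl atoms become generic while the ring carbons stay as aliphatic C, so read as SMARTS the ring would not match a phenyl. Whether that is the behaviour you want is exactly the definition you have to settle first.
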
-P.
On Wed, Sep 13, 2017 at 10:42 AM, Michał Nowotka <mmm...@gmail.com> wrote:
> Is there any flag in RDKit to match both 'normal' aspirin and embedded
> aromatic analogues?
> The problem is that I can't modify user queries by hand in real time :)
>
> On Wed, Sep 13, 2017 at 2:12 PM, Chris Earnshaw <cgearns...@gmail.com>
> wrote:
> > Hi
> >
> > The problem is due to RDKit perceiving the embedded pyranone in
> > CHEMBL1999443 as an aromatic system, which is probably correct. However,
> > in the structure of aspirin the carboxyl carbon and singly bonded oxygen
> > are non-aromatic, so if you just use the SMILES of aspirin as a query it
> > won't match CHEMBL1999443.
> >
> > You'll need to use a slightly more generic aspirin-like query to allow
> > the possibility of matching both 'normal' aspirin and embedded aromatic
> > analogues. CC(=O)Oc1ccccc1[#6](=O)[#8] should work OK.
> >
> > Regards,
> > Chris
> >
> > On 13 September 2017 at 13:40, Michał Nowotka <mmm...@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> This problem is probably due to my lack of chemistry knowledge, but
> >> please have a look:
> >>
> >> If I do a substructure search in ChEMBL using aspirin (CHEMBL25) as a
> >> query (the ChEMBL API uses the Symyx cartridge):
> >>
> >> from chembl_webresource_client.new_client import new_client
> >> res = new_client.substructure.filter(chembl_id='CHEMBL25')
> >>
> >> One of them will be CHEMBL1999443:
> >>
> >> 'CHEMBL1999443' in (r['molecule_chembl_id'] for r in res)
> >> >>> True
> >>
> >> Now I take the molfile:
> >>
> >> new_client.molecule.set_format('mol')
> >> mol = new_client.molecule.get('CHEMBL1999443')
> >>
> >> and load it with aspirin into rdkit:
> >>
> >> from rdkit import Chem
> >> m = Chem.MolFromMolBlock(mol)
> >> pattern = Chem.MolFromMolBlock(new_client.molecule.get('CHEMBL25'))
> >>
> >> If I check whether it has aspirin as a substructure using RDKit, I
> >> get False:
> >>
> >> m.HasSubstructMatch(pattern)
> >> >>> False
> >>
> >> Looking at this blog post:
> >>
> >> https://github.com/rdkit/rdkit-tutorials/blob/master/notebooks/002_SMARTS_SubstructureMatching.ipynb
> >> I tried to initialize rings and retry:
> >>
> >> Chem.GetSymmSSSR(m)
> >> m.HasSubstructMatch(pattern)
> >> >>>False
> >>
> >> Chem.GetSymmSSSR(pattern)
> >> m.HasSubstructMatch(pattern)
> >> >>>False
> >>
> >> But as you can see, without any luck. Is there anything else I can do
> >> to get the match anyway?
> >> Without a match I can't align and highlight the aspirin substructure
> >> in the CHEMBL1999443 image using the GenerateDepictionMatching2DStructure
> >> and DrawMolecule functions.
> >>
> >> Kind regards,
> >>
> >> Michał Nowotka
> >>
> >>
> >> ------------------------------------------------------------------------------
> >> Check out the vibrant tech community on one of the world's most
> >> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> >> _______________________________________________
> >> Rdkit-discuss mailing list
> >> Rdkit-discuss@lists.sourceforge.net
> >> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
> >
> >
>