Hi Jim,
As a slight aside, this sort of thing demonstrates the value of what
Daylight used to call vector bindings
(http://www.daylight.com/dayhtml/doc/prog/prog.smarts.html#9.3), which
one might these days call a macro.  For example, in the Daylight toolkit
you could bind the label HAL to [F,Cl,I,Br] and then write
c1([$HAL])c([$HAL])c([$HAL])c([$HAL])cc1 for a benzene with 4 halogen
substituents.  Not only is it clearer, but there's less typing.  Using such
a system for your query could go something like AR =
$(a);!$(n1(C)ccc(=O)nc1=O), followed by [$AR]1[$AR][$AR][$AR][$AR][$AR]1.
They could be nested, too, so that in the first example you could have
CHAL=$(c[$HAL]) and [$CHAL]1[$CHAL].... It's relatively simple to write a
general function that just does an iterative string substitution of all
labels into the corresponding SMARTS pattern to reproduce the spirit of the
Daylight vector bindings.  Daylight also used them for efficiency at the
search stage, but taking advantage of that would require changes to the
SMARTS parsing and searching code.
At the hackathon I started putting together exactly this sort of function
as part of a tautomer enumerator but had to leave to catch my plane before
I finished.  If I manage to finish it in the next few days I'll post it
here.
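
For illustration only, here is a minimal sketch of the iterative string
substitution idea in plain Python.  It is not the hackathon code and not
Daylight's implementation.  The function name expand_bindings, the
max_passes limit and the regular expression for label names are all
hypothetical choices.  It also assumes each binding is stored as text that
is valid inside square brackets, so HAL is written as F,Cl,I,Br rather than
[F,Cl,I,Br]; the expanded string would then be handed to the SMARTS parser
as usual.

```python
import re

# A label is "$" followed by a name. Recursive SMARTS such as "$(a)" is left
# alone because "(" cannot start a name. (Naming rule is an assumption.)
LABEL = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def expand_bindings(smarts, bindings, max_passes=20):
    """Replace $LABEL tokens with their bound text until none remain.

    Hypothetical helper: an unknown label raises KeyError, and definitions
    that still contain labels after max_passes raise ValueError (this is
    how circular definitions are caught).
    """
    for _ in range(max_passes):
        smarts = LABEL.sub(lambda m: bindings[m.group(1)], smarts)
        if not LABEL.search(smarts):
            return smarts
    raise ValueError("labels remain after %d passes; circular definition?"
                     % max_passes)


# Bindings taken from the examples in this message. HAL is stored without
# its outer brackets (assumption, see above).
bindings = {
    "HAL": "F,Cl,I,Br",
    "CHAL": "$(c[$HAL])",
    "AR": "$(a);!$(n1(C)ccc(=O)nc1=O)",
}

print(expand_bindings("c1([$HAL])c([$HAL])c([$HAL])c([$HAL])cc1", bindings))
print(expand_bindings("[$CHAL]1[$CHAL]", bindings))
print(expand_bindings("[$AR]1[$AR][$AR][$AR][$AR][$AR]1", bindings))
```

Nested labels such as CHAL are resolved on a later pass, which is why the
substitution is repeated rather than done once.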
Cheers,
Dave

On Sun, Sep 24, 2017 at 3:01 PM, James T. Metz via Rdkit-discuss <
rdkit-discuss@lists.sourceforge.net> wrote:

> Chris,
>
> Wow! Your recursive SMARTS expression works as needed!
>
> Hmmm... Help me understand this better ... it looks like you "walk around"
> the ring of the substructure we want to exclude and employ a slightly
> different recursive SMARTS beginning at that atom.  Is that correct?
>
> Also, since my situation is likely to get more complicated with respect to
> exclusions, suppose I still wanted to utilize the general aromatic
> expression for a 6-membered ring, i.e. [a]1:[a]:[a]:[a]:[a]:[a]1, and I
> wanted to exclude the structures we have been discussing, and I also
> wanted to exclude pyridine, i.e. [n]1:[c]:[c]:[c]:[c]:[c]1.
>
> Is there a SMARTS expression that would capture 2 exclusions?
>
> Perhaps this is getting too clumsy!  It might be better to have one or more
> inclusion SMARTS and one or more exclusion SMARTS, and write code
> to remove those groups of atoms that are coming from the exclusion SMARTS.
>
> Any ideas for Python/RDKit code?  Something like
>
> test_smiles = 'c1ccccc1'
> inclusion_pattern = '[a]1:[a]:[a]:[a]:[a]:[a]1'
> exclusion_pattern = '[n]1:[c]:[c]:[c]:[c]:[c]1'
> etc...
>
> Hmmm... any other ideas, suggestions, comments?
>
> Thanks again.
>
> Regards,
> Jim Metz
>
>
>
>
> -----Original Message-----
> From: Chris Earnshaw <cgearns...@gmail.com>
> To: James T. Metz <jamestm...@aol.com>
> Cc: Rdkit-discuss@lists.sourceforge.net <rdkit-discuss@lists.sourceforge.net>
> Sent: Sun, Sep 24, 2017 4:01 am
> Subject: Re: [Rdkit-discuss] need SMARTS query with a specific exclusion
>
> Hi Jim
>
> It can be done with recursive SMARTS, though the syntax is a bit
> painful. This may do what you want (the pattern is a single line):
> [$(a);!$(n1(C)ccc(=O)nc1=O);!$(c1cc(=O)nc(=O)n1C);!$(c1c(=O)nc(=O)n(C)c1);!$(c(=O)1nc(=O)n(C)cc1);!$(n1c(=O)n(C)ccc1=O);!$(c(=O)1n(C)ccc(=O)n1)]:1:a:a:a:a:a:1
>
> It's basically the general 6-ring aromatic pattern a:1:a:a:a:a:a:1,
> with recursive SMARTS applied to the first atom to ensure that this
> can't match any of the 6 ring atoms in your undesired system.
>
> Regards,
> Chris Earnshaw
>
> On 24 September 2017 at 05:04, James T. Metz via Rdkit-discuss
> <rdkit-discuss@lists.sourceforge.net> wrote:
> > Hello,
> >
> > Suppose I have the following molecule
> >
> > m = 'CN1C=CC(=O)NC1=O'
> >
> > I would like to be able to use a SMARTS pattern
> >
> > pattern = '[a]1:[a]:[a]:[a]:[a]:[a]1'
> >
> > to recognize the 6 atoms in a typical aromatic ring, but
> > I do not want to recognize the 6 atoms in the molecule,
> > m, as aromatic. In other words, I am trying to write
> > a specific exclusion.
> >
> > Is it possible to modify the SMARTS pattern to
> > exclude the above molecule? I have tried using
> > recursive SMARTS, but I can't get the syntax to
> > work.
> >
> > Any ideas? Thank you.
> >
> > Regards,
> > Jim Metz
> >
> >
> >
> > ------------------------------------------------------------
> ------------------
> > Check out the vibrant tech community on one of the world's most
> > engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> > _______________________________________________
> > Rdkit-discuss mailing list
> > Rdkit-discuss@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
> >


-- 
David Cosgrove
Freelance computational chemistry and chemoinformatics developer
http://cozchemix.co.uk