Hi Chris,
Sure, they're equivalent, but with my suggestion you don't have to write
all 6 different SMARTS patterns, which, whilst not difficult, is prone to
silly errors.  You can string together a long list of OR'd vector bindings
to put in all the exclusions you want on each atom as you think of
them.
Dave
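
A minimal sketch of the per-atom approach in plain Python, assuming plain
SMARTS rather than vector bindings: the same negated recursive clauses are
attached to every ring atom, so only one clause per excluded ring is written.
The function name ring_smarts_with_exclusions is hypothetical, and the
generated pattern has not been tested against RDKit here; it would normally be
passed to Chem.MolFromSmarts.

```python
def ring_smarts_with_exclusions(exclusions, ring_size=6):
    """Build an aromatic-ring SMARTS with the same exclusions on every atom.

    `exclusions` is a list of SMARTS strings, each written so that its FIRST
    atom is the one that must not be matched (recursive SMARTS only match
    their first atom). Hypothetical helper, for illustration only.
    """
    # Each exclusion becomes a negated recursive clause: ;!$(...)
    negated = "".join(f";!$({smarts})" for smarts in exclusions)
    atom = f"[a{negated}]"
    # First atom opens ring closure 1, last atom closes it.
    return f"{atom}1:" + ":".join([atom] * (ring_size - 1)) + "1"


# With no exclusions this is just the general aromatic 6-ring pattern.
plain = ring_smarts_with_exclusions([])

# Exclusions from this thread: the N-methylated ring and pyridine,
# each anchored on a nitrogen of the unwanted ring.
pattern = ring_smarts_with_exclusions(["n1(C)ccc(=O)nc1=O", "n1ccccc1"])
```

Because every atom carries the test, a ring is rejected as soon as any one of
its atoms hits an exclusion, which is why the exclusion does not need to be
rewritten starting from each ring atom.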


On Sun, Sep 24, 2017 at 5:15 PM, Chris Earnshaw <cgearns...@gmail.com>
wrote:

> Hi
>
> It amounts to the same thing - either do all tests on one atom, or one
> test on all atoms.
>
> The syntax is shorter for the latter if you can use the vector bindings
> but may not be otherwise, especially if multiple exclusions are needed.
>
> Regards,
> Chris Earnshaw
>
>
>
> On 24 Sep 2017 16:54, "David Cosgrove" <davidacosgrov...@gmail.com> wrote:
>
> Hi,
> I think Chris' solution is a bit overly complicated, though I haven't
> tested my alternative.  If each atom in the ring is tested for
> '[$(a);!$(n1(C)ccc(=O)nc1=O)]', as you'd get if you expanded out the
> vector bindings I provided previously, then I don't think you need to
> provide the SMARTS for the excluded ring starting from each atom.  So long
> as 1 of the atoms in the ring fails the test, the whole ring fails, so you
> just need the same test on each atom.
> Dave
>
>
> On Sun, Sep 24, 2017 at 4:45 PM, Chris Earnshaw <cgearns...@gmail.com>
> wrote:
>
>> Hi Jim
>>
>> The key thing to remember about the recursive SMARTS clauses is that
>> they only match one atom (the first), and the rest of the string
>> describes the environment in which that atom is located. So the clause
>> $(n1(C)ccc(=O)nc1=O) matches just the nitrogen atom - which has to be
>> embedded in the rest of the ring system. We then negate that with the
>> ! symbol.
>>
>> If we use just the recursive SMARTS expression '[$(a)]' (or the simple
>> SMARTS 'a'), it can match any of the six aromatic atoms in the
>> heterocycle. Adding the first exclusion '[$(a);!$(n1(C)ccc(=O)nc1=O)]'
>> means this atom can't match the nitrogen substituted by aliphatic
>> C, but it can still match any of the other five aromatic atoms.
>> Consequently there are five more exclusion clauses to add, each of
>> which starts with a different one of the aromatic atoms in your
>> undesired structure. As long as one of the atoms in the full SMARTS is
>> prevented from matching any of the atoms in the undesired structure in
>> this way, then the overall match is prevented.
>>
>> Adding an exclusion for pyridine is then easy. We're already excluding
>> six patterns, and (considering symmetry) we only need to add four more
>> to exclude all pyridines. Appending
>> ';!$(n1ccccc1);!$(c1ncccc1);!$(c1cnccc1);!$(c1ccncc1)' inside the
>> square brackets should do the trick.
>>
>> You're quite right though, this gets pretty cumbersome very quickly
>> and it may well be best to handle it in code with simple include /
>> exclude SMARTS patterns. You'll have to think about checking which
>> atoms have been matched - for example, do you want to match quinoline
>> because it contains a benzene ring, or exclude it because it contains
>> a pyridine? If the former you'll have to check that the atoms matched
>> by your two patterns are different.
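
A rough sketch of the include / exclude check described above, assuming the
"keep quinoline's benzene ring" behaviour: an included ring is dropped only
when an exclusion match covers exactly the same atoms. The helper names
drop_excluded and ring_matches are hypothetical. ring_matches assumes RDKit is
installed and uses only Chem.MolFromSmiles, Chem.MolFromSmarts and
GetSubstructMatches; the atom indices in the example are written by hand for
the SMILES c1ccc2ncccc2c1 rather than produced by RDKit.

```python
def drop_excluded(include_matches, exclude_matches):
    """Keep include matches whose atom set differs from every exclude match.

    Both arguments are iterables of atom-index tuples, the shape returned by
    RDKit's Mol.GetSubstructMatches. Atom order within a match is ignored.
    """
    excluded_sets = {frozenset(match) for match in exclude_matches}
    return [m for m in include_matches if frozenset(m) not in excluded_sets]


def ring_matches(smiles, include_smarts, exclude_smarts_list):
    """Hypothetical RDKit wrapper around drop_excluded (requires RDKit)."""
    from rdkit import Chem  # imported here so drop_excluded works without it

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    include = mol.GetSubstructMatches(Chem.MolFromSmarts(include_smarts))
    exclude = []
    for smarts in exclude_smarts_list:
        exclude.extend(mol.GetSubstructMatches(Chem.MolFromSmarts(smarts)))
    return drop_excluded(include, exclude)


# Hand-written indices for quinoline, c1ccc2ncccc2c1 (atoms 3 and 8 are shared).
benzene_ring = (0, 1, 2, 3, 8, 9)
pyridine_ring = (3, 4, 5, 6, 7, 8)
kept = drop_excluded([benzene_ring, pyridine_ring], [pyridine_ring])
```

To exclude quinoline altogether instead, the comparison would have to reject
any included ring that shares an atom with an exclusion match, rather than
only rings with an identical atom set.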
>>
>> Hope this helps!
>>
>> Chris Earnshaw
>>
>> On 24 September 2017 at 15:01, James T. Metz <jamestm...@aol.com> wrote:
>> > Chris,
>> >
>> > Wow! Your recursive SMARTS expression works as needed!
>> >
>> > Hmmm... Help me understand this better ... it looks like you "walk
>> > around" the ring of the substructure we want to exclude and employ a
>> > slightly different recursive SMARTS beginning at that atom.  Is that
>> > correct?
>> >
>> > Also, since my situation is likely to get more complicated with respect
>> > to exclusions, suppose I still wanted to utilize the general aromatic
>> > expression for a 6-membered ring, i.e. [a]1:[a]:[a]:[a]:[a]:[a]1, and I
>> > wanted to exclude the structures we have been discussing, and I also
>> > wanted to exclude pyridine, i.e. [n]1:[c]:[c]:[c]:[c]:[c]1.
>> >
>> > Is there a SMARTS expression that would capture 2 exclusions?
>> >
>> > Perhaps this is getting too clumsy!  It might be better to have one or
>> > more inclusion SMARTS and one or more exclusion SMARTS, and write code
>> > to remove those groups of atoms that are coming from the exclusion
>> > SMARTS.
>> >
>> > Any ideas for Python/RDKit code?  Something like
>> >
>> > test_smiles = 'c1ccccc1'
>> > inclusion_pattern = '[a]1:[a]:[a]:[a]:[a]:[a]1'
>> > exclusion_pattern = '[n]1:[c]:[c]:[c]:[c]:[c]1'
>> > etc...
>> >
>> > Hmmm... any other ideas, suggestions, comments?
>> >
>> > Thanks again.
>> >
>> > Regards,
>> > Jim Metz
>> >
>> >
>> >
>> >
>> > -----Original Message-----
>> > From: Chris Earnshaw <cgearns...@gmail.com>
>> > To: James T. Metz <jamestm...@aol.com>
>> > Cc: Rdkit-discuss@lists.sourceforge.net
>> > <rdkit-discuss@lists.sourceforge.net>
>> > Sent: Sun, Sep 24, 2017 4:01 am
>> > Subject: Re: [Rdkit-discuss] need SMARTS query with a specific exclusion
>> >
>> > Hi Jim
>> >
>> > It can be done with recursive SMARTS, though the syntax is a bit
>> > painful. This may do what you want -
>> > [$(a);!$(n1(C)ccc(=O)nc1=O);!$(c1cc(=O)nc(=O)n1C);!$(c1c(=O)nc(=O)n(C)c1);!$(c(=O)1nc(=O)n(C)cc1);!$(n1c(=O)n(C)ccc1=O);!$(c(=O)1n(C)ccc(=O)n1)]:1:a:a:a:a:a:1
>> >
>> > It's basically the general 6-ring aromatic pattern a:1:a:a:a:a:a:1,
>> > with recursive SMARTS applied to the first atom to ensure that this
>> > can't match any of the 6 ring atoms in your undesired system.
>> >
>> > Regards,
>> > Chris Earnshaw
>> >
>> > On 24 September 2017 at 05:04, James T. Metz via Rdkit-discuss
>> > <rdkit-discuss@lists.sourceforge.net> wrote:
>> >> Hello,
>> >>
>> >> Suppose I have the following molecule
>> >>
>> >> m = 'CN1C=CC(=O)NC1=O'
>> >>
>> >> I would like to be able to use a SMARTS pattern
>> >>
>> >> pattern = '[a]1:[a]:[a]:[a]:[a]:[a]1'
>> >>
>> >> to recognize the 6 atoms in a typical aromatic ring, but
>> >> I do not want to recognize the 6 atoms in the molecule,
>> >> m, as aromatic. In other words, I am trying to write
>> >> a specific exclusion.
>> >>
>> >> Is it possible to modify the SMARTS pattern to
>> >> exclude the above molecule? I have tried using
>> >> recursive SMARTS, but I can't get the syntax to
>> >> work.
>> >>
>> >> Any ideas? Thank you.
>> >>
>> >> Regards,
>> >> Jim Metz
>> >>
>> >>
>> >>
>> >>
>> >> ------------------------------------------------------------
>> ------------------
>> >> Check out the vibrant tech community on one of the world's most
>> >> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> >> _______________________________________________
>> >> Rdkit-discuss mailing list
>> >> Rdkit-discuss@lists.sourceforge.net
>> >> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>> >>
>>
>>
>
>
>
> --
> David Cosgrove
> Freelance computational chemistry and chemoinformatics developer
> http://cozchemix.co.uk
>
>
>


-- 
David Cosgrove
Freelance computational chemistry and chemoinformatics developer
http://cozchemix.co.uk
