Chris,
Thanks again. I thought of this idea also as I was eating my lunch!
One could create a "hyper SMARTS" using one or more vertical pipes:
[C]-[C]|[N]-[N]
Then use Python to check for the existence of a vertical pipe in the string.
If one exists, split the hyper SMARTS string on the vertical pipe character.
Then process each of the separated SMARTS strings in a loop, generate the list
of atom groups for each iteration of the loop (each SMARTS), and keep adding the
groups of atoms (as you did in your example code below).
I think the vertical pipe is probably a good choice for a separator, since it
is not used in normal SMARTS and hence would not be confused with some
regular SMARTS expression.
This is clearly not as elegant as I was hoping for, but should work just fine.
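The split-and-loop idea above could be sketched roughly as follows. This is only an illustration under assumptions: the helper names split_hyper_smarts and match_hyper_smarts are made up for this example and are not part of RDKit; only Chem.MolFromSmarts and GetSubstructMatches are real RDKit calls.

```python
def split_hyper_smarts(hyper_smarts, sep="|"):
    """Split a pipe-separated "hyper SMARTS" into its component SMARTS strings."""
    # A string without the separator comes back as a one-element list,
    # so no separate "does it contain a pipe?" check is needed.
    return [part.strip() for part in hyper_smarts.split(sep) if part.strip()]


def match_hyper_smarts(mol, hyper_smarts, sep="|"):
    """Return the atom-index groups matched by any of the component SMARTS."""
    # Imported here so the splitting helper above works without RDKit installed.
    from rdkit import Chem

    groups = []
    for smarts in split_hyper_smarts(hyper_smarts, sep):
        pattern = Chem.MolFromSmarts(smarts)
        if pattern is None:
            raise ValueError(f"could not parse SMARTS: {smarts!r}")
        # GetSubstructMatches returns a tuple of atom-index tuples.
        groups.extend(mol.GetSubstructMatches(pattern))
    return groups
```

Calling match_hyper_smarts(Chem.MolFromSmiles("CCNN"), "[C]-[C]|[N]-[N]") should give the same two groups as the script quoted below, returned as a list rather than a tuple.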
Thanks again.
Regards,
Jim Metz
-----Original Message-----
From: Chris Earnshaw <cgearns...@gmail.com>
To: James T. Metz <jamestm...@aol.com>
Cc: Rdkit-discuss@lists.sourceforge.net <rdkit-discuss@lists.sourceforge.net>
Sent: Tue, Sep 19, 2017 11:45 am
Subject: Re: [Rdkit-discuss] single SMARTS for two patterns with Boolean OR
Hmm, that makes it distinctly trickier. Recursive SMARTS work at the level of
single atoms and their environment, rather than treating a group of atoms
together, so I suspect it isn't possible to create a single SMARTS to give the
match information you need in one go. I'd be very pleased to find out I'm wrong!
If you could use a small script it's pretty trivial of course -
from rdkit import Chem
m = Chem.MolFromSmiles("CCNN")
p1 = Chem.MolFromSmarts("C-C")
p2 = Chem.MolFromSmarts("N-N")
m1 = m.GetSubstructMatches(p1)
m2 = m.GetSubstructMatches(p2)
m1 + m2  # concatenate the two tuples of matches
which gives ((0, 1), (2, 3)) as required, but if you have a specific need for
the 'single SMARTS' approach that's not much use. Sorry not to be more
helpful...
Chris Earnshaw
On 19 September 2017 at 16:50, James T. Metz <jamestm...@aol.com> wrote:
Chris,
Thank you for your interesting suggestion, but it is not quite what I need.
For example, consider the molecule
m = Chem.MolFromSmiles("CCNN")
I am looking for one SMARTS that, using the SMARTS pattern matching
capability in RDKit, would return 2 groups, each group containing the two
atoms corresponding to CC and NN.
Your suggested recursive SMARTS and code below
pattern = Chem.MolFromSmarts('[$(C-C),$(N-N)]')
match = m.GetSubstructMatches(pattern)
match
returns
((0,), (1,), (2,), (3,))
The output I am trying to achieve, instead, is
((0,1), (2,3))
Is there a single SMARTS that will do that?
Regards,
Jim Metz
-----Original Message-----
From: Chris Earnshaw <cgearns...@gmail.com>
To: James T. Metz <jamestm...@aol.com>
Cc: Rdkit-discuss@lists.sourceforge.net <rdkit-discuss@lists.sourceforge.net>
Sent: Tue, Sep 19, 2017 10:13 am
Subject: Re: [Rdkit-discuss] single SMARTS for two patterns with Boolean OR
Hi
Will the recursive SMARTS [$(C-C),$(N-N)] not do the job?
I'd parse this in English as 'an atom which is EITHER an aliphatic carbon
singly bonded to an aliphatic carbon OR an aliphatic nitrogen singly bonded to
an aliphatic nitrogen'.
Regards,
Chris Earnshaw
On 19 September 2017 at 15:01, James T. Metz via Rdkit-discuss
<rdkit-discuss@lists.sourceforge.net> wrote:
Dante,
Yes. In principle, if one can figure out all of the possible undesired
cross matches.
Since my goal is to do this in RDKit and generate groups of atoms
that match, perhaps one approach is to simply use multiple RDKit pattern
matching statements (with multiple SMARTS), generate the groups of atoms,
then combine the lists, removing identical groups.
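The "combine the lists, removing identical groups" step might look something like the sketch below. The function name merge_unique_matches is hypothetical, and it assumes that two groups count as identical when they contain the same atom indices regardless of order; the inputs are the tuples of atom-index tuples that GetSubstructMatches returns.

```python
def merge_unique_matches(*match_lists):
    """Concatenate match groups, dropping repeats of the same atom set."""
    seen = set()
    merged = []
    for matches in match_lists:
        for group in matches:
            # Assumption: atom order within a group does not matter,
            # so (0, 1) and (1, 0) are treated as the same group.
            key = frozenset(group)
            if key not in seen:
                seen.add(key)
                merged.append(tuple(group))
    return merged
```

If atom order within a match does matter for the application, keying on tuple(group) instead of frozenset(group) would keep both orderings.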
Hmmm... Is there a more straightforward (elegant) solution?
Regards,
Jim Metz
-----Original Message-----
From: Dante <dante.esgrimi...@gmail.com>
To: James T. Metz <jamestm...@aol.com>
Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Sent: Tue, Sep 19, 2017 8:45 am
Subject: Re: [Rdkit-discuss] single SMARTS for two patterns with Boolean OR
Hi Jim,
Could you use the 'NOT' logical operator (!) in combination with recursive
SMARTS to eliminate the cross-matches?
Cheers,
Dante
On Tue, Sep 19, 2017 at 9:13 AM, James T. Metz via Rdkit-discuss
<rdkit-discuss@lists.sourceforge.net> wrote:
Hello,
Is it possible to write a single SMARTS for two separate patterns involving
a Boolean OR?
For example, I want to write a single SMARTS that can match the patterns of
[C]-[C]
or
[N]-[N]
I realize that I could write something like
[C,N]-[C,N]
but that would also match "cross" patterns such as
CN and NC which I don't want.
I have tried to write
([C]-[C]), ([N]-[N]) but I have not been able to get that syntax or related
expressions (variations of parentheses, brackets, etc.) to work.
Hence, if someone knows how to combine separate SMARTS expressions into
a single expression with a Boolean OR, I would be grateful. Thank you.
Regards,
Jim Metz
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss