Re: [Rdkit-discuss] single SMARTS for two patterns with Boolean OR
Hmm, that makes it distinctly trickier. Recursive SMARTS work at the level of single atoms and their environment, rather than treating a group of atoms together, so I suspect it isn't possible to create a single SMARTS to give the match information you need in one go. I'd be very pleased to find out I'm wrong!

If you could use a small script it's pretty trivial of course -

    from rdkit import Chem

    m = Chem.MolFromSmiles("CCNN")
    p1 = Chem.MolFromSmarts("C-C")
    p2 = Chem.MolFromSmarts("N-N")
    m1 = m.GetSubstructMatches(p1)
    m2 = m.GetSubstructMatches(p2)
    m1 + m2  # concatenate the two match tuples

which gives ((0, 1), (2, 3)) as required, but if you have a specific need for the 'single SMARTS' approach that's not much use.

Sorry not to be more helpful...

Chris Earnshaw

In reply to: James T. Metz <jamestm...@aol.com>, 19 September 2017 at 16:50

Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
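The point that recursive SMARTS work at the level of single atoms can be checked by counting query atoms. The following is a minimal sketch, assuming RDKit is installed; the variable names are illustrative and do not come from the thread. Each tuple returned by GetSubstructMatches holds one atom index per query atom, so a pattern with a single query atom can only ever report single-atom matches.

```python
from rdkit import Chem

# A recursive SMARTS describes ONE query atom; the $(...) parts are
# conditions on that atom's environment, not extra atoms to report.
recursive = Chem.MolFromSmarts("[$(C-C),$(N-N)]")

# A plain two-atom pattern has two query atoms.
pair = Chem.MolFromSmarts("C-C")

n_recursive = recursive.GetNumAtoms()
n_pair = pair.GetNumAtoms()
print(n_recursive, n_pair)
```

Comparing the two counts shows why the recursive form cannot return atom pairs, whatever is placed inside the `$(...)` expressions.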
Re: [Rdkit-discuss] single SMARTS for two patterns with Boolean OR
Chris,

Thank you for your interesting suggestion, but it is not quite what I need.

For example, consider the molecule

    m = Chem.MolFromSmiles("CCNN")

I am looking for one SMARTS that, using the SMARTS pattern matching capability in RDKit, would return 2 groups, each group containing the two atoms corresponding to CC and NN.

Your suggested recursive SMARTS and code below

    pattern = Chem.MolFromSmarts('[$(C-C),$(N-N)]')
    match = m.GetSubstructMatches(pattern)
    match

returns

    ((0,), (1,), (2,), (3,))

The output I am trying to achieve, instead, is

    ((0, 1), (2, 3))

Is there a single SMARTS that will do that?

Regards,
Jim Metz

In reply to: Chris Earnshaw <cgearns...@gmail.com>, Tue, Sep 19, 2017 10:13 am
Re: [Rdkit-discuss] single SMARTS for two patterns with Boolean OR
Hi

Will the recursive SMARTS [$(C-C),$(N-N)] not do the job?

I'd parse this in English as 'an atom which is EITHER an aliphatic carbon singly bonded to an aliphatic carbon OR an aliphatic nitrogen singly bonded to an aliphatic nitrogen'.

Regards,
Chris Earnshaw

In reply to: James T. Metz via Rdkit-discuss <rdkit-discuss@lists.sourceforge.net>, 19 September 2017 at 15:01
Re: [Rdkit-discuss] single SMARTS for two patterns with Boolean OR
Dante,

Yes. In principle, if one can figure out all of the possible undesired cross matches.

Since my goal is to do this in RDKit and generate groups of atoms that match, perhaps one approach is to simply use multiple RDKit pattern matching statements (with multiple SMARTS), generate the groups of atoms, then combine the lists, removing identical groups.

Hmmm... Is there a more straightforward (elegant) solution?

Regards,
Jim Metz

In reply to: Dante <dante.esgrimi...@gmail.com>, Tue, Sep 19, 2017 8:45 am
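The "combine the lists, removing identical groups" step could look roughly like the sketch below. It is plain Python and works on match tuples of the shape GetSubstructMatches returns; the function name `merge_match_groups` is hypothetical, and treating two groups as identical when they contain the same atom indices in any order is an assumption, not something the thread specifies.

```python
def merge_match_groups(*match_lists):
    """Concatenate match tuples from several patterns, dropping repeated groups.

    Two groups count as identical when they contain the same atom indices,
    regardless of order, e.g. (0, 1) and (1, 0).
    """
    seen = set()
    merged = []
    for matches in match_lists:
        for group in matches:
            key = frozenset(group)
            if key not in seen:
                seen.add(key)
                merged.append(tuple(group))
    return tuple(merged)


# Match tuples in the shape GetSubstructMatches returns for "CCNN"
cc_matches = ((0, 1),)
nn_matches = ((2, 3),)
combined = merge_match_groups(cc_matches, nn_matches, ((1, 0),))
print(combined)  # ((0, 1), (2, 3))
```

If the order of atoms within a group matters (for example, to know which atom matched which query atom), the tuple itself could be used as the key instead of a frozenset.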
Re: [Rdkit-discuss] single SMARTS for two patterns with Boolean OR
Hi Jim,

Could you use the 'NOT' logical operator (!) in combination with recursive SMARTS to eliminate the cross-matches?

Cheers,

Dante

On Tue, Sep 19, 2017 at 9:13 AM, James T. Metz via Rdkit-discuss <rdkit-discuss@lists.sourceforge.net> wrote:
> Hello,
>
> Is it possible to write a single SMARTS for two separate patterns involving a Boolean OR?
>
> For example, I want to write a single SMARTS that can match the patterns of
>
>     [C]-[C]
>
> or
>
>     [N]-[N]
>
> I realize that I could write something like
>
>     [C,N]-[C,N]
>
> but that would also match "cross" patterns such as CN and NC which I don't want.
>
> I have tried to write
>
>     ([C]-[C]), ([N]-[N])
>
> but I have not been able to get that syntax or related expressions (variations of parentheses, brackets, etc) to work.
>
> Hence, if someone knows how to combine separate SMARTS expressions into a single expression with a Boolean OR, I would be grateful. Thank you.
>
> Regards,
> Jim Metz
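One way to remove the cross-matches is to filter them out in Python after matching, rather than inside the SMARTS. The sketch below is not the NOT-operator idea and not a single-SMARTS solution; it assumes RDKit is installed, uses the permissive [C,N]-[C,N] pattern from the original question, and keeps only pairs whose atoms are the same element. The variable names are illustrative.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("CCNN")
# Permissive pattern from the original question; it also hits the C-N bond.
pattern = Chem.MolFromSmarts("[C,N]-[C,N]")

same_element_pairs = tuple(
    match
    for match in mol.GetSubstructMatches(pattern)
    # Keep a pair only when both matched atoms are the same element.
    if len({mol.GetAtomWithIdx(i).GetAtomicNum() for i in match}) == 1
)
print(same_element_pairs)
```

For "CCNN" this leaves the C-C pair and the N-N pair and discards the C-N pair. The filter generalises to longer patterns only if "all atoms the same element" is really the intended rule, which is an assumption here.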