Hi Jim,
Would it not be easier to use a recursive SMARTS, so that you only count
the carbon atoms? Something like [$([C,c]Cl)]-,=,:[$([C,c]Cl)], or, more
compactly [$([#6]Cl)]~[$([#6]Cl)].  I haven't tested these, as I'm not
close to a suitably equipped computer, but you should be able to get the
gist at least.  The Cl is only defining the sort of C you're after so you
won't have to deal with multiple Cl matches on the same atom.
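
If you would rather keep your original four-atom SMARTS, the same idea can be applied after matching: key each match on its two carbon atoms and ignore the chlorines. Below is a minimal sketch in plain Python. The function name unique_carbon_pairs is made up for illustration, and it assumes each match tuple follows the query order (Cl, C, C, Cl), which is how GetSubstructMatches reports atom indices.

```python
def unique_carbon_pairs(matches):
    """Collapse (Cl, C, C, Cl) match tuples onto their distinct C-C pairs.

    Assumes every tuple follows the query atom order Cl-C-C-Cl, so the
    carbons sit at positions 1 and 2.
    """
    pairs = set()
    for match in matches:
        # frozenset makes the pair order-independent: (2, 4) == (4, 2)
        pairs.add(frozenset(match[1:3]))
    return pairs


# The two matches reported for smiles1 in the original post.
# In RDKit these would come from mol.GetSubstructMatches(patt).
matches = ((1, 2, 4, 3), (0, 2, 4, 3))
print(len(unique_carbon_pairs(matches)))
```

Note that this counts each C-C pair once however many chlorines each carbon carries, so whether it gives the number you want for the more heavily substituted cases depends on the answer to Greg's question below.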
Dave


On Wed, Nov 8, 2017 at 7:08 AM, Greg Landrum <greg.land...@gmail.com> wrote:

> Jim,
>
> I'm a bit confused by what you're trying to do.
>
> Maybe we can try simplifying. What would you like to have returned for
> each of these SMILES:
> 1) ClC=CCl
> 2) ClC(Cl)=CCl
> 3) ClC(Cl)=C(Cl)Cl
>
> If the answer is the same between 1) and 2), but different for 3), then
> the next question will be: "why?"
>
> -greg
>
>
> On Wed, Nov 8, 2017 at 12:38 AM, James T. Metz via Rdkit-discuss <
> rdkit-discuss@lists.sourceforge.net> wrote:
>
>> RDkit Discussion Group,
>>
>>     I have written a SMARTS to detect vicinal chlorine groups
>> using RDkit.  There are 4 atoms involved in a vicinal chlorine group.
>>
>> SMARTS = '[Cl]-[C,c]-,=,:[C,c]-[Cl]'
>>
>>     I am trying to count the number of ("unique") occurrences of this
>> pattern.
>>
>>     For some molecules with symmetry, this results in
>> over-counting.
>>
>>     For the molecule, smiles1 below, I want to obtain
>> a count of 1 i.e., 1 tuple of 4 atoms.
>>
>>     smiles1 = 'ClC(Cl)CCl'
>>
>>     However, using the SMARTS above, I obtain 2 tuples of 4 atoms.
>> Beginning with a MOL file representation of smiles1, I get
>>
>>     ((1,2,4,3), (0,2,4,3))
>>
>>     One possible solution is to somehow merge the two tuples according
>> to a "rule."  One rule that works is "if 3 of the atom indices are the
>> same, then combine into one tuple."
>>
>>     However, the rule needs a bit of modification for more complicated
>> cases (higher symmetry).
>>
>>     Consider
>>
>>     smiles2 = 'ClC(Cl)C(Cl)(Cl)Cl'
>>
>>     My goal is to get 2 tuples of 4 atoms for smiles2
>>
>>     smiles2 is somewhat tricky because there are either
>> 2 groups of 3 (4 atom) tuples, or 3 groups of 2 (4 atom)
>> tuples depending on how you choose your 3 atom indices.
>>
>>     Again, if my goal is to get 2 tuples, then I need to somehow
>> pick the largest group, i.e., 2 groups of 3 tuples to do the merge
>> operation which will give me 2 remaining groups (desired).
>>
>>     I have already checked stackoverflow and a few other places
>> for PYTHON code to do the necessary merging, but I could not
>> find anything specific and appropriate.
>>
>>     I would be most grateful if anyone has ideas how to do this.  I
>> suspect the answer is a few lines of well-written PYTHON code,
>> and not modifying the SMARTS (I could be mistaken!).
>>
>>     Thank you.
>>
>>     Regards,
>>     Jim Metz
>>
>>
>>
>> ------------------------------------------------------------
>> ------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>


-- 
David Cosgrove
Freelance computational chemistry and chemoinformatics developer
http://cozchemix.co.uk