I think you probably used a slightly different SMILES than the one you showed. The one you showed should have given ((0,1,3,4),(2,1,3,4)).

The proper merge rule would then be to consider all matches equivalent if the 2nd and 3rd atoms in the match agree, in any order; i.e., the two carbons, indices 1 and 3 in this case. So to do this, for each molecule, do something like this:

    d = {}
    for match in matches:
        # Key on the two central atoms, smaller index first, so order does not matter.
        if match[1] < match[2]:
            t = (match[1], match[2])
        else:
            t = (match[2], match[1])
        d[t] = match

You will wind up with as many dictionary elements as there are distinct matches under this rule.

-P.

On Tue, Nov 7, 2017 at 7:38 PM, James T. Metz via Rdkit-discuss <rdkit-discuss@lists.sourceforge.net> wrote:

> RDkit Discussion Group,
>
> I have written a SMARTS to detect vicinal chlorine groups
> using RDkit. There are 4 atoms involved in a vicinal chlorine group.
>
> SMARTS = '[Cl]-[C,c]-,=,:[C,c]-[Cl]'
>
> I am trying to count the number of ("unique") occurrences of this
> pattern.
>
> For some molecules with symmetry, this results in
> over-counting.
>
> For the molecule, smiles1 below, I want to obtain
> a count of 1, i.e., 1 tuple of 4 atoms.
>
> smiles1 = 'ClC(Cl)CCl'
>
> However, using the SMARTS above, I obtain 2 tuples of 4 atoms.
> Beginning with a MOL file representation of smiles1, I get
>
> ((1,2,4,3), (0,2,4,3))
>
> One possible solution is to somehow merge the two tuples according
> to a "rule." One rule that works is "if 3 of the atom indices are the
> same, then combine into one tuple."
>
> However, the rule needs a bit of modification for more complicated
> cases (higher symmetry).
>
> Consider
>
> smiles2 = 'ClC(Cl)CCl(Cl)(Cl)'
>
> My goal is to get 2 tuples of 4 atoms for smiles2
>
> smiles2 is somewhat tricky because there are either
> 2 groups of 3 (4 atom) tuples, or 3 groups of 2 (4 atom)
> tuples depending on how you choose your 3 atom indices.
>
> Again, if my goal is to get 2 tuples, then I need to somehow
> pick the largest group, i.e., 2 groups of 3 tuples to do the merge
> operation which will give me 2 remaining groups (desired).
>
> I have already checked stackoverflow and a few other places
> for PYTHON code to do the necessary merging, but I could not
> find anything specific and appropriate.
>
> I would be most grateful if anyone has ideas how to do this. I
> suspect the answer is a few lines of well-written PYTHON code,
> and not modifying the SMARTS (I could be mistaken!).
>
> Thank you.
>
> Regards,
> Jim Metz
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
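
For anyone who wants to try the merge rule from the reply, here is a minimal, self-contained sketch in plain Python. The function name `merge_by_central_pair` is made up for illustration, and the hard-coded `matches` tuple is the result the reply expects for smiles1; in practice it would presumably come from RDKit's `mol.GetSubstructMatches(patt)`. Like the snippet in the reply, it keeps the last match seen for each carbon pair.

```python
def merge_by_central_pair(matches):
    """Collapse matches that share the same two central atoms.

    Each match is a tuple of atom indices; positions 1 and 2 are the
    two carbons of the Cl-C-C-Cl pattern. The key is the sorted pair,
    so (1, 3) and (3, 1) are treated as the same carbon pair.
    """
    merged = {}
    for match in matches:
        key = tuple(sorted((match[1], match[2])))
        merged[key] = match  # later matches overwrite earlier ones
    return merged


# Matches the reply expects for smiles1 = 'ClC(Cl)CCl' (hard-coded here).
matches = ((0, 1, 3, 4), (2, 1, 3, 4))
merged = merge_by_central_pair(matches)

print(len(merged))  # number of vicinal pairs after merging
print(merged)
```

The count of interest is `len(merged)`. Whether this rule gives the count you want for more symmetric molecules such as smiles2 depends on how you define a "unique" vicinal group, so treat it as a starting point.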
