Re: [Rdkit-discuss] Python code to merge tuples from a SMARTS match

2017-11-08 Thread James T. Metz via Rdkit-discuss
Regards, Jim Metz
-Original Message-
From: Brian Cole <col...@gmail.com>
To: James T. Metz <jamestm...@aol.com>
Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Sent: Tue, Nov 7, 2017 7:23 pm
Subject: Re: [Rdkit-discuss] Python code to merge tup

Re: [Rdkit-discuss] Python code to merge tuples from a SMARTS match

2017-11-08 Thread James T. Metz via Rdkit-discuss
...@aol.com>
Cc: RDKit Discuss <rdkit-discuss@lists.sourceforge.net>
Sent: Tue, Nov 7, 2017 7:05 pm
Subject: Re: [Rdkit-discuss] Python code to merge tuples from a SMARTS match
I think you probably used a slightly different SMILES than the one you showed. The one you showed should have give

Re: [Rdkit-discuss] Python code to merge tuples from a SMARTS match

2017-11-07 Thread David Cosgrove
Hi Jim, Would it not be easier to use a recursive SMARTS, so that you only count the carbon atoms? Something like [$([C,c]Cl)]-,=,:[$([C,c]Cl)], or, more compactly [$([#6]Cl)]~[$([#6]Cl)]. I haven't tested these, as I'm not close to a suitably equipped computer, but you should be able to get the

Re: [Rdkit-discuss] Python code to merge tuples from a SMARTS match

2017-11-07 Thread Greg Landrum
Jim, I'm a bit confused by what you're trying to do. Maybe we can try simplifying. What would you like to have returned for each of these SMILES:
1) ClC=CCl
2) ClC(Cl)=CCl
3) ClC(Cl)=C(Cl)Cl
If the answer is the same between 1) and 2), but different for 3), then the next question will be:

Re: [Rdkit-discuss] Python code to merge tuples from a SMARTS match

2017-11-07 Thread Brian Cole
You can use Chem.CanonicalRankAtoms to de-duplicate the SMARTS matches based upon the atom symmetry like this:

    def count_unique_substructures(smiles, smarts):
        mol = Chem.MolFromSmiles(smiles)
        ranks = list(Chem.CanonicalRankAtoms(mol, breakTies=False))
        pattern =
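The snippet above is cut off, so how it finishes is not shown here. As a rough sketch of the general idea only (an assumption, not necessarily what the original code does), matches can be de-duplicated by replacing each atom index with its symmetry rank and counting the distinct results. The helper name count_unique_matches is hypothetical, and the ranks below are hand-written stand-ins for what Chem.CanonicalRankAtoms(mol, breakTies=False) might return, so the example runs without RDKit.

```python
def count_unique_matches(matches, ranks):
    """Count matches that differ after mapping atom indices to symmetry ranks.

    matches: iterable of atom-index tuples (e.g. from GetSubstructMatches).
    ranks:   one rank per atom; symmetry-equivalent atoms share a rank.
    """
    unique = set()
    for match in matches:
        # Sort the ranks so the direction in which the pattern was
        # matched does not matter.
        unique.add(tuple(sorted(ranks[idx] for idx in match)))
    return len(unique)


# Hypothetical ranks for ClC(Cl)=CCl (atoms 0..4): the two chlorines on
# the same carbon (atoms 0 and 2) share a rank. Real values would come
# from RDKit and may differ.
example_ranks = [0, 4, 0, 3, 2]
example_matches = ((0, 1, 3, 4), (2, 1, 3, 4))

print(count_unique_matches(example_matches, example_ranks))  # prints 1
```

Because the two chlorines on atom 1 are treated as equivalent, both matches collapse to the same rank tuple and are counted once.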

Re: [Rdkit-discuss] Python code to merge tuples from a SMARTS match

2017-11-07 Thread Peter S. Shenkin
I think you probably used a slightly different SMILES than the one you showed. The one you showed should have given ((0,1,3,4),(2,1,3,4)). The proper merge rule would then be to consider all matches equivalent if the 2nd and 3rd atoms in the match agree, in any order; i.e., the two carbons, indices
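The merge rule described here can be sketched in plain Python: key each 4-atom match on the unordered pair formed by its 2nd and 3rd atoms and group matches that share a key. The function name merge_matches is hypothetical and is used only for illustration.

```python
def merge_matches(matches):
    """Group 4-atom matches by their central atom pair, ignoring order."""
    groups = {}
    for match in matches:
        # Positions 1 and 2 are the two carbons in Cl-C-C-Cl; a frozenset
        # makes (1, 3) and (3, 1) the same key.
        key = frozenset(match[1:3])
        groups.setdefault(key, []).append(match)
    return groups


matches = ((0, 1, 3, 4), (2, 1, 3, 4))
groups = merge_matches(matches)
print(len(groups))  # prints 1: both matches share the carbon pair {1, 3}
```

Under this rule the two matches quoted above merge into a single vicinal group, since they differ only in which chlorine on atom 1 was matched.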

[Rdkit-discuss] Python code to merge tuples from a SMARTS match

2017-11-07 Thread James T. Metz via Rdkit-discuss
RDKit Discussion Group, I have written a SMARTS to detect vicinal chlorine groups using RDKit. There are 4 atoms involved in a vicinal chlorine group. SMARTS = '[Cl]-[C,c]-,=,:[C,c]-[Cl]' I am trying to count the number of ("unique") occurrences of this pattern. For some