Hello, I have a baffling case: I am trying to match a common substructure in two ligands so that I can align them.
I have two ligands: one is 6-chloroindole (6CI) and the other is para-chlorotoluene (PCT). I am attempting to match both of them with the following SMARTS (S1):

'[C,c]1(Cl)[C,c][C,c]*([N,n,H])*[C,c]([C,c,H])[C,c]([H])[C,c]1'

For some reason, S1 only finds a match in 6CI. When I use the following SMARTS (S2), I only match PCT, as expected:

'[C,c]1(Cl)[C,c][C,c]*([H])*[C,c]([C,c,H])[C,c]([H])[C,c]1'

How can S1 fail to match PCT? As I understand it, S1 should match a strict superset of what S2 matches, because the only difference is that S1 uses the "or" list [N,n,H] where S2 has [H]. Do I misunderstand how explicit hydrogens work in RDKit/SMARTS?

Lastly, when I use a third SMARTS (S3), I am able to match both ligands, but I cannot use that SMARTS because of other requirements in my project:

'[C,c]1(Cl)[C,c][C,c][C,c]([C,c,H])[C,c]([H])[C,c]1'

Thanks!
Adam
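P.S. For anyone who wants to reproduce this, here is a minimal sketch of the comparison. The three SMARTS strings are copied from above. Everything else is an assumption made for illustration: the SMILES are stand-ins for the two ligands rather than my actual input files, and match_table and its add_hs flag are hypothetical helper names. The add_hs flag is only there so the patterns can be tried both with and without explicit hydrogen atoms on the molecules; the sketch reports matches and does not try to explain them.

```python
from rdkit import Chem

# SMARTS patterns exactly as given in the post.
SMARTS = {
    "S1": "[C,c]1(Cl)[C,c][C,c]*([N,n,H])*[C,c]([C,c,H])[C,c]([H])[C,c]1",
    "S2": "[C,c]1(Cl)[C,c][C,c]*([H])*[C,c]([C,c,H])[C,c]([H])[C,c]1",
    "S3": "[C,c]1(Cl)[C,c][C,c][C,c]([C,c,H])[C,c]([H])[C,c]1",
}

# Assumed SMILES for the two ligands (not taken from the original files).
LIGANDS = {
    "6CI": "Clc1ccc2cc[nH]c2c1",  # 6-chloroindole
    "PCT": "Cc1ccc(Cl)cc1",       # para-chlorotoluene
}


def match_table(smarts, ligands, add_hs=True):
    """Return {(smarts_name, ligand_name): matches} for every combination.

    Each value is a tuple of atom-index tuples, empty when nothing matched.
    """
    table = {}
    for lig_name, smiles in ligands.items():
        mol = Chem.MolFromSmiles(smiles)
        if add_hs:
            # Turn implicit hydrogens into real atoms in the molecular graph.
            mol = Chem.AddHs(mol)
        for s_name, pattern in smarts.items():
            query = Chem.MolFromSmarts(pattern)
            table[(s_name, lig_name)] = mol.GetSubstructMatches(query)
    return table


if __name__ == "__main__":
    for add_hs in (True, False):
        print("explicit hydrogens added:", add_hs)
        results = match_table(SMARTS, LIGANDS, add_hs=add_hs)
        for (s_name, lig_name), matches in sorted(results.items()):
            print("  ", s_name, lig_name, len(matches), "match(es)")
```

Running it with both settings of add_hs should show whether the outcome depends on the hydrogens being explicit atoms in the molecule.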