Dear RDKit users,

I would like to do a very simple substructure search. Chapter 3.5, "Substructure Searching", of the RDKit documentation (2019.09.1) is rather short and doesn't point to a solution. So far, I've learned that you can create a search pattern via Chem.MolFromSmiles() or Chem.MolFromSmarts().
In the copy-and-paste minimal example below, I want to use the first SMILES in the list as the search pattern. I expect 2 matches, but I get either 1 or 0, so I must be doing something wrong. What am I missing? Is it about implicit/explicit aromatic and aliphatic bonds, or about explicit/implicit hydrogens? How can I find the first structure in both SMILES?

Thank you for any hints,
Theo

### simple substructure search (but doesn't find what is expected)
from rdkit import Chem

smiles_strings = '''
C12=CC=CN1NCCC2
C12=CC=CC(C=C3)=C1N3NCC2
'''
smiles_list = smiles_strings.splitlines()[1:]
print(smiles_list)

pattern = Chem.MolFromSmiles(smiles_list[0])  # MolFromSmiles
matches = [x for x in smiles_list if Chem.MolFromSmiles(x).HasSubstructMatch(pattern)]
print(len(matches))  # result: 1, why not 2?

pattern = Chem.MolFromSmarts(smiles_list[0])  # MolFromSmarts
matches = [x for x in smiles_list if Chem.MolFromSmiles(x).HasSubstructMatch(pattern)]
print(len(matches))  # result: 0, why not 2?
### end of code
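A possible way to probe the aromaticity question raised above is sketched here. It first prints the canonical SMILES that RDKit writes for each input, which shows which atoms were perceived as aromatic (lowercase symbols) after sanitization. It then tries a hand-written generic SMARTS that constrains only the elements ([#6], [#7]) and allows any bond type (~). That generic pattern and the variable names are assumptions made for illustration; they are not part of the original post, and this is only one of several ways to relax a query. In the second structure, the carbon that is an aliphatic CH2 in the first structure appears to sit in the aromatic six-membered ring, which would explain why a pattern that insists on aliphatic atoms and single bonds misses it.

```python
from rdkit import Chem

smiles_list = ["C12=CC=CN1NCCC2", "C12=CC=CC(C=C3)=C1N3NCC2"]
mols = [Chem.MolFromSmiles(s) for s in smiles_list]

# Show how RDKit perceives each structure: lowercase atoms in the
# canonical SMILES were assigned as aromatic during sanitization.
for smi, mol in zip(smiles_list, mols):
    print(smi, "->", Chem.MolToSmiles(mol))

# Hypothetical generic pattern (an assumption, not from the post):
# same ring topology as the first SMILES, but atoms are matched by
# element only and every bond, including ring closures, is "any" (~).
generic_smarts = "[#6]12~[#6]~[#6]~[#6]~[#7]~1~[#7]~[#6]~[#6]~[#6]~2"
generic = Chem.MolFromSmarts(generic_smarts)

generic_matches = [
    smi for smi, mol in zip(smiles_list, mols)
    if mol.HasSubstructMatch(generic)
]
print(len(generic_matches))
```

The trade-off is that a pattern this loose ignores bond orders and aromaticity entirely, so on a larger data set it may return structures that share only the ring skeleton and heteroatom positions.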