On Aug 7, 2019, at 13:08, Paolo Tosco <paolo.tosco.m...@gmail.com> wrote: > You can use > > Chem.MolFragmentToSmiles(mol, match) > > where match is a tuple of atom indices returned by GetSubstructMatch().
Note however that if only the atom indices are given then Chem.MolFragmentToSmiles() will include all bonds which connect those atoms, even if the original SMARTS does not match those bonds. For example: >>> from rdkit import Chem >>> pat = Chem.MolFromSmarts("*~*~*~*") # match 4 linear atoms >>> mol = Chem.MolFromSmiles("C1CCC1") # ring of size 4 >>> atom_indices = mol.GetSubstructMatch(pat) >>> atom_indices (0, 1, 2, 3) >>> Chem.MolFragmentToSmiles(mol, atom_indices) # returns the ring 'C1CCC1' If this is important, then you need to pass the correct bond indices to MolFragmentToSmiles(). This can be done by using the bonds in the query graph to get the bond indices in the molecule graph. I believe the following is correct: def get_match_bond_indices(query, mol, match_atom_indices): bond_indices = [] for query_bond in query.GetBonds(): atom_index1 = match_atom_indices[query_bond.GetBeginAtomIdx()] atom_index2 = match_atom_indices[query_bond.GetEndAtomIdx()] bond_indices.append(mol.GetBondBetweenAtoms( atom_index1, atom_index2).GetIdx()) return bond_indices (Does a function like this already exist in RDKit?) I'll use it to get the bond indices for the *~*~*~* match: >>> bond_indices = get_match_bond_indices(pat, mol, atom_indices) >>> bond_indices [0, 1, 2] Passing the atom and bond indices gives the expected match SMILES: >>> Chem.MolFragmentToSmiles(mol, atom_indices, bond_indices) 'CCCC' Cheers, Andrew da...@dalkescientific.com _______________________________________________ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss