On Jul 23, 2021, at 06:42, Andrew Dalke <da...@dalkescientific.com> wrote:
>
> No, there's no way to do that.
>
> The best I can suggest is to go back to the original Python implementation
> and change the code leading up to

Alternatively, since your template is small, you can brute-force enumerate all possible matching SMARTS patterns, and test them from largest to smallest.

I believe the following patterns are correct for your template. These are ordered by number of bonds, then number of atoms, then ASCII-betically, all in descending order. (Note: these may contain duplicates because Chem.MolToSmarts doesn't produce canonical SMARTS.)

[n,c,o]1(-S(-*)(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:1
[n,c,o](-S(-*)(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]):[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]:[n,c,o]:[n,c,o]):[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]:[n,c,o]):[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o]1(-S(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:1
[n,c,o](-S(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]):[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]:[n,c,o]:[n,c,o]):[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]:[n,c,o]):[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]:[n,c,o]:[n,c,o]):[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]:[n,c,o]):[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]):[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]:[n,c,o]:[n,c,o]):[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]:[n,c,o]):[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]):[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]:[n,c,o]):[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]):[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O):[n,c,o]:[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]:[n,c,o]):[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]):[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O):[n,c,o]:[n,c,o]
[n,c,o](-S(-*)(=O)=O)(:[n,c,o]):[n,c,o]
[n,c,o](-S(=O)=O):[n,c,o]:[n,c,o]
[n,c,o](-S(=O)=O)(:[n,c,o]):[n,c,o]
[n,c,o](-S(-*)(=O)=O):[n,c,o]
[n,c,o]-S(-*)(=O)=O
[n,c,o](-S(=O)=O):[n,c,o]
[n,c,o]-S(=O)=O
S(-*)(=O)=O
S(=O)=O

I generated it with the following:

===
from rdkit import Chem
import itertools

# The atoms which must always be kept are marked with an atom map
# (the atom map value is ignored).
template = '[n,c,o]1(-[S:1](-*)(=[O:1])=[O:1]):[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:[n,c,o]:1'

mol = Chem.MolFromSmarts(template)

# Figure out which bonds may be deleted. A bond between two
# mapped atoms is always kept, so skip it.
bond_atom_indices = []
for bond in mol.GetBonds():
    if all(atom.HasProp("molAtomMapNumber")
           for atom in (bond.GetBeginAtom(), bond.GetEndAtom())):
        continue
    bond_atom_indices.append((bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()))

# Remove the atom maps
for atom in mol.GetAtoms():
    if atom.HasProp("molAtomMapNumber"):
        atom.ClearProp("molAtomMapNumber")

seen = set()
# Enumerate all possible bonds to delete (should be 2**n)
for r in range(0, len(bond_atom_indices)+1):
    for delete_indices in itertools.combinations(bond_atom_indices, r):
        tmp_mol = Chem.RWMol(mol)
        # Remove the selected bonds
        for atom1_idx, atom2_idx in delete_indices:
            tmp_mol.RemoveBond(atom1_idx, atom2_idx)

        # Remove any singletons. Start from the end so the indices are stable.
        for atom in list(tmp_mol.GetAtoms())[::-1]:
            if not atom.GetBonds():
                tmp_mol.RemoveAtom(atom.GetIdx())

        # Get the corresponding SMARTS
        tmp_smarts = Chem.MolToSmarts(tmp_mol)
        # Ensure it's singly connected
        if "." in tmp_smarts:
            continue

        # Ensure it's unique; track the number of bonds and atoms for later sorting
        key = (tmp_mol.GetNumBonds(), tmp_mol.GetNumAtoms(), tmp_smarts)
        seen.add(key)

# Largest first: most bonds, then most atoms, then reverse ASCII order
for num_bonds, num_atoms, smarts in sorted(seen, reverse=True):
    print(smarts)
===

				Andrew
				da...@dalkescientific.com

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
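
For the "test them from largest to smallest" step, something like the following might work. It is only a rough sketch, assuming the patterns are already in the order printed by the script above. The helper names (largest_matching_pattern, rdkit_matcher) are made up for illustration and are not part of RDKit; the RDKit calls used (Chem.MolFromSmiles, Chem.MolFromSmarts, HasSubstructMatch) are the standard ones.

```python
def largest_matching_pattern(patterns, is_match):
    """Return the first pattern accepted by is_match, or None.

    `patterns` is assumed to be ordered from largest to smallest,
    so the first hit is also the largest matching pattern.
    """
    for smarts in patterns:
        if is_match(smarts):
            return smarts
    return None


def rdkit_matcher(smiles):
    """Build an is_match callable for one target molecule (needs RDKit).

    Hypothetical helper: the import is done lazily so that
    largest_matching_pattern() stays usable without RDKit.
    """
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    def is_match(smarts):
        query = Chem.MolFromSmarts(smarts)
        return mol.HasSubstructMatch(query)

    return is_match
```

With RDKit installed, the call would look like largest_matching_pattern(patterns, rdkit_matcher(your_smiles)), where patterns is the list above in the same order. Because the list may contain duplicate, non-canonical SMARTS, a few patterns may be tested more than once; that costs time but does not change which pattern is found first.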