Hi there, is it normal to get, for the same set of SMILES, more generic Murcko frameworks than normal frameworks? When running this:
# Imports assumed for the aliases used below
from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold as ms

# Generate the framework for a SMILES, returning None on errors
def framecheck(s):
    try:
        return Chem.MolToSmiles(ms.GetScaffoldForMol(Chem.MolFromSmiles(s)))
    except Exception:
        return None

# Generate the generic framework for a SMILES, returning None on errors
def gframecheck(s):
    try:
        return Chem.MolToSmiles(ms.MakeScaffoldGeneric(Chem.MolFromSmiles(s)))
    except Exception:
        return None

# Count unique frameworks
fraq = [framecheck(s) for s in smis]
fraq = list(set(fraq))
len(fraq)

# Count unique generic frameworks
gfraq = [gframecheck(s) for s in smis]
gfraq = list(set(gfraq))
len(gfraq)

For the attached set of SMILES I get 1431 frameworks and 2207 generic frameworks. The generic frameworks are supposed to have all atom types set to C and all bonds set to single, so I would expect fewer generic frameworks. Is some sort of canonicalization necessary?

Thanks in advance,
Gonzalo
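One thing worth checking, sketched here under the assumption that `ms` is `rdkit.Chem.Scaffolds.MurckoScaffold`: in the code above `MakeScaffoldGeneric` receives the whole molecule rather than its Murcko scaffold, and as far as I can tell it only rewrites atoms and bonds without stripping side chains. If that is what is happening, molecules that share a framework but differ in substituents would still give different generic SMILES. The toy SMILES below are hypothetical and not taken from examp.smi.

```python
from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold as ms  # assumed meaning of `ms`

# Hypothetical toy input: two molecules sharing one ring system
smis = ["Cc1ccccc1", "CCc1ccccc1"]  # toluene, ethylbenzene
mols = [Chem.MolFromSmiles(s) for s in smis]

# Murcko frameworks: side chains are removed first
frameworks = {Chem.MolToSmiles(ms.GetScaffoldForMol(m)) for m in mols}

# Generic form of the whole molecule: side chains are still present
generic_whole = {Chem.MolToSmiles(ms.MakeScaffoldGeneric(m)) for m in mols}

# Generic form of the Murcko scaffold: strip side chains, then generalize
generic_scaffold = {
    Chem.MolToSmiles(ms.MakeScaffoldGeneric(ms.GetScaffoldForMol(m)))
    for m in mols
}

print(len(frameworks), len(generic_whole), len(generic_scaffold))
```

If the counts behave the same way on the real data, wrapping `GetScaffoldForMol` inside `MakeScaffoldGeneric` in `gframecheck` may be enough, without any extra canonicalization step.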
Attachment: examp.smi