Hi, I’m using the new R group decomposition code in version 2021.03.1 and I’m receiving some strange cores when I have R groups within rings. Here is my example code:
from rdkit import Chem
from rdkit.Chem import rdRGroupDecomposition as rdRGD

core = "O=C(Cc1cccc(Cl)c1)Nc1cncc2c[*:1]:[*:2]cc12"
compounds = ["c1cc(cc(c1)Cl)CC(=O)Nc2cncc3c2ccnc3",
             "c1cc(c(cc1CC(=O)Nc2cncc3c2ccnc3)Cl)Cl",
             "c1cc(cc(c1)Cl)CC(=O)Nc2cncc3c2cncc3"]

core_smarts = Chem.MolFromSmarts(core)
compound_mols = [Chem.MolFromSmiles(x) for x in compounds]

R_opts = rdRGD.RGroupDecompositionParameters()
R_opts.allowNonTerminalRGroups = True
R_opts.removeAllHydrogenRGroupsAndLabels = False

Rgroups, remainder = rdRGD.RGroupDecompose(core_smarts, compound_mols,
                                           asSmiles=True, asRows=False,
                                           options=R_opts)
print(Rgroups)

Which produces this output:

> {'Core': ['O=C(Cc1ccc([*:3])c(Cl)c1)Nc1cncc2cn([*:1])c([*:2])cc12',
>           'O=C(Cc1ccc([*:3])c(Cl)c1)Nc1cncc2cn([*:1])c([*:2])cc12',
>           'O=C(Cc1ccc([*:3])c(Cl)c1)Nc1cncc2cc([*:1])n([*:2])cc12'],
>  'R3': ['[H][*:3]', 'Cl[*:3]', '[H][*:3]']}

When I roll back to version 2020.03.3, I receive a more intuitive output:

> {'Core': ['O=C(Cc1ccc([*:3])c(Cl)c1)Nc1cncc2c[*:1]:[*:2]cc12',
>           'O=C(Cc1ccc([*:3])c(Cl)c1)Nc1cncc2c[*:1]:[*:2]cc12',
>           'O=C(Cc1ccc([*:3])c(Cl)c1)Nc1cncc2c[*:1]:[*:2]cc12'],
>  'R1': ['c(n:[*:1]):[*:2]', 'c(n:[*:1]):[*:2]', 'c(n:[*:2]):[*:1]'],
>  'R2': ['c(n:[*:1]):[*:2]', 'c(n:[*:1]):[*:2]', 'c(n:[*:2]):[*:1]'],
>  'R3': ['[H][*:3]', 'Cl[*:3]', '[H][*:3]']}

Does anyone know if there's a setting I can use to receive the old-style output, or could this be a bug?

Thanks,
Lauren

Dr Lauren Reid
Computational Chemist / Developer
lauren.r...@medchemica.com
www.medchemica.com
Medchemica Ltd is a company registered in England and Wales with company number 8162245.
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss