Hi all, I was playing around with the RGroup decomposition code and must say that I am pretty impressed by it. The fact that one can work directly with an MDL R-group file and that the output is a pandas DataFrame makes analysis really slick. Well done!
However, one thing confuses me: when I have R-groups defined in my core and enforce matching only at R-groups, molecules that have a hydrogen atom at an R-group position seem to be rejected in the "Add" step. I would expect those to be included as long as the molecules don't have additional heavy atoms in positions that are not defined as R-groups in the core.

______________ snip ____________________
from rdkit import Chem
from rdkit.Chem.rdRGroupDecomposition import RGroupDecomposition, RGroupDecompositionParameters

smis = ['Cc1ccnc(O)c1', 'Cc1cc(Cl)ccn1', 'Nc1ccccn1', 'Nc1ccc(Br)cn1', 'c1ccncc1']
mols = [Chem.MolFromSmiles(smi) for smi in smis]

params = RGroupDecompositionParameters()
params.onlyMatchAtRGroups = True

# just atom number the rgroups
core1 = Chem.MolFromSmiles('n1ccc([*:2])cc([*:1])1')
rg1 = RGroupDecomposition(core1, params)

failMols = []
for m in mols:
    res = rg1.Add(m)
    if res < 0:
        failMols.append(m)
rg1.Process()
print("FailedMols: %s"%" ".join([Chem.MolToSmiles(m) for m in failMols]))
____________ end snip ________________

The output shows that molecules 3-5 are not included at the "Add" step:

>> FailedMols: Nc1ccccn1 Nc1ccc(Br)cn1 c1ccncc1

For molecule 4 (the 5-bromo substituted aminopyridine) I agree. However, I don't understand how I can make sure molecules 3 and 5 are also included. Is there a magic parameter that I can set?

Cheers
Nik
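P.S. To make the expected behaviour concrete, here is a rough sketch of the check I have in mind. It does not use the RGroupDecomposition API at all, only plain substructure matching, so it is not meant as a description of how the decomposition code works internally. The helper names (scaffold_from_core, substituents_only_at_rgroups) and the "rgroup_site" property are made up for this illustration. The idea: strip the dummy atoms from the core, match the bare scaffold, and accept a molecule if at least one mapping puts all heavy-atom substituents on labelled positions. A hydrogen at a labelled position is then simply "no substituent".

```python
from rdkit import Chem


def scaffold_from_core(core):
    """Strip dummy atoms from a labelled core (hypothetical helper).

    Returns the bare scaffold and the set of scaffold atom indices that
    carried an R-group label in the original core.
    """
    rw = Chem.RWMol(core)
    dummies = [a.GetIdx() for a in rw.GetAtoms() if a.GetAtomicNum() == 0]
    # Tag attachment atoms with a property so the tag survives re-indexing.
    for idx in dummies:
        for nbr in rw.GetAtomWithIdx(idx).GetNeighbors():
            nbr.SetBoolProp("rgroup_site", True)
    # Remove the highest indices first so the lower ones stay valid.
    for idx in sorted(dummies, reverse=True):
        rw.RemoveAtom(idx)
    scaffold = rw.GetMol()
    sites = {a.GetIdx() for a in scaffold.GetAtoms() if a.HasProp("rgroup_site")}
    return scaffold, sites


def substituents_only_at_rgroups(mol, core):
    """True if mol contains the scaffold and has heavy-atom substituents
    only at the labelled positions (hypothetical helper)."""
    scaffold, sites = scaffold_from_core(core)
    # uniquify=False keeps symmetry-equivalent mappings; the labels break
    # the symmetry of the scaffold, so every mapping has to be tried.
    for match in mol.GetSubstructMatches(scaffold, uniquify=False):
        matched = set(match)
        clean = True
        for q_idx, m_idx in enumerate(match):
            if q_idx in sites:
                continue  # substituents are allowed here
            atom = mol.GetAtomWithIdx(m_idx)
            # A heavy neighbour outside the scaffold means a substituent
            # at a position that has no R-group label.
            if any(n.GetIdx() not in matched and n.GetAtomicNum() > 1
                   for n in atom.GetNeighbors()):
                clean = False
                break
        if clean:
            return True
    return False


core = Chem.MolFromSmiles('n1ccc([*:2])cc([*:1])1')
smis = ['Cc1ccnc(O)c1', 'Cc1cc(Cl)ccn1', 'Nc1ccccn1', 'Nc1ccc(Br)cn1', 'c1ccncc1']
results = [substituents_only_at_rgroups(Chem.MolFromSmiles(s), core) for s in smis]
for smi, ok in zip(smis, results):
    print(smi, "accept" if ok else "reject")
```

This is the accept/reject pattern I would have expected from Add() with onlyMatchAtRGroups = True. A filter like this could also be used to pre-sort molecules before decomposition, but that is a workaround rather than an answer to the question.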
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss