[Rdkit-discuss] RGroup matching in RGroup decomposition code

Stiefl, Nikolaus Tue, 11 Dec 2018 03:20:06 -0800

Hi all,

I was playing around with the RGroup decomposition code and must say that I am 
pretty impressed by it. The fact that one can work directly with an MDL R-group 
file and that the output is a pandas DataFrame makes analysis really slick - 
well done!


However, one thing puzzles me: when I have R-groups defined in my core and 
enforce matching only at R-groups (onlyMatchAtRGroups = True), molecules that 
have a hydrogen atom at an R-group position seem to be rejected in the "Add" 
step. I would expect those to be included as long as the molecules don't have 
additional heavy atoms at positions that are not defined as R-groups in the 
core.

______________ snip ____________________

from rdkit import Chem
from rdkit.Chem.rdRGroupDecomposition import (
    RGroupDecomposition,
    RGroupDecompositionParameters,
)


smis = ['Cc1ccnc(O)c1', 'Cc1cc(Cl)ccn1', 'Nc1ccccn1', 'Nc1ccc(Br)cn1', 
'c1ccncc1']
mols = [Chem.MolFromSmiles(smi) for smi in smis]
params = RGroupDecompositionParameters()

params.onlyMatchAtRGroups = True

# label the R-groups in the core with atom-map numbers only
core1 = Chem.MolFromSmiles('n1ccc([*:2])cc([*:1])1')
rg1 = RGroupDecomposition(core1, params)

# Add() returns a negative value when a molecule cannot be matched to the core
failMols = []
for m in mols:
  res = rg1.Add(m)
  if res < 0:
    failMols.append(m)

rg1.Process()

print("FailedMols: %s" % " ".join(Chem.MolToSmiles(m) for m in failMols))

____________ end snip ________________


The output shows that molecules 3-5 are rejected at the "Add" step:

>> FailedMols: Nc1ccccn1 Nc1ccc(Br)cn1 c1ccncc1

For molecule 4 (the 5-bromo-substituted aminopyridine) I agree; however, I 
don't understand how I can make sure molecules 3 and 5 are also included. Is 
there a magic parameter that I can set?

Cheers
Nik
