Hi Nik,

There is a way to achieve what you describe, even though it is slightly cumbersome:

from rdkit import Chem
from rdkit.Chem import rdmolops
from rdkit.Chem.Draw import MolsToGridImage, IPythonConsole
from rdkit.Chem.rdRGroupDecomposition import (
    RGroupDecomposition, RGroupDecompositionParameters)

smis = ['Cc1ccnc(O)c1', 'Cc1cc(Cl)ccn1', 'Nc1ccccn1',
        'Nc1ccc(Br)cn1', 'c1ccncc1']
mols = [Chem.MolFromSmiles(smi) for smi in smis]

MolsToGridImage(mols)

params = RGroupDecompositionParameters()
# rather than using the built-in flag we will manually
# adjust the query in two steps using AdjustQueryProperties()
params.onlyMatchAtRGroups = False
# label the R groups with atom-map numbers only
core1 = Chem.MolFromSmiles('n1ccc([*:2])cc([*:1])1')
# step 1: turn the dummy atoms into queries
core1_params = rdmolops.AdjustQueryParameters()
core1_params.makeDummiesQueries = True
core1_params.adjustDegree = False
core1 = rdmolops.AdjustQueryProperties(core1, core1_params)
# temporarily change the atoms connected to the dummies into dummies,
# remembering their atomic numbers so they can be restored afterwards
former_atomic_nums = {}
for b in core1.GetBonds():
    if b.GetBeginAtom().GetAtomicNum() == 0:
        a = b.GetEndAtom()
    elif b.GetEndAtom().GetAtomicNum() == 0:
        a = b.GetBeginAtom()
    else:
        continue
    former_atomic_nums[a.GetIdx()] = a.GetAtomicNum()
    a.SetAtomicNum(0)
# step 2: this has the same effect as setting onlyMatchAtRGroups to True,
# but we avoid applying it to the atoms connected to the R groups
core1_params.adjustHeavyDegreeFlags = Chem.ADJUST_IGNOREDUMMIES
core1_params.makeDummiesQueries = False
core1_params.adjustDegree = False
core1_params.adjustHeavyDegree = True
core1 = rdmolops.AdjustQueryProperties(core1, core1_params)
# restore the original atomic numbers
for i, an in former_atomic_nums.items():
    core1.GetAtomWithIdx(i).SetAtomicNum(an)
rg1 = RGroupDecomposition(core1, params)
# Add() returns a negative value for molecules that do not match the core
failMols = []
for m in mols:
    res = rg1.Add(m)
    if res < 0:
        failMols.append(m)
rg1.Process()

True

print("FailedMols: %s" % " ".join([Chem.MolToSmiles(m) for m in failMols]))

FailedMols: Nc1ccc(Br)cn1

core1

d = rg1.GetRGroupsAsColumns(asSmiles=False)

MolsToGridImage(d['Core'])

MolsToGridImage(d['R1'])

MolsToGridImage(d['R2'])


Hope that helps, cheers
p.

On 12/11/18 11:01, Stiefl, Nikolaus wrote:

Hi all,

I was playing around with the RGroup decomposition code and must say that I am pretty impressed by it. The fact that one can work directly with an MDL R-group file and that the output is a pandas DataFrame makes analysis really slick – well done!

However, one thing that irritates me: when I define R-groups in my core and enforce matching only at R-groups, molecules that have a hydrogen atom at one of those positions seem to be rejected in the “Add” step. I would expect them to be included as long as they don’t have additional heavy atoms at positions that are not defined as R-groups in the core.

______________ snip ____________________

from rdkit import Chem
from rdkit.Chem.rdRGroupDecomposition import RGroupDecomposition, RGroupDecompositionParameters

smis = ['Cc1ccnc(O)c1', 'Cc1cc(Cl)ccn1', 'Nc1ccccn1', 'Nc1ccc(Br)cn1', 'c1ccncc1']
mols = [Chem.MolFromSmiles(smi) for smi in smis]

params = RGroupDecompositionParameters()
params.onlyMatchAtRGroups = True

# just atom number the rgroups
core1 = Chem.MolFromSmiles('n1ccc([*:2])cc([*:1])1')
rg1 = RGroupDecomposition(core1, params)

failMols = []
for m in mols:
    res = rg1.Add(m)
    if res < 0:
        failMols.append(m)

rg1.Process()

print("FailedMols: %s"%" ".join([Chem.MolToSmiles(m) for m in failMols]))

____________ end snip ________________

The output shows that molecules 3-5 are rejected at the “Add” step:

>> FailedMols: Nc1ccccn1 Nc1ccc(Br)cn1 c1ccncc1

For molecule 4 (the 5-bromo-substituted aminopyridine) I agree; however, I don’t understand how I can make sure mols 3 and 5 are also included. Is there a magic parameter that I can set?

Cheers

Nik





_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
