Hi Nik,
There is a way to achieve what you describe, even though it is slightly
cumbersome:
from rdkit import Chem
from rdkit.Chem import rdmolops
from rdkit.Chem.Draw import MolsToGridImage, IPythonConsole
from rdkit.Chem.rdRGroupDecomposition import (
    RGroupDecomposition, RGroupDecompositionParameters)
smis = ['Cc1ccnc(O)c1', 'Cc1cc(Cl)ccn1', 'Nc1ccccn1',
        'Nc1ccc(Br)cn1', 'c1ccncc1']
mols = [Chem.MolFromSmiles(smi) for smi in smis]
MolsToGridImage(mols)
params = RGroupDecompositionParameters()
# rather than using the built-in flag we will manually
# adjust the query in two steps using AdjustQueryProperties()
params.onlyMatchAtRGroups = False
# just atom number the rgroups
core1 = Chem.MolFromSmiles('n1ccc([*:2])cc([*:1])1')
# make dummies queries
core1_params = rdmolops.AdjustQueryParameters()
core1_params.makeDummiesQueries = True
core1_params.adjustDegree = False
core1 = rdmolops.AdjustQueryProperties(core1, core1_params)
# change the atoms connected to the dummies into dummies
former_atomic_nums = {}
for b in core1.GetBonds():
    if (b.GetBeginAtom().GetAtomicNum() == 0):
        a = b.GetEndAtom()
    elif (b.GetEndAtom().GetAtomicNum() == 0):
        a = b.GetBeginAtom()
    else:
        continue
    former_atomic_nums[a.GetIdx()] = a.GetAtomicNum()
    a.SetAtomicNum(0)
# this has the same effect as setting onlyMatchAtRGroups to True
# but we can avoid applying it to the atoms connected to the R groups
core1_params.adjustHeavyDegreeFlags = Chem.ADJUST_IGNOREDUMMIES
core1_params.makeDummiesQueries = False
core1_params.adjustDegree = False
core1_params.adjustHeavyDegree = True
core1 = rdmolops.AdjustQueryProperties(core1, core1_params)
# restore the original atomic numbers
for i, an in former_atomic_nums.items():
    core1.GetAtomWithIdx(i).SetAtomicNum(an)
rg1 = RGroupDecomposition(core1, params)
failMols = []
for m in mols:
    res = rg1.Add(m)
    if res < 0:
        failMols.append(m)
rg1.Process()
True
print("FailedMols: %s"%" ".join([Chem.MolToSmiles(m) for m in failMols]))
FailedMols: Nc1ccc(Br)cn1
core1
d = rg1.GetRGroupsAsColumns(asSmiles=False)
MolsToGridImage(d['Core'])
MolsToGridImage(d['R1'])
MolsToGridImage(d['R2'])
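The step that finds the atoms connected to the dummies can also be written per atom instead of per bond. This is only a minimal sketch: `attachment_atom_indices` is a hypothetical helper name chosen for illustration, not an RDKit function, and it assumes the R-group positions are marked with dummy atoms (atomic number 0), as in the core above.

```python
from rdkit import Chem


def attachment_atom_indices(core):
    """Return the sorted indices of non-dummy atoms bonded to a dummy atom.

    Hypothetical helper for illustration only; not part of RDKit.
    """
    idxs = set()
    for atom in core.GetAtoms():
        # dummy atoms (the R-group labels) have atomic number 0
        if atom.GetAtomicNum() != 0:
            continue
        for nbr in atom.GetNeighbors():
            if nbr.GetAtomicNum() != 0:
                idxs.add(nbr.GetIdx())
    return sorted(idxs)


core = Chem.MolFromSmiles('n1ccc([*:2])cc([*:1])1')
print(attachment_atom_indices(core))  # prints [3, 6]
```

These are the atoms whose atomic numbers get temporarily set to 0 and then restored in the code above.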
Hope that helps, cheers
p.
On 12/11/18 11:01, Stiefl, Nikolaus wrote:
Hi all,
I was playing around with the RGroup decomposition code and must say
that I am pretty impressed by it. The fact that one can directly work
with an MDL R-group file and that the output is a pandas DataFrame makes
analysis really slick – well done!
However, one thing that irritates me is the fact that seemingly when I
have R-groups defined in my core and enforce matching only at R-groups
then molecules having hydrogen atoms in that position are ignored in
the “Add” step. I would expect those to be included as long as the
molecules don’t have additional heavy atoms in positions that are not
defined as R-groups in the core.
______________ snip ____________________
from rdkit import Chem
from rdkit.Chem.rdRGroupDecomposition import (
    RGroupDecomposition, RGroupDecompositionParameters)
smis = ['Cc1ccnc(O)c1', 'Cc1cc(Cl)ccn1', 'Nc1ccccn1', 'Nc1ccc(Br)cn1',
        'c1ccncc1']
mols = [Chem.MolFromSmiles(smi) for smi in smis]
params = RGroupDecompositionParameters()
params.onlyMatchAtRGroups = True
# just atom number the rgroups
core1 = Chem.MolFromSmiles('n1ccc([*:2])cc([*:1])1')
rg1 = RGroupDecomposition(core1, params)
failMols = []
for m in mols:
    res = rg1.Add(m)
    if res < 0:
        failMols.append(m)
rg1.Process()
print("FailedMols: %s"%" ".join([Chem.MolToSmiles(m) for m in failMols]))
____________ end snip ________________
The output shows that molecules 3-5 are not included at the “Add” step:
>> FailedMols: Nc1ccccn1 Nc1ccc(Br)cn1 c1ccncc1
For molecule 4 (the 5-bromo substituted aminopyridine) I agree.
However, I don’t understand how I can make sure mols 3 and 5 are also
included … is there a magic parameter that I can set?
Cheers
Nik
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss