Re: [Rdkit-discuss] question on rdRGroupDecomposition

Greg Landrum Tue, 15 May 2018 22:17:43 -0700

On Wed, May 16, 2018 at 4:24 AM Patrick Walters <[email protected]> wrote:


>
> Don't expend a lot of effort on this.
>

I'm primarily curious to understand why this isn't behaving as we expect it
to. It's either a bug or something that should be documented.


> I ended up writing my own implementation of R-group decomposition.
>

ouch... I'm sorry. You shouldn't need to do that.

-greg



> Pat
>
> On Tue, May 15, 2018 at 10:00 PM Greg Landrum <[email protected]>
> wrote:
>
>> Hi Pat,
>>
>> This one has me stumped.
>> @Brian: do you understand what's going on here or should I fire up the
>> debugger?
>>
>> -greg
>>
>>
>>
>> On Mon, May 14, 2018 at 4:24 AM Patrick Walters <[email protected]>
>> wrote:
>>
>>> Hi All,
>>>
>>> I'm hoping someone can help me with rdRGroupDecomposition. I'd like to
>>> be able to specify particular R-group locations AND match cases where R=H.
>>> The example below illustrates what I'm talking about.
>>> When RGroupDecompositionParameters.onlyMatchAtRGroups = True, cases where R
>>> == H are skipped.  I tried putting an explicit hydrogen on the core to
>>> block a position, but it appears that the explicit hydrogen is ignored.
>>>
>>> from rdkit import Chem
>>> from rdkit.Chem.rdRGroupDecomposition import (
>>>     RGroupDecomposition, RGroupDecompositionParameters)
>>>
>>> # run an RGroupDecomposition on a set of molecules
>>> def process_r_groups(core_mol, rg_params, mols):
>>>     rg = RGroupDecomposition(core_mol, rg_params)
>>>     # iterate over the argument, not the global mol_list
>>>     for mol in mols:
>>>         rg.Add(mol)
>>>     rg.Process()
>>>     return list(rg.GetRGroupsAsRows(asSmiles=True))
>>>
>>>
>>> buff = """CCc1ccnc(C)n1
>>> Cc1ncccn1
>>> Cc1cnc(C)nc1"""
>>>
>>> mol_list = [Chem.MolFromSmiles(x) for x in buff.split("\n")]
>>> core = Chem.MolFromSmiles("[H]c1cc([2*])nc([1*])n1")
>>> # default parameters, note that 3 R-groups are returned, the
>>> # explicit hydrogen is ignored
>>> params_1 = RGroupDecompositionParameters()
>>> for row in process_r_groups(core,params_1,mol_list):
>>>     print(row)
>>>
>>> print()
>>>
>>> params_2 = RGroupDecompositionParameters()
>>> params_2.onlyMatchAtRGroups = True
>>> # run with the onlyMatchAtRGroups parameter
>>> # now only one row is returned
>>> for row in process_r_groups(core,params_2,mol_list):
>>>     print(row)
>>>
>>> The output from the script above is
>>>
>>> {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]',
>>> 'R2': '[H][*:2]', 'R3': '[H]C([H])([H])C([H])([H])[*:3]'}
>>> {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]',
>>> 'R2': '[H][*:2]', 'R3': '[H][*:3]'}
>>> {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]',
>>> 'R2': '[H]C([H])([H])[*:2]', 'R3': '[H][*:3]'}
>>>
>>> {'Core': 'c1cc([*:2])nc([*:1])n1', 'R1': '[H]C([H])([H])[*:1]', 'R2':
>>> '[H]C([H])([H])C([H])([H])[*:2]'}
>>>
>>> I'd like to figure out how I can only get the substituents at the
>>> labeled positions, but have it match where R1 == H or R2 == H.
>>>
>>> Thanks in advance,
>>>
>>> Pat
>>>
>>> ------------------------------------------------------------------------------
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> _______________________________________________
>>> Rdkit-discuss mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>
>>

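One possible stopgap for the question quoted above, offered only as a sketch: run the decomposition with the default parameters and then drop the rows that carry a real substituent at a position the input core left unlabelled. The rows below are copied from the default-parameter output in the quoted message. The helper names (is_hydrogen, rows_with_hydrogen_at) are hypothetical and not part of RDKit. The choice of "R2" as the unwanted label is an assumption based on reading the Core SMILES in that output, where R2 appears to sit on a ring carbon that carries no label in the input core.

```python
import re

# Rows copied from the default-parameter output quoted in the thread.
CORE = '*c1nc([*:1])nc([*:3])c1[*:2]'
rows = [
    {'Core': CORE, 'R1': '[H]C([H])([H])[*:1]',
     'R2': '[H][*:2]', 'R3': '[H]C([H])([H])C([H])([H])[*:3]'},
    {'Core': CORE, 'R1': '[H]C([H])([H])[*:1]',
     'R2': '[H][*:2]', 'R3': '[H][*:3]'},
    {'Core': CORE, 'R1': '[H]C([H])([H])[*:1]',
     'R2': '[H]C([H])([H])[*:2]', 'R3': '[H][*:3]'},
]

# An R-group that is a lone hydrogen on the attachment point, e.g. "[H][*:2]".
HYDROGEN_ONLY = re.compile(r"^\[H\]\[\*:\d+\]$")


def is_hydrogen(rgroup_smiles):
    """Return True if the R-group SMILES is just a hydrogen (hypothetical helper)."""
    return HYDROGEN_ONLY.match(rgroup_smiles) is not None


def rows_with_hydrogen_at(rows, unwanted_labels):
    """Keep rows whose R-groups at the unwanted labels are all hydrogen.

    Labels missing from a row are treated as unsubstituted.
    """
    return [
        row for row in rows
        if all(is_hydrogen(row[label]) for label in unwanted_labels if label in row)
    ]


# Assumption: R2 is the position that should stay unsubstituted.
kept = rows_with_hydrogen_at(rows, ["R2"])
for row in kept:
    print(row)
```

With these rows the filter keeps the first two molecules, including the one where R3 is hydrogen, and drops the third, which has a methyl group at R2. It works purely on the SMILES strings returned with asSmiles=True, so it does not explain why onlyMatchAtRGroups skips the R == H cases; it only works around that.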