Hi Greg,

Don't expend a lot of effort on this.  I ended up writing my own
implementation of R-group decomposition.

Pat
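
The implementation mentioned above was not posted to the list. What follows
is only a minimal sketch of one way to get the requested behaviour by hand,
not Pat's code. It assumes the core is rewritten as SMARTS, with explicit
[#1] atoms blocking positions 5 and 6 and mapped wildcards marking the
R-group sites. CORE_SMARTS and decompose() are hypothetical names, not part
of RDKit. Adding explicit hydrogens to each molecule is what lets a wildcard
match the R == H case.

```python
from rdkit import Chem

# Assumed core for this sketch: [#1] blocks a position, [*:n] marks R-group n.
CORE_SMARTS = "[#1]c1c([#1])c([*:2])nc([*:1])n1"


def decompose(core_query, mol):
    """Return {"R1": smiles, ...} for one molecule, or None if the core does not match."""
    mol_h = Chem.AddHs(mol)  # hydrogens become real atoms, so '*' can match H
    match = mol_h.GetSubstructMatch(core_query)
    if not match:
        return None
    # Query atoms without a map number form the fixed part of the core.
    fixed = {match[a.GetIdx()] for a in core_query.GetAtoms()
             if a.GetAtomMapNum() == 0}
    row = {}
    for q_atom in core_query.GetAtoms():
        label = q_atom.GetAtomMapNum()
        if label == 0:
            continue
        # Walk outward from the attachment atom without re-entering the core.
        # Substituents that ring-close back onto the core are not handled.
        seen, stack = set(), [match[q_atom.GetIdx()]]
        while stack:
            idx = stack.pop()
            if idx in seen or idx in fixed:
                continue
            seen.add(idx)
            stack.extend(n.GetIdx()
                         for n in mol_h.GetAtomWithIdx(idx).GetNeighbors())
        # AddHs appends hydrogens, so heavy-atom indices are unchanged in mol.
        heavy = sorted(i for i in seen
                       if mol_h.GetAtomWithIdx(i).GetAtomicNum() > 1)
        row[f"R{label}"] = (Chem.MolFragmentToSmiles(mol, atomsToUse=heavy)
                            if heavy else "[H]")
    return row


core_query = Chem.MolFromSmarts(CORE_SMARTS)
for smi in ["CCc1ccnc(C)n1", "Cc1ncccn1", "Cc1cnc(C)nc1"]:
    print(smi, decompose(core_query, Chem.MolFromSmiles(smi)))
```

Because every ring carbon in the query carries either a [#1] or a mapped
wildcard, a molecule with a substituent at a blocked position (the third
SMILES, with a methyl at position 5) does not match and is reported as None.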

On Tue, May 15, 2018 at 10:00 PM Greg Landrum <greg.land...@gmail.com>
wrote:

> Hi Pat,
>
> This one has me stumped.
> @Brian: do you understand what's going on here or should I fire up the
> debugger?
>
> -greg
>
>
>
> On Mon, May 14, 2018 at 4:24 AM Patrick Walters <wpwalt...@gmail.com>
> wrote:
>
>> Hi All,
>>
>> I'm hoping someone can help me with rdRGroupDecomposition.  I'd like to
>> be able to specify particular R-group locations AND match cases where
>> R == H.  The example below illustrates what I'm talking about.
>> When RGroupDecompositionParameters.onlyMatchAtRGroups = True, cases
>> where R == H are skipped.  I tried putting an explicit hydrogen on the
>> core to block a position, but it appears that the explicit hydrogen is
>> ignored.
>>
>> from rdkit import Chem
>> from rdkit.Chem.rdRGroupDecomposition import (
>>     RGroupDecomposition, RGroupDecompositionParameters)
>>
>> # run an RGroupDecomposition on a set of molecules
>> def process_r_groups(core_mol,rg_params,mols):
>>     rg = RGroupDecomposition(core_mol,rg_params)
>>     for mol in mols:
>>         rg.Add(mol)
>>     rg.Process()
>>     return [x for x in rg.GetRGroupsAsRows(asSmiles=True)]
>>
>>
>> buff = """CCc1ccnc(C)n1
>> Cc1ncccn1
>> Cc1cnc(C)nc1"""
>>
>> mol_list = [Chem.MolFromSmiles(x) for x in buff.split("\n")]
>> core = Chem.MolFromSmiles("[H]c1cc([2*])nc([1*])n1")
>> # default parameters; note that 3 R-groups are returned and the
>> # explicit hydrogen is ignored
>> params_1 = RGroupDecompositionParameters()
>> for row in process_r_groups(core,params_1,mol_list):
>>     print(row)
>>
>> print()
>>
>> params_2 = RGroupDecompositionParameters()
>> params_2.onlyMatchAtRGroups = True
>> # run with the onlyMatchAtRGroups parameter
>> # now only one row is returned
>> for row in process_r_groups(core,params_2,mol_list):
>>     print(row)
>>
>> The output from the script above is
>>
>> {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]',
>> 'R2': '[H][*:2]', 'R3': '[H]C([H])([H])C([H])([H])[*:3]'}
>> {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]',
>> 'R2': '[H][*:2]', 'R3': '[H][*:3]'}
>> {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]',
>> 'R2': '[H]C([H])([H])[*:2]', 'R3': '[H][*:3]'}
>>
>> {'Core': 'c1cc([*:2])nc([*:1])n1', 'R1': '[H]C([H])([H])[*:1]', 'R2':
>> '[H]C([H])([H])C([H])([H])[*:2]'}
>>
>> I'd like to figure out how I can only get the substituents at the labeled
>> positions, but have it match where R1 == H or R2 == H.
>>
>> Thanks in advance,
>>
>> Pat
>>
>> ------------------------------------------------------------------------------
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>