Hi Pat,

This one has me stumped.
@Brian: do you understand what's going on here or should I fire up the
debugger?

-greg



On Mon, May 14, 2018 at 4:24 AM Patrick Walters <wpwalt...@gmail.com> wrote:

> Hi All,
>
> I'm hoping someone can help me with rdRGroupDecomposition.  I'd like to be
> able to restrict matching to specific R-group locations AND still match
> cases where R == H.  The example below illustrates what I'm talking about.
> When RGroupDecompositionParameters.onlyMatchAtRGroups = True, cases where
> R == H are skipped.  I tried putting an explicit hydrogen on the core to
> block a position, but it appears that the explicit hydrogen is ignored.
>
> from rdkit import Chem
> from rdkit.Chem.rdRGroupDecomposition import (
>     RGroupDecomposition,
>     RGroupDecompositionParameters,
> )
>
> # run an RGroupDecomposition on a set of molecules
> def process_r_groups(core_mol, rg_params, mols):
>     rg = RGroupDecomposition(core_mol, rg_params)
>     # iterate over the argument, not the global mol_list
>     for mol in mols:
>         rg.Add(mol)
>     rg.Process()
>     return list(rg.GetRGroupsAsRows(asSmiles=True))
>
>
> buff = """CCc1ccnc(C)n1
> Cc1ncccn1
> Cc1cnc(C)nc1"""
>
> mol_list = [Chem.MolFromSmiles(x) for x in buff.split("\n")]
> core = Chem.MolFromSmiles("[H]c1cc([2*])nc([1*])n1")
> # default parameters, note that 3 R-groups are returned, the
> # explicit hydrogen is ignored
> params_1 = RGroupDecompositionParameters()
> for row in process_r_groups(core,params_1,mol_list):
>     print(row)
>
> print()
>
> params_2 = RGroupDecompositionParameters()
> params_2.onlyMatchAtRGroups = True
> # run with the onlyMatchAtRGroups parameter
> # now only one row is returned
> for row in process_r_groups(core,params_2,mol_list):
>     print(row)
>
> The output from the script above is
>
> {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]',
> 'R2': '[H][*:2]', 'R3': '[H]C([H])([H])C([H])([H])[*:3]'}
> {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]',
> 'R2': '[H][*:2]', 'R3': '[H][*:3]'}
> {'Core': '*c1nc([*:1])nc([*:3])c1[*:2]', 'R1': '[H]C([H])([H])[*:1]',
> 'R2': '[H]C([H])([H])[*:2]', 'R3': '[H][*:3]'}
>
> {'Core': 'c1cc([*:2])nc([*:1])n1', 'R1': '[H]C([H])([H])[*:1]', 'R2':
> '[H]C([H])([H])C([H])([H])[*:2]'}
>
> I'd like to figure out how I can only get the substituents at the labeled
> positions, but have it match where R1 == H or R2 == H.
>
> Thanks in advance,
>
> Pat
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
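
One thing that may be worth checking before reaching for the debugger, offered as an assumption rather than a confirmed diagnosis: with default settings, Chem.MolFromSmiles removes hydrogens written explicitly in the SMILES, so the [H] on the core may already be gone before RGroupDecomposition ever sees it. The sketch below only compares the two parsing modes on the core from the script above. The helper count_h_atoms is a hypothetical name introduced here, and whether RGroupDecomposition then treats a retained hydrogen as blocking that position is not something this sketch establishes.

```python
from rdkit import Chem

core_smiles = "[H]c1cc([2*])nc([1*])n1"

# Default parsing: hydrogens written in the SMILES are folded back
# into implicit hydrogen counts on their neighbors.
core_default = Chem.MolFromSmiles(core_smiles)

# Ask the SMILES parser to keep hydrogens as real atoms in the graph.
parser_params = Chem.SmilesParserParams()
parser_params.removeHs = False
core_with_h = Chem.MolFromSmiles(core_smiles, parser_params)


def count_h_atoms(mol):
    """Count hydrogens that exist as explicit atoms in the molecule graph."""
    return sum(1 for atom in mol.GetAtoms() if atom.GetAtomicNum() == 1)


print("default parse:  ", count_h_atoms(core_default), "explicit H atoms")
print("removeHs=False: ", count_h_atoms(core_with_h), "explicit H atoms")
```

If the second count is nonzero while the first is zero, passing core_with_h to process_r_groups in place of core would at least test whether the decomposition honors the hydrogen once it is actually present in the core.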