On Wed, Jun 11, 2014 at 4:35 AM, Nicholas Firth <[email protected]>
wrote:

> I want to show some numbers from a compatible fragmentation scheme to my
> own one. Which means generating all the leaves from the hierarchy and then
> doing some post-processing to merge these fragments. This isn't a problem
> on some of the more drug-like data sets; with ChEMBL, however, it is
> causing me some stress.
>

If you're ok using BRICS instead of RECAP, you can do something like this:

In [23]: from rdkit import Chem

In [24]: mol = Chem.MolFromSmiles('CC[C@H](C)[C@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](NC(=O)[C@@H](N)CCSC)[C@@H](C)O)C(=O)NCC(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](Cc1c[nH]cn1)C(=O)N[C@@H](CC(=O)N)C(=O)NCC(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(=O)NCC(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)N2CCC[C@H]2C(=O)N3CCC[C@H]3C(=O)NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N')

In [25]: frags = Chem.GetMolFrags(Chem.FragmentOnBRICSBonds(mol), asMols=True)

In [26]: smis = set([Chem.MolToSmiles(x,True) for x in frags])

In [27]: len(smis)
Out[27]: 17

In [28]: smis
Out[28]:
{'[1*]C(=O)C[4*]',
 '[1*]C(=O)[C@@H](N)CC[4*]',
 '[1*]C(=O)[C@@H]([4*])C',
 '[1*]C(=O)[C@@H]([4*])CC(C)C',
 '[1*]C(=O)[C@@H]([4*])CC(N)=O',
 '[1*]C(=O)[C@@H]([4*])CCC(N)=O',
 '[1*]C(=O)[C@@H]([4*])CCCN=C(N)N',
 '[1*]C(=O)[C@@H]([4*])CO',
 '[1*]C(=O)[C@@H]([4*])C[8*]',
 '[1*]C(=O)[C@H]([4*])[C@@H](C)CC',
 '[1*]C(=O)[C@H]([4*])[C@@H](C)O',
 '[1*]C([6*])=O',
 '[11*]SC',
 '[14*]c1c[nH]cn1',
 '[4*][C@@H](CCCN=C(N)N)C(N)=O',
 '[5*]N1CCC[C@@H]1[13*]',
 '[5*]N[5*]'}
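
For a larger collection such as ChEMBL, the session above could be wrapped up roughly as follows. This is only a sketch: the helper names `brics_fragment_smiles` and `count_fragments` are made up for illustration, and skipping unparsable SMILES is an assumption about how you would want bad records handled.

```python
from collections import Counter

from rdkit import Chem


def brics_fragment_smiles(smiles):
    """Return the set of unique BRICS fragment SMILES for one molecule.

    Hypothetical helper: returns an empty set if the SMILES cannot be parsed.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return set()
    # Break every BRICS bond in one pass, then split the result into pieces.
    pieces = Chem.GetMolFrags(Chem.FragmentOnBRICSBonds(mol), asMols=True)
    return {Chem.MolToSmiles(piece) for piece in pieces}


def count_fragments(smiles_iter):
    """Count in how many input molecules each unique fragment occurs."""
    counts = Counter()
    for smi in smiles_iter:
        counts.update(brics_fragment_smiles(smi))
    return counts
```

Because every BRICS bond is broken in a single pass, each molecule yields only its final pieces and no fragment hierarchy is built, which is what should keep this manageable on big or peptide-like inputs.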


Doing the same thing with the RECAP rules is not quite as trivial, but it
should be doable.
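
For reference, the hierarchy-based RECAP route described in the question looks roughly like this. It is a sketch of the existing `Recap.RecapDecompose` approach, not the single-pass RECAP equivalent mentioned above; the helper name `recap_leaf_smiles` is made up for illustration.

```python
from rdkit import Chem
from rdkit.Chem import Recap


def recap_leaf_smiles(smiles):
    """Return the set of RECAP leaf SMILES for one molecule.

    Hypothetical helper: returns an empty set for unparsable input or when
    no RECAP bond is found.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return set()
    # Builds the full RECAP hierarchy, which is the expensive step.
    hierarchy = Recap.RecapDecompose(mol)
    # GetLeaves() returns a dict mapping leaf SMILES to hierarchy nodes.
    return set(hierarchy.GetLeaves().keys())
```

Here the whole hierarchy is built before the leaves are read off, so the cost grows with the number of cleavable bonds, whereas the BRICS call above skips the hierarchy entirely.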

-greg
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
