Dear Simone,
On Wed, Nov 14, 2012 at 6:27 PM, Simone Fulle <[email protected]> wrote:
> Dear RDkitters,
> I am currently exploring the RECAP functionality to fragment molecules and
> would like to keep the functional group information instead of the dummy
> atoms (e.g. the whole amide group instead of CO or N, respectively). What
> would be the best way to replace the dummy atom in the leaves-smiles-string
> by the original split functional group, e.g. by finding neighbouring
> atoms in higher levels of the hierarchy tree? Other recommendations or
> solution suggestions are of course also welcome :-)
>
I'm afraid there's currently no built-in way to do this. However, it would
probably be pretty easy for you to get the functionality you want.
The RECAP code works by applying a series of reactions to the input
molecules, so you would just need to change the definitions of those
reactions.
Here's the current behavior of the RECAP code:
In [1]: from rdkit import Chem
In [2]: from rdkit.Chem import AllChem, Recap
In [3]: m = Chem.MolFromSmiles('c1ccccc1NC(=O)C1CC1')
In [4]: t = Recap.RecapDecompose(m)
In [5]: t.GetLeaves().keys()
Out[5]: ['[*]Nc1ccccc1', '[*]C(=O)C1CC1']
Splitting an amide is governed by this reaction, which you can find in
$RDBASE/rdkit/Chem/Recap.py:
"[C;!$(C([#7])[#7]):1](=!@[O:2])!@[#7;+0;!D1:3]>>[*][C:1]=[O:2].[*][#7:3]",  # amide
If I replace that reaction:
In [30]: l = list(Recap.reactionDefs)
In [31]: l[1] = '[!#0;!#1:5][C;!$(C([#7])[#7]):1](=!@[O:2])!@[#7;+0;!D1:3]-[!#0;!#1:4]>>[*]N-[C:1](=[O:2])[*:5].[*]C(=O)[#7:3]-[*:4]'
In [32]: Recap.reactions = tuple([AllChem.ReactionFromSmarts(x) for x in l])
I get the answer that I think you're looking for:
In [33]: t = Recap.RecapDecompose(m)
In [34]: t.GetLeaves().keys()
Out[34]: ['[*]NC(=O)C1CC1', '[*]C(=O)Nc1ccccc1']
Notice that I had to add a couple of atoms to the reactant query (left-hand
side of the reaction) to prevent the rule from matching the products.
If you create your own list of reactions following the prototype above, you
can either replace the ones in the RDKit source file or, probably better,
store them somewhere locally and replace the reactions at runtime as I
demonstrate above.
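One way to keep local rule overrides separate from the RDKit source is
sketched below. The names LOCAL_OVERRIDES, apply_overrides and
install_recap_reactions are made up for this illustration and are not part
of RDKit; only Recap.reactionDefs, Recap.reactions and
AllChem.ReactionFromSmarts come from the session above. The sketch also
assumes the amide rule still sits at index 1, which may differ between
RDKit versions.

```python
# Hypothetical local module holding rule overrides; nothing here ships with RDKit.

# Index into Recap.reactionDefs -> replacement reaction SMARTS.
# Index 1 is the amide rule in the session above; check your RDKit version.
LOCAL_OVERRIDES = {
    1: ("[!#0;!#1:5][C;!$(C([#7])[#7]):1](=!@[O:2])!@[#7;+0;!D1:3]-[!#0;!#1:4]"
        ">>[*]N-[C:1](=[O:2])[*:5].[*]C(=O)[#7:3]-[*:4]"),
}


def apply_overrides(defaults, overrides):
    """Return a copy of `defaults` (as a tuple) with the given indices replaced."""
    defs = list(defaults)
    for idx, smarts in overrides.items():
        if not 0 <= idx < len(defs):
            raise IndexError("no reaction definition at index %d" % idx)
        defs[idx] = smarts
    return tuple(defs)


def install_recap_reactions(overrides=LOCAL_OVERRIDES):
    """Rebuild Recap.reactions at runtime, as in In [30]-In [32] above."""
    # Imported lazily so apply_overrides can be used without RDKit installed.
    from rdkit.Chem import AllChem, Recap
    defs = apply_overrides(Recap.reactionDefs, overrides)
    Recap.reactions = tuple(AllChem.ReactionFromSmarts(x) for x in defs)
    return defs
```

Calling install_recap_reactions() once before Recap.RecapDecompose(m) should
then have the same effect as the interactive replacement.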
If you're not actually interested in having the extra atoms in the product
molecules but you just want to know which RECAP rules were applied to split
the molecule, you could instead edit the rules to isotope tag the dummy
atom that's inserted like this:
In [46]: l[1] = "[C;!$(C([#7])[#7]):1](=!@[O:2])!@[#7;+0;!D1:3]>>[4*][C:1]=[O:2].[4*][#7:3]"
In [47]: l[3] = "[N;!D1;+0;!$(N-C=[#7,#8,#15,#16])](-!@[!#0;!#1:1])-!@[!#0;!#1:2]>>[5*][*:1].[*:2][5*]"
In [48]: t = Recap.RecapDecompose(m)
In [49]: t.GetLeaves().keys()
Out[49]: ['[4*]Nc1ccccc1', '[4*]C(=O)C1CC1']
Now every time you see a dummy atom with isotope 4 in the output, you know
it came from splitting an amide bond.
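Notice here that I also had to edit the amine splitting rule (l[3]). As
before, the edited list only takes effect once Recap.reactions is rebuilt
from it (the In [32] step above).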
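If you tag several rules this way, reading the tags back out of the leaf
SMILES can be scripted. Here is a minimal sketch, assuming the tags from
this session (4 for the amide rule, 5 for the amine rule). The
ISOTOPE_TO_RULE table and the rules_for_leaf helper are hypothetical names
for this example, not RDKit functions, and the approach is plain string
matching on the SMILES rather than anything chemistry-aware.

```python
import re

# Isotope tag -> rule name, matching the edits to l[1] and l[3] above.
ISOTOPE_TO_RULE = {4: "amide", 5: "amine"}

# Matches isotope-labelled dummy atoms such as [4*] and captures the isotope.
_TAGGED_DUMMY = re.compile(r"\[(\d+)\*\]")


def rules_for_leaf(smiles):
    """List the rule names for every tagged dummy atom in a leaf SMILES."""
    return [ISOTOPE_TO_RULE.get(int(iso), "unknown")
            for iso in _TAGGED_DUMMY.findall(smiles)]


for leaf in ["[4*]Nc1ccccc1", "[4*]C(=O)C1CC1"]:
    print(leaf, rules_for_leaf(leaf))
```

Untagged dummy atoms ([*]) are ignored, and a tag with no entry in the
table is reported as "unknown" so that a forgotten rule is easy to spot.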
I hope this helps,
-greg
_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss