Hi Adrian, Thanks for looking at the RECAP stuff so carefully. It clearly needs someone to go over it with a fine-tooth comb.
This is indeed a bug. In fact, it looks like two bugs:

1) The RECAP algorithm breaks cyclic bonds, which it ought not to do.
   (http://sourceforge.net/tracker/index.php?func=detail&aid=1804418&group_id=160139&atid=814650)
2) Something is very wrong in the way reactions like this one are handled:
   "[N;D3;R:1...@[*:2]>>[X][N:1].[*:2][X]"
   This showed up when I attempted to fix the first problem.
   (http://sourceforge.net/tracker/index.php?func=detail&aid=1804420&group_id=160139&atid=814650)

I believe the second bug arose during some recent changes I made to the way queries are stored, but it's going to take a bit of time to find and fix. I will take a look at it this weekend.

Thanks again,
-greg

On 9/28/07, Adrian Schreyer <adr...@cryst.bioc.cam.ac.uk> wrote:
> Apparently the RECAP implementation in RDKit cleaves bonds within ring
> motifs in the case of amines, e.g. in the case of imatinib or in the
> example given in the original paper.
>
> I tried to modify the SMARTS pattern a bit in order to reproduce the
> cisapride example, without success. The original algorithm does not
> seem to work in a step-wise manner leading to a tree-like result. That
> is why a propyl group appears as a resulting fragment in the example,
> which is within the molecule in the first place but would be a
> terminal group in the RDKit implementation and therefore not be
> cleaved of course.
>
> Cisapride: c...@h]1cn(CCCOc2ccc(F)cc2)c...@h]1nc(=O)c3cc(Cl)c(N)cc3OC
> Imatinib: CN1CCN(Cc2ccc(cc2)C(=O)Nc3ccc(C)c(Nc4nccc(n4)c5cccnc5)c3)CC1
>
> example from paper:
> http://pubs.acs.org/isubscribe/journals/jcisd8/38/i03/figures/ci970429if00003.html
>
> RDKit fragments (Leaves) of cisapride:
> CCCOc1ccc(F)cc1
> C(=O)c1c(OC)cc(N)c(Cl)c1
> CCC(N)C(OC)C
> CC(OC)C(N)CC
>
> Adrian
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
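
A footnote on the first bug (cyclic bonds being broken): the check involved can be sketched in plain Python, independent of RDKit. This is only an illustration under stated assumptions. The toy adjacency-list "molecule" and the names is_ring_bond and cleavable_bonds are hypothetical, not RDKit API and not the actual fix. A bond counts as a ring bond if its two atoms are still connected once that bond is ignored, and such bonds are dropped from the list of cleavage candidates.

```python
from collections import deque


def is_ring_bond(adjacency, a, b):
    """Return True if atoms a and b remain connected when the a-b bond is ignored."""
    seen = {a}
    queue = deque([a])
    while queue:
        atom = queue.popleft()
        for nbr in adjacency[atom]:
            if atom == a and nbr == b:
                continue  # do not walk across the bond under test
            if nbr == b:
                return True  # reached b by another path, so a-b closes a ring
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return False


def cleavable_bonds(adjacency, candidates):
    """Keep only the candidate bonds that are not part of a ring."""
    return [(a, b) for a, b in candidates if not is_ring_bond(adjacency, a, b)]


# Hypothetical toy graph: a six-membered ring (atoms 0-5) in which atom 0
# stands in for a ring nitrogen carrying one exocyclic substituent (atom 6).
adjacency = {
    0: [1, 5, 6],
    1: [0, 2],
    2: [1, 3],
    3: [2, 4],
    4: [3, 5],
    5: [4, 0],
    6: [0],
}

# Every bond to the "nitrogen" is offered as a cleavage candidate.
candidates = [(0, 1), (0, 5), (0, 6)]

print(cleavable_bonds(adjacency, candidates))  # [(0, 6)]
```

Only the exocyclic bond survives the filter; the two ring bonds to atom 0 are left intact, which is the behaviour the first bug report asks for. Whether a surviving bond is then actually cleaved still depends on the other RECAP rules, which this sketch does not model.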