Thank you very much. I will go over these instructions.

Best wishes
Carsten

> On 7 Jul 2022, at 02:20, Patrick Walters <wpwalt...@gmail.com> wrote:
>
> Here's a simple example showing the enumeration of a 3 component library
> based on a reaction
> https://gist.github.com/PatWalters/7439099598b4f08a331a81b209f88baa
>
>
> On Wed, Jul 6, 2022 at 4:57 PM Andrew Dalke <da...@dalkescientific.com> wrote:
> Hi Carsten,
>
> How are the fragments expressed? With attachment points marked with
> "[*:1]", "[*:2]" and "[*:3]" atoms?
>
> One technique is to rewrite the SMILES to use closures. (See
> https://onlinelibrary.wiley.com/doi/10.1002/qsar.200310008 or
> http://www.dalkescientific.com/writings/diary/archive/2005/05/07/attachment_points.html
> ).
>
> For example, if your core SMILES are:
>
>   [*:1]c1ncc([*:2])cn1
>   CC([*:2])O[*:1]
>
> and your R1 contains
>
>   *F
>   Cl*
>   Br*
>
> and your R2 contains
>
>   *CCO
>   CO*
>
> then you could rewrite these to use "%91" to connect the [*:1] with the R1
> "*" and use "%92" to connect the [*:2] with the R2 "*", using
> dot-disconnected terms.
>
> For example:
>
>   [*:1]c1ncc([*:2])cn1 + *F + *CCO
>
> can be rewritten as
>
>   c%911ncc%92cn1.F%91.C%92CO
>
> which is parsed and canonicalized to:
>
>   OCCc1cnc(F)nc1
>
> Rewriting the SMILES this way is a bit tricky. I've attached a program which
> does it for you.
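The closure rewrite described above can be sketched in a few lines of plain Python. This is not the attached enumerate.py, only a minimal illustration under some loud assumptions: labels are single digits, each attachment point is either the first atom, the last atom, or alone in a branch such as "([*:2])", there are no explicit bond symbols next to an attachment point, and fragments use a bare "*". The names rewrite, rewrite_fragment and enumerate_library are hypothetical helpers made up for this sketch.

```python
import re
from itertools import product

# Matches a labelled attachment point such as [*:1] (single-digit labels only).
ATTACH = re.compile(r"\[\*:(\d)\]")
# Matches the first atom token: a bracket atom, Cl/Br, or a one-letter atom.
FIRST_ATOM = re.compile(r"\[[^\]]+\]|Cl|Br|[A-Za-z]")

# Example data taken from the thread.
CORES = ["[*:1]c1ncc([*:2])cn1", "CC([*:2])O[*:1]"]
R1 = ["*F", "Cl*", "Br*"]
R2 = ["*CCO", "CO*"]


def closure(label):
    """Map R-group label n to the ring-closure token %9n."""
    return f"%9{label}"


def rewrite(smiles):
    """Replace each [*:n] attachment point by a %9n closure on its neighbour.

    Assumption: handles only the simple cases named in the lead-in.
    """
    # Case 1: attachment point is the first atom, e.g. "[*:1]c1...".
    # Drop it and put the closure right after the next atom symbol.
    m = ATTACH.match(smiles)
    if m:
        rest = smiles[m.end():]
        atom = FIRST_ATOM.match(rest)
        if atom is None:
            raise ValueError(f"no atom after attachment point in {smiles!r}")
        smiles = atom.group(0) + closure(m.group(1)) + rest[atom.end():]
    # Case 2: attachment point alone in a branch, e.g. "c([*:2])c".
    smiles = re.sub(r"\(\[\*:(\d)\]\)", lambda b: closure(b.group(1)), smiles)
    # Case 3: attachment point directly after an atom, e.g. "O[*:1]".
    return ATTACH.sub(lambda a: closure(a.group(1)), smiles)


def rewrite_fragment(fragment, label):
    """Rewrite an R-group whose attachment point is a bare '*'."""
    return rewrite(fragment.replace("*", f"[*:{label}]"))


def enumerate_library(cores, *rgroups):
    """Return dot-disconnected SMILES for every core/R-group combination.

    The i-th list in rgroups (counting from 1) is joined to [*:i].
    """
    core_terms = [rewrite(core) for core in cores]
    rgroup_terms = [
        [rewrite_fragment(frag, label) for frag in frags]
        for label, frags in enumerate(rgroups, start=1)
    ]
    return [".".join(combo) for combo in product(core_terms, *rgroup_terms)]


if __name__ == "__main__":
    for smiles in enumerate_library(CORES, R1, R2):
        print(smiles)
```

Each resulting string still has to be parsed and canonicalized, for example with RDKit's Chem.MolFromSmiles followed by Chem.MolToSmiles. A third R-group list can be passed as an extra argument if the cores carry a [*:3] point.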
>
> Running it on the above gives:
>
> % cat core.smi
> [*:1]c1ncc([*:2])cn1
> CC([*:2])O[*:1]
>
> % cat r1.smi
> *F
> Cl*
> Br*
>
> % cat r2.smi
> *CCO
> CO*
>
> % python enumerate.py --R1 r1.smi --R2 r2.smi core.smi
> c1%91ncc%92cn1.F%91.C%92CO -> OCCc1cnc(F)nc1
> c1%91ncc%92cn1.F%91.CO%92 -> COc1cnc(F)nc1
> c1%91ncc%92cn1.Cl%91.C%92CO -> OCCc1cnc(Cl)nc1
> c1%91ncc%92cn1.Cl%91.CO%92 -> COc1cnc(Cl)nc1
> c1%91ncc%92cn1.Br%91.C%92CO -> OCCc1cnc(Br)nc1
> c1%91ncc%92cn1.Br%91.CO%92 -> COc1cnc(Br)nc1
> CC(O%91)%92.F%91.C%92CO -> CC(CCO)OF
> CC(O%91)%92.F%91.CO%92 -> COC(C)OF
> CC(O%91)%92.Cl%91.C%92CO -> CC(CCO)OCl
> CC(O%91)%92.Cl%91.CO%92 -> COC(C)OCl
> CC(O%91)%92.Br%91.C%92CO -> CC(CCO)OBr
> CC(O%91)%92.Br%91.CO%92 -> COC(C)OBr
>
> It also supports --R3 if your core has 3 R-groups, with the third core point
> labeled [*:3].
>
> Best regards
>
> Andrew
> da...@dalkescientific.com
>
>
> > On Jul 6, 2022, at 21:00, Carsten Bauer <carsten.ba...@bluewin.ch> wrote:
> >
> > Hello
> >
> > I have a structure with three substituents R1, R2 and R3
> > R1 is an enumeration of 30+ SMILES
> > R2 and R3 each is an enumeration of <5 SMILES
> > Chemical space = 30 x 5 x 5 = 750+ in-silico compounds
> >
> > Can anyone share (i.e. publish in a citable form) an RDKit code for this
> > permutation?
> > Is there a textbook example illustrating this daily question from the lab
> > in an example, please?
> >
> > I can’t follow
> > https://www.rdkit.org/docs/cppapi/EnumerationStrategyBase_8h_source.html
> >
> > Sorry.
> >
> > Many thanks for getting back.
> > Kindest regards
> > C.
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss