Re: [Rdkit-discuss] How to generate bioisosters?
Hi Greg,

I got it to work with my function "get_amide_biosters(smiles)", which I am happy to share once I clean it up a bit.

Sorry I was not clear. Both get_amide_biosters("CC(NC)=O") and get_amide_biosters("CC(OC)=O") provide the same recursive SMILES:

>> [$(CNC(C)C(F)(F)F),$(CC1=CN=C(C)C=N1),$(CC1=NN=C(C)N1),$(COC(C)=O),$(CC1=NOC(C)=N1),$(CNS(C)(=O)=O),$(COC1C=C(C)ON=1),$(CNC1(C)COC1),$(COC1(C)COC1),$(CC=C(C)F),$(CNC(C)=N),$(CNC(C)=O),$(CC1N=CC=NC=1C),$(CC1=NN=C(C)O1),$(COC1=CC(C)=NO1),$(CC1=CN=CC(C)=N1),$(CNC1(C)CC1),$(CON=C(C)C#N),$(CC1=CN(C)N=N1),$(CNC1=NC=C(C)O1),$(CC1=NN=NN1C)]

This is what I want in order to highlight similarity between compounds that are structurally different but probably biologically similar if they only differ by a bioisosteric moiety.

Thanks again,
Alexis

On 6 February 2018 at 17:28, Greg Landrum wrote:

> On Tue, Feb 6, 2018 at 10:42 AM, Alexis Parenty <alexis.parenty.h...@gmail.com> wrote:
>
>> I will try your approach and will nest all the result smiles into a
>> unique recursive smiles.
>
> I'm not quite sure what you mean here, but it sounds unlikely to work. I
> think you may need to do a SMILES for each of your isosteres.
>
> After hitting "reply" on the earlier message I realized that this is a
> nice example for a blog post, so I am going to put together some example
> code that you may find useful.
>
> I would be happy to do the example with your isosteres, but I need some
> kind of explanation of what the actual substitution patterns are. For
> example, you have these three "duplicates" in your original message:
> 'C1=CN=[CH1][CH1]=N1', 'C1=[CH1]N=C[CH1]=N1', 'C1=[CH1]N=[CH1]C=N1'
> Should I interpret these as:
> *C1=C(*)N=CC=N1
> *C1=CN=C(*)C=N1
> *C1=CN=CC(*)=N1
> ?
>
> If that's right, does direction matter? If I apply the second of those
> possibilities - *C1=CN=C(*)C=N1 - to
> ClC(=O)NBr
> should it yield both:
> ClC1=CN=C(Br)C=N1
> and
> BrC1=CN=C(Cl)C=N1
> or just one of them?
>
> Best,
> -greg

--
Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] How to generate bioisosters?
On Tue, Feb 6, 2018 at 10:42 AM, Alexis Parenty <alexis.parenty.h...@gmail.com> wrote:

> I will try your approach and will nest all the result smiles into a
> unique recursive smiles.

I'm not quite sure what you mean here, but it sounds unlikely to work. I think you may need to do a SMILES for each of your isosteres.

After hitting "reply" on the earlier message I realized that this is a nice example for a blog post, so I am going to put together some example code that you may find useful.

I would be happy to do the example with your isosteres, but I need some kind of explanation of what the actual substitution patterns are. For example, you have these three "duplicates" in your original message:
'C1=CN=[CH1][CH1]=N1', 'C1=[CH1]N=C[CH1]=N1', 'C1=[CH1]N=[CH1]C=N1'
Should I interpret these as:
*C1=C(*)N=CC=N1
*C1=CN=C(*)C=N1
*C1=CN=CC(*)=N1
?

If that's right, does direction matter? If I apply the second of those possibilities - *C1=CN=C(*)C=N1 - to
ClC(=O)NBr
should it yield both:
ClC1=CN=C(Br)C=N1
and
BrC1=CN=C(Cl)C=N1
or just one of them?

Best,
-greg
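The direction question can be explored directly by writing each two-point replacement as a pair of reactions, one per orientation, and de-duplicating the products by canonical SMILES. The following is only a sketch under assumptions: the helper name apply_both_orientations and the {a}/{b} placeholder convention are hypothetical, the templates are written in aromatic form, and the amide query '[*:1]C(=O)N[*:2]' is the one from the triazole example earlier in the thread.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Amide query taken from the triazole example in this thread.
AMIDE = "[*:1]C(=O)N[*:2]"


def apply_both_orientations(product_template, smiles):
    """Apply a two-attachment-point replacement in both orientations.

    product_template is a hypothetical convention: a product SMILES with
    {a} and {b} marking the two attachment points. Returns the sorted,
    de-duplicated canonical SMILES of all products.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    unique = set()
    # First pass: acyl side on {a}; second pass: acyl side on {b}.
    for a, b in (("[*:1]", "[*:2]"), ("[*:2]", "[*:1]")):
        rxn = AllChem.ReactionFromSmarts(
            AMIDE + ">>" + product_template.format(a=a, b=b)
        )
        for products in rxn.RunReactants((mol,)):
            product = products[0]
            Chem.SanitizeMol(product)  # reaction products come back unsanitized
            unique.add(Chem.MolToSmiles(product))
    return sorted(unique)


# 2,5-disubstituted pyrazine (the second interpretation in the message above)
pyrazine = apply_both_orientations("{a}c1cnc({b})cn1", "ClC(=O)NBr")
# 1,4-disubstituted 1,2,3-triazole, which has no such symmetry
triazole = apply_both_orientations("{a}c1cn({b})nn1", "ClC(=O)NBr")
print(pyrazine)
print(triazole)
```

With this setup the 2,5-pyrazine template should collapse to a single product, because the ring maps onto itself when the two substituents are swapped, whereas the triazole template gives two distinct products. De-duplicating on canonical SMILES means the caller does not need to know in advance which templates are symmetric.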
Re: [Rdkit-discuss] How to generate bioisosters?
Hi Greg,

Thanks a lot for your response, it helps a lot. I did indeed notice that it was more difficult with two attachment points (I got it working with acid bioisosteres, which have one attachment point, but could not get my head around two). I will try your approach and will nest all the resulting SMILES into a unique recursive SMILES.

Best,
Alexis

On 6 February 2018 at 07:54, Greg Landrum wrote:

> Hi Alexis,
>
> If you have substructures with substitutions at a single atom, it tends to
> be simpler to use Chem.ReplaceSubstructs to do this. This is, however, not
> the case here (or in general for bioisosteric replacement).
>
> The most flexible way to do what you're looking for is to use the RDKit's
> chemical reaction functionality and create one reaction per isosteric
> replacement you want to make. Here's a simple example showing the
> "amide -> 1,2,3 triazole" replacement:
>
> In [11]: triazoleRxn = AllChem.ReactionFromSmarts('[*:1]C(=O)N[*:2]>>[*:1]c1cn([*:2])nn1')
>
> And here's how you'd use that to perform a replacement:
>
> In [12]: ps = triazoleRxn.RunReactants((Chem.MolFromSmiles('C1CC1C(=O)Nc1ccccc1'),))
>
> In [13]: Chem.MolToSmiles(ps[0][0])
> Out[13]: 'c1ccc(-n2cc(C3CC3)nn2)cc1'
>
> Notice that I added the bioisostere itself to the products of the reaction
> as SMILES. You don't want query features there.
>
> I hope this helps,
> -greg
Re: [Rdkit-discuss] How to generate bioisosters?
Hi Alexis,

If you have substructures with substitutions at a single atom, it tends to be simpler to use Chem.ReplaceSubstructs to do this. This is, however, not the case here (or in general for bioisosteric replacement).

The most flexible way to do what you're looking for is to use the RDKit's chemical reaction functionality and create one reaction per isosteric replacement you want to make. Here's a simple example showing the "amide -> 1,2,3 triazole" replacement:

In [11]: triazoleRxn = AllChem.ReactionFromSmarts('[*:1]C(=O)N[*:2]>>[*:1]c1cn([*:2])nn1')

And here's how you'd use that to perform a replacement:

In [12]: ps = triazoleRxn.RunReactants((Chem.MolFromSmiles('C1CC1C(=O)Nc1ccccc1'),))

In [13]: Chem.MolToSmiles(ps[0][0])
Out[13]: 'c1ccc(-n2cc(C3CC3)nn2)cc1'

Notice that I added the bioisostere itself to the products of the reaction as SMILES. You don't want query features there.

I hope this helps,
-greg

On Mon, Feb 5, 2018 at 10:07 AM, Alexis Parenty <alexis.parenty.h...@gmail.com> wrote:

> Dear RDKiters,
>
> I would like to generate the bioisosteres of amides from a large list of
> structures.
>
> The SMARTS patterns for the amide bioisosteres I am interested in are:
>
> smarts_path = ['C1=CN=[CH1][CH1]=N1', 'C1=[CH1]N=C[CH1]=N1',
> 'C1=[CH1]N=[CH1]C=N1', 'OC1=[CH1]C=NO1', 'OC1=NOC=[CH1]1',
> 'C1=CN=C([NX3])O1', '[#6X4]([#6X4](F)(F)(F))[NX3]', 'C(C#N)=NO',
> 'N1N=NN=C1', 'N1N=NC=[N]1', 'N1N=NC=[CH1]1', 'C1=NOC=N1', 'C1=NN=C[OX2]1',
> 'C1=NN=C[NH1]1', '[C]1(COC1)[OX2]', '[C]1(COC1)[NX3]',
> 'C([NX3H1])=N[CX4H3]', '[CX4]1([CX4][CX4]1)[NX3]',
> '[#16X4+2]([NX3])([OX1-])([OX1-])', '[#16X4]([NX3])(=[OX1])(=[OX1])',
> '[CX3](=[CX3])(F)', '[OX2][CX3](=[OX1])', '[NX3][CX3](=[OX1])']
>
> How would you best "disconnect the amide moiety" of a structure and
> "replace" it with the recursive SMARTS pattern of its bioisosteres? The
> bioisostere matching patterns above must be contained within a single
> SMARTS, so I think I need recursive SMARTS, right?
>
> Any directions would be very much appreciated.
>
> Best,
>
> Alexis
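The "one reaction per isosteric replacement" idea can be extended to a small table of reactions that is applied to each input structure. This is a minimal sketch, not the blog-post code mentioned later in the thread: the names REPLACEMENTS and enumerate_isosteres are hypothetical, the triazole entry is the reaction from the message above, and the sulfonamide entry is an illustrative assumption based on the sulfonamide pattern in the original list.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

AMIDE = "[*:1]C(=O)N[*:2]"

# One reaction SMARTS per replacement. Products are written as plain
# SMILES-like templates (no query features), as advised in the message above.
REPLACEMENTS = {
    "triazole": AMIDE + ">>[*:1]c1cn([*:2])nn1",        # from the thread
    "sulfonamide": AMIDE + ">>[*:1]S(=O)(=O)N[*:2]",    # illustrative assumption
}
REACTIONS = {
    name: AllChem.ReactionFromSmarts(smarts)
    for name, smarts in REPLACEMENTS.items()
}


def enumerate_isosteres(smiles):
    """Return {replacement name: sorted unique product SMILES} for one input."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    results = {}
    for name, rxn in REACTIONS.items():
        found = set()
        for products in rxn.RunReactants((mol,)):
            product = products[0]
            try:
                Chem.SanitizeMol(product)  # products are returned unsanitized
            except Exception:
                continue  # skip chemically invalid products
            found.add(Chem.MolToSmiles(product))
        results[name] = sorted(found)
    return results


results = enumerate_isosteres("C1CC1C(=O)Nc1ccccc1")
for name, product_smiles in results.items():
    print(name, product_smiles)
```

Collecting canonical SMILES in a set guards against the same product being reported several times when the amide query matches in more than one way. A molecule with several amide groups yields one product per matched amide, since each call to RunReactants transforms a single match.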
Re: [Rdkit-discuss] How to generate bioisosters?
Dear Alexis,

as far as I know, this would be the SMARTS string with the recursive pattern:

[$(C1=CN=[CH1][CH1]=N1),$(C1=[CH1]N=C[CH1]=N1),$(C1=[CH1]N=[CH1]C=N1),$(OC1=[CH1]C=NO1),$(OC1=NOC=[CH1]1),$(C1=CN=C([NX3])O1),$([#6X4]([#6X4](F)(F)(F))[NX3]),$(C(C#N)=NO),$(N1N=NN=C1),$(N1N=NC=[N]1),$(N1N=NC=[CH1]1),$(C1=NOC=N1),$(C1=NN=C[OX2]1),$(C1=NN=C[NH1]1),$([C]1(COC1)[OX2]),$([C]1(COC1)[NX3]),$(C([NX3H1])=N[CX4H3]),$([CX4]1([CX4][CX4]1)[NX3]),$([#16X4+2]([NX3])([OX1-])([OX1-])),$([#16X4]([NX3])(=[OX1])(=[OX1])),$([CX3](=[CX3])(F)),$([OX2][CX3](=[OX1])),$([NX3][CX3](=[OX1]))]

I don't know what you want to match specifically, but some of your patterns won't match SMILES that are written in the aromatic form of the molecule instead of the localized form.

Best wishes,
Emanuel

Alexis Parenty wrote on Mon, 5 Feb 2018 at 10:07:

> Dear RDKiters,
>
> I would like to generate the bioisosteres of amides from a large list of
> structures.
>
> The SMARTS patterns for the amide bioisosteres I am interested in are:
>
> smarts_path = ['C1=CN=[CH1][CH1]=N1', 'C1=[CH1]N=C[CH1]=N1',
> 'C1=[CH1]N=[CH1]C=N1', 'OC1=[CH1]C=NO1', 'OC1=NOC=[CH1]1',
> 'C1=CN=C([NX3])O1', '[#6X4]([#6X4](F)(F)(F))[NX3]', 'C(C#N)=NO',
> 'N1N=NN=C1', 'N1N=NC=[N]1', 'N1N=NC=[CH1]1', 'C1=NOC=N1', 'C1=NN=C[OX2]1',
> 'C1=NN=C[NH1]1', '[C]1(COC1)[OX2]', '[C]1(COC1)[NX3]',
> 'C([NX3H1])=N[CX4H3]', '[CX4]1([CX4][CX4]1)[NX3]',
> '[#16X4+2]([NX3])([OX1-])([OX1-])', '[#16X4]([NX3])(=[OX1])(=[OX1])',
> '[CX3](=[CX3])(F)', '[OX2][CX3](=[OX1])', '[NX3][CX3](=[OX1])']
>
> How would you best "disconnect the amide moiety" of a structure and
> "replace" it with the recursive SMARTS pattern of its bioisosteres? The
> bioisostere matching patterns above must be contained within a single
> SMARTS, so I think I need recursive SMARTS, right?
>
> Any directions would be very much appreciated.
>
> Best,
>
> Alexis
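The recursive pattern above is each SMARTS wrapped in $(...) and joined with commas (logical OR) inside a single atom bracket, so it can be generated from the list rather than typed by hand. A minimal sketch, assuming a hypothetical helper name to_recursive_smarts and using only three patterns from the original list to keep it short:

```python
def to_recursive_smarts(patterns):
    """Combine SMARTS patterns into one recursive SMARTS atom expression.

    Each pattern becomes a $(...) environment; the comma inside [...]
    is the SMARTS "or" operator.
    """
    if not patterns:
        raise ValueError("need at least one SMARTS pattern")
    return "[" + ",".join("$(" + p + ")" for p in patterns) + "]"


# Three entries taken from the smarts_path list in the original message.
smarts_path = ['C1=CN=[CH1][CH1]=N1', 'C(C#N)=NO', '[NX3][CX3](=[OX1])']
recursive = to_recursive_smarts(smarts_path)
print(recursive)
# prints: [$(C1=CN=[CH1][CH1]=N1),$(C(C#N)=NO),$([NX3][CX3](=[OX1]))]
```

Note that a recursive SMARTS of this form describes a single atom: a match reports only the atom that corresponds to the first atom of whichever sub-pattern hit, not the whole moiety. That is fine for flagging that a molecule contains one of the groups, but it does not identify the atoms to be replaced.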
[Rdkit-discuss] How to generate bioisosters?
Dear RDKiters,

I would like to generate the bioisosteres of amides from a large list of structures.

The SMARTS patterns for the amide bioisosteres I am interested in are:

smarts_path = ['C1=CN=[CH1][CH1]=N1', 'C1=[CH1]N=C[CH1]=N1',
'C1=[CH1]N=[CH1]C=N1', 'OC1=[CH1]C=NO1', 'OC1=NOC=[CH1]1',
'C1=CN=C([NX3])O1', '[#6X4]([#6X4](F)(F)(F))[NX3]', 'C(C#N)=NO',
'N1N=NN=C1', 'N1N=NC=[N]1', 'N1N=NC=[CH1]1', 'C1=NOC=N1', 'C1=NN=C[OX2]1',
'C1=NN=C[NH1]1', '[C]1(COC1)[OX2]', '[C]1(COC1)[NX3]',
'C([NX3H1])=N[CX4H3]', '[CX4]1([CX4][CX4]1)[NX3]',
'[#16X4+2]([NX3])([OX1-])([OX1-])', '[#16X4]([NX3])(=[OX1])(=[OX1])',
'[CX3](=[CX3])(F)', '[OX2][CX3](=[OX1])', '[NX3][CX3](=[OX1])']

How would you best "disconnect the amide moiety" of a structure and "replace" it with the recursive SMARTS pattern of its bioisosteres? The bioisostere matching patterns above must be contained within a single SMARTS, so I think I need recursive SMARTS, right?

Any directions would be very much appreciated.

Best,

Alexis