Re: [Rdkit-discuss] How to generate bioisosters?

2018-02-06 Thread Alexis Parenty
Hi Greg,

I got it to work with my function get_amide_biosters(smiles), which I am
happy to share once I have cleaned it up a bit. Sorry I was not clear.

Both get_amide_biosters("CC(NC)=O") and get_amide_biosters("CC(OC)=O")
return the same recursive SMARTS:

[$(CNC(C)C(F)(F)F),$(CC1=CN=C(C)C=N1),$(CC1=NN=C(C)N1),$(COC(C)=O),$(CC1=NOC(C)=N1),$(CNS(C)(=O)=O),$(COC1C=C(C)ON=1),$(CNC1(C)COC1),$(COC1(C)COC1),$(CC=C(C)F),$(CNC(C)=N),$(CNC(C)=O),$(CC1N=CC=NC=1C),$(CC1=NN=C(C)O1),$(COC1=CC(C)=NO1),$(CC1=CN=CC(C)=N1),$(CNC1(C)CC1),$(CON=C(C)C#N),$(CC1=CN(C)N=N1),$(CNC1=NC=C(C)O1),$(CC1=NN=NN1C)]



This is what I want in order to highlight similarity between compounds that
are structurally different but probably biologically similar if they differ
only by a bioisosteric moiety.
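
A minimal sketch of how such a recursive SMARTS could be used for flagging, assuming RDKit is installed. get_amide_biosters is not shown in this thread, so the pattern below is a shortened, two-member stand-in (amide and ester only) chosen for illustration. Note that a recursive SMARTS matches a single atom: the first atom of whichever $(...) environment is satisfied.

```python
from rdkit import Chem

# Shortened stand-in for the full recursive SMARTS (amide and ester members only).
pattern = Chem.MolFromSmarts("[$(CNC(C)=O),$(COC(C)=O)]")

# N-methylacetamide, methyl acetate, and butane as a negative control.
smiles_list = ["CC(NC)=O", "CC(OC)=O", "CCCC"]

# True when any member of the recursive SMARTS is found in the molecule.
has_moiety = {
    smi: Chem.MolFromSmiles(smi).HasSubstructMatch(pattern) for smi in smiles_list
}
print(has_moiety)
```

Two compounds that both evaluate to True differ, at most, in which member of the bioisostere set they carry, which is the kind of similarity described above.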

Thanks again,

Alexis




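--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss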

Re: [Rdkit-discuss] How to generate bioisosters?

2018-02-06 Thread Greg Landrum
On Tue, Feb 6, 2018 at 10:42 AM, Alexis Parenty <
alexis.parenty.h...@gmail.com> wrote:

>  I will try your approach and will nest all the result smiles  into a
> unique recursive smiles.
>
I'm not quite sure what you mean here, but it sounds unlikely to work. I
think you may need to do a SMILES for each of your isosteres.

After hitting "reply" on the earlier message I realized that this is a nice
example for a blog post, so I am going to put together some example code
that you may find useful.

I would be happy to do the example with your isosteres, but I need some
kind of explanation of what the actual substitution patterns are. For
example, you have these three "duplicates" in your original message:
'C1=CN=[CH1][CH1]=N1', 'C1=[CH1]N=C[CH1]=N1', 'C1=[CH1]N=[CH1]C=N1'
Should I interpret these as:
*C1=C(*)N=CC=N1
*C1=CN=C(*)C=N1
*C1=CN=CC(*)=N1
?

If that's right, does direction matter? If I apply the second of those
possibilities - *C1=CN=C(*)C=N1 - to
ClC(=O)NBr
should it yield both:
ClC1=CN=C(Br)C=N1
and
BrC1=CN=C(Cl)C=N1
or just one of them?
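
One way to explore the direction question, sketched under some assumptions: each replacement is written twice as a reaction SMARTS, once with the atom maps swapped on the product side, and the unique products are collected. The helper name replace_both_ways is hypothetical, the pyrazine product is written in aromatic form (my choice, not taken from the thread), and the 1,2,3-triazole is included only for comparison.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def replace_both_ways(forward_smarts, reverse_smarts, smiles):
    """Apply both orientations of a replacement and return the unique product SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    products = set()
    for smarts in (forward_smarts, reverse_smarts):
        rxn = AllChem.ReactionFromSmarts(smarts)
        # Each result is a 1-tuple because the reaction has a single product template.
        for (product,) in rxn.RunReactants((mol,)):
            Chem.SanitizeMol(product)  # reaction products come back unsanitized
            products.add(Chem.MolToSmiles(product))
    return products


# 2,5-disubstituted pyrazine; the second SMARTS swaps maps 1 and 2 in the product.
pyrazine = replace_both_ways(
    "[*:1]C(=O)N[*:2]>>[*:1]c1cnc([*:2])cn1",
    "[*:1]C(=O)N[*:2]>>[*:2]c1cnc([*:1])cn1",
    "ClC(=O)NBr",
)

# 1,4-disubstituted 1,2,3-triazole, for comparison.
triazole = replace_both_ways(
    "[*:1]C(=O)N[*:2]>>[*:1]c1cn([*:2])nn1",
    "[*:1]C(=O)N[*:2]>>[*:2]c1cn([*:1])nn1",
    "ClC(=O)NBr",
)

print(len(pyrazine), len(triazole))
```

Collecting canonical SMILES in a set means a symmetric core collapses to a single product, while an unsymmetric core such as the triazole keeps both orientations. Whether both orientations are wanted is still a chemistry decision that the code cannot make.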


Best,
-greg







Re: [Rdkit-discuss] How to generate bioisosters?

2018-02-06 Thread Alexis Parenty
Hi Greg,

Thanks a lot for your response, it helps a lot. I did indeed notice that it
was more difficult with two attachment points (I got it working with acid
bioisosteres, which have one attachment point, but could not get my head
around two). I will try your approach and will nest all the result smiles
into a unique recursive smiles.

Best,

Alexis





Re: [Rdkit-discuss] How to generate bioisosters?

2018-02-05 Thread Greg Landrum
Hi Alexis,

If you have substructures with substitutions at a single atom, it tends to
be simpler to use Chem.ReplaceSubstructs to do this. This is, however, not
the case here (or in general for bioisosteric replacement).

The most flexible way to do what you're looking for is to use RDKit's
chemical reaction functionality and create one reaction per isosteric
replacement you want to make. Here's a simple example showing the "amide ->
1,2,3-triazole" replacement:

In [11]: triazoleRxn = AllChem.ReactionFromSmarts('[*:1]C(=O)N[*:2]>>[*:1]c1cn([*:2])nn1')


And here's how you'd use that to perform a replacement:

In [12]: ps = triazoleRxn.RunReactants((Chem.MolFromSmiles('C1CC1C(=O)Nc1ccccc1'),))

In [13]: Chem.MolToSmiles(ps[0][0])
Out[13]: 'c1ccc(-n2cc(C3CC3)nn2)cc1'


Notice that I added the bioisostere itself to the products of the reaction
as SMILES. You don't want query features there.
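
For anyone who wants to run this outside an interactive session, here is a self-contained sketch of the same replacement. The imports, the sanitization step, and the de-duplication are additions for illustration, and amide_to_triazole is a hypothetical helper name. The input is N-phenylcyclopropanecarboxamide, consistent with the product shown in Out[13].

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# One reaction per replacement: amide -> 1,4-disubstituted 1,2,3-triazole.
triazole_rxn = AllChem.ReactionFromSmarts("[*:1]C(=O)N[*:2]>>[*:1]c1cn([*:2])nn1")


def amide_to_triazole(smiles):
    """Return the sorted, unique product SMILES for every amide match in the input."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    unique = set()
    # RunReactants takes a tuple of reactants and returns a tuple of product tuples.
    for (product,) in triazole_rxn.RunReactants((mol,)):
        Chem.SanitizeMol(product)  # products are returned unsanitized
        unique.add(Chem.MolToSmiles(product))
    return sorted(unique)


result = amide_to_triazole("C1CC1C(=O)Nc1ccccc1")
print(result)
```

A molecule with several amides yields one product per match, each with a single amide replaced; replacing all of them would need repeated application.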

I hope this helps,
-greg






Re: [Rdkit-discuss] How to generate bioisosters?

2018-02-05 Thread Emanuel Ehmki
Dear Alexis,

as far as I know, this would be the SMARTS string with recursive pattern:
[$(C1=CN=[CH1][CH1]=N1),$(C1=[CH1]N=C[CH1]=N1),$(C1=[CH1]N=[CH1]C=N1),$(OC1=[CH1]C=NO1),$(OC1=NOC=[CH1]1),$(C1=CN=C([NX3])O1),$([#6X4]([#6X4](F)(F)(F))[NX3]),$(C(C#N)=NO),$(N1N=NN=C1),$(N1N=NC=[N]1),$(N1N=NC=[CH1]1),$(C1=NOC=N1),$(C1=NN=C[OX2]1),$(C1=NN=C[NH1]1),$([C]1(COC1)[OX2]),$([C]1(COC1)[NX3]),$(C([NX3H1])=N[CX4H3]),$([CX4]1([CX4][CX4]1)[NX3]),$([#16X4+2]([NX3])([OX1-])([OX1-])),$([#16X4]([NX3])(=[OX1])(=[OX1])),$([CX3](=[CX3])(F)),$([OX2][CX3](=[OX1])),$([NX3][CX3](=[OX1]))]
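
The string above follows a simple rule: wrap each pattern in $(...) and join the pieces with commas inside one pair of square brackets. A small sketch of that construction in plain Python, using a three-pattern subset of the list; to_recursive_smarts is a hypothetical helper name.

```python
def to_recursive_smarts(patterns):
    """Wrap each SMARTS in $(...) and OR them together inside one atom expression."""
    return "[" + ",".join(f"$({p})" for p in patterns) + "]"


# Three of the patterns from the original list, for brevity.
subset = ["OC1=[CH1]C=NO1", "C(C#N)=NO", "[NX3][CX3](=[OX1])"]
recursive = to_recursive_smarts(subset)
print(recursive)
# prints [$(OC1=[CH1]C=NO1),$(C(C#N)=NO),$([NX3][CX3](=[OX1]))]
```

The comma is the low-precedence OR of SMARTS atom expressions, so the combined pattern matches an atom that starts any one of the listed environments.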

I don't know what you want to match specifically, but some of your patterns
won't match SMILES that are written as the aromatic form of the molecule
instead of the localized form.
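
A short check of this point, assuming RDKit with its default sanitization. In SMARTS an uppercase C means an aliphatic carbon and = means a double bond, so a Kekule-style query does not match a ring that the toolkit has perceived as aromatic. The pyrazine below is my example, not one taken from the thread.

```python
from rdkit import Chem

# Kekule-style input; RDKit perceives the ring as aromatic when parsing.
pyrazine = Chem.MolFromSmiles("C1=CN=CC=N1")

kekule_query = Chem.MolFromSmarts("C1=CN=CC=N1")  # aliphatic atoms, explicit double bonds
aromatic_query = Chem.MolFromSmarts("c1cnccn1")   # aromatic atoms

kekule_hit = pyrazine.HasSubstructMatch(kekule_query)
aromatic_hit = pyrazine.HasSubstructMatch(aromatic_query)
print(kekule_hit, aromatic_hit)
```

With RDKit this applies regardless of how the input SMILES was written, because aromaticity is assigned during parsing.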

Best wishes,
Emanuel



[Rdkit-discuss] How to generate bioisosters?

2018-02-05 Thread Alexis Parenty
Dear RDKiters,

I would like to generate the bioisosteres of amides from a large list of
structures.

The SMARTS patterns for the amide bioisosteres I am interested in are:

smarts_path = ['C1=CN=[CH1][CH1]=N1', 'C1=[CH1]N=C[CH1]=N1',
'C1=[CH1]N=[CH1]C=N1', 'OC1=[CH1]C=NO1', 'OC1=NOC=[CH1]1',
'C1=CN=C([NX3])O1', '[#6X4]([#6X4](F)(F)(F))[NX3]', 'C(C#N)=NO',
'N1N=NN=C1', 'N1N=NC=[N]1', 'N1N=NC=[CH1]1', 'C1=NOC=N1', 'C1=NN=C[OX2]1',
'C1=NN=C[NH1]1', '[C]1(COC1)[OX2]', '[C]1(COC1)[NX3]',
'C([NX3H1])=N[CX4H3]', '[CX4]1([CX4][CX4]1)[NX3]',
'[#16X4+2]([NX3])([OX1-])([OX1-])', '[#16X4]([NX3])(=[OX1])(=[OX1])',
'[CX3](=[CX3])(F)', '[OX2][CX3](=[OX1])', '[NX3][CX3](=[OX1])']

How would you best "disconnect the amide moiety" of a structure and
"replace" it with the recursive SMARTS pattern of its bioisosteres? The
bioisostere matching patterns above must be contained within a single
SMARTS, so I think I need recursive SMARTS, right?

Any directions would be very much appreciated.

Best,

Alexis