Re: [Rdkit-discuss] mass replacement of External R-groups with many substituents

2017-03-16 Thread Greg Landrum
Combining Steve's and Chris' answer gets to how I would do it with
reactions:

In [17]: core = Chem.MolFromSmiles('*c1c(C)1(O)')

In [18]: chain = Chem.MolFromSmiles('CN*')

In [19]: rxn =
AllChem.ReactionFromSmarts('[c:1][#0].[#0][*:2]>>[c:1]-[*:2]')

In [20]: ps = rxn.RunReactants((core,chain))

In [21]: Chem.MolToSmiles(ps[0][0],isomericSmiles=True)
Out[21]: 'CNc1c(C)1O'

If you can generate your sidechains in such a way that the atom you want to
attach to the core is the first one (atom zero), there's a somewhat easier
way to do this kind of simple replacement:
In [29]: chain2 = Chem.MolFromSmiles('NC')

In [30]: ps2 =
Chem.ReplaceSubstructs(core,Chem.MolFromSmarts('[#0]'),chain2)

In [31]: Chem.MolToSmiles(ps2[0],isomericSmiles=True)
Out[31]: 'CNc1c(C)1O'

The reaction-based approach is, however, far more flexible.

We've talked about adding a simple function for doing sidechain-core
enumeration but I'm pretty sure it hasn't yet actually been done.

-greg




On Thu, Mar 16, 2017 at 6:16 PM, Stephen Roughley <
s.d.rough...@googlemail.com> wrote:

> You can match a dummy atom (*) with the SMARTS [#0]
>
> Steve
>
> On 16 Mar 2017 16:43, "Chris Earnshaw"  wrote:
>
>> Hi Brian
>>
>> I'm by no means an expert in RDKit with Python, but until someone else
>> comes along, here are a few thoughts.
>>
>> Your reaction SMARTS specifically defines aromatic carbons joined by
>> single bonds which won't match an incoming benzene ring, and it's a bit
>> redundant to specify that aromatic carbons are in rings. It's possibly also
>> neater to specify the ring closure if you want to include the whole ring.
>> Finally, I don't think that you can specify dummy atoms in a reaction
>> SMARTS as you're attempting (it certainly doesn't work for me with R and
>> R1) but you certainly can specify 'placeholder' atoms using real elements.
>> I've found U, V & W handy for this in the past.
>>
>> A small Python script with a modified SMARTS which works for me is -
>> from rdkit import Chem
>> from rdkit.Chem import AllChem
>> rxn = 
>> AllChem.ReactionFromSmarts('[c:1]1[c:2][c:3][c:4][c:5]([U:7])[c:6]1.[V:8]-[*:9]
>> >> [c:1]1[c:2][c:3][c:4][c:5](-[*:9])[c:6]1')
>> ps = rxn.RunReactants((Chem.MolFromSmiles('c1c1[U]'),Chem.
>> MolFromSmiles('[V]Cl')))
>> Chem.MolToSmiles(ps[0][0])
>>
>> This returns 'Clc1c1'.
>>
>> Unless you have some specific reason to include the complete benzene
>> ring, the reaction SMARTS can be simplified a lot further, to -
>> [c:1][U:2].[V:3]-[*:4] >> [c:1]-[*:4]
>> This will carry out your virtual reaction on any aromatic carbon carrying
>> the appropriate placeholder element.
>>
>> I'd be interested to learn if there is an approved way to mark
>> substitution positions in reactions like this without having to resort to
>> the cheat of using obscure elements.
>>
>> Regards,
>> Chris
>>
>> On 15 March 2017 at 23:20, Bennion, Brian  wrote:
>>
>>> Hello All,
>>>
>>>
>>>
>>> I have looked around the email list and studied the daylight guide as
>>> well as the opensmiles website in hopes of solving my problem.  External
>>> r-groups is a proposed extension but that is all I could find.
>>>
>>>
>>>
>>> It is possible that I have made it too complicated though.
>>>
>>>
>>>
>>> In discussions with my synthetic chemist we came up with 27 substituents
>>> two place around our molecule scaffold.
>>>
>>> I labeled the scaffolds with R1 in the position that the substitution
>>> will occur and attached a dummy label to the substituents.  I thought that
>>> I could do a simple replacement rxn.  However, I have not been successful.
>>>
>>>
>>>
>>> The smarts string so far is listed below.
>>>
>>>
>>>
>>> AllChem.ReactionFromSmarts('[c;R:1]-[c;R:2]-[c;R:3]-[c;R:4]-
>>> [c;R:5]([R1;!R:7])-[c;R:6].[R:8]-[*:9] >> [c;R:1]-[c;R:2]-[c;R:3]-[c;R:4
>>> ]-[c;R:5]([*:9])-[c;R:6]')
>>>
>>>
>>>
>>> Basically just adding a group to positions along a benzene ring
>>>
>>>
>>>
>>> Thanks
>>>
>>> Brian Bennion
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> 
>>> --
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> ___
>>> Rdkit-discuss mailing list
>>> Rdkit-discuss@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>>
>>>
>>
>> 
>> --
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
> 
> --
> Check out the vibrant tech community on one of the 

Re: [Rdkit-discuss] mass replacement of External R-groups with many substituents

2017-03-16 Thread Stephen Roughley
You can match a dummy atom (*) with the SMARTS [#0]

Steve

On 16 Mar 2017 16:43, "Chris Earnshaw"  wrote:

> Hi Brian
>
> I'm by no means an expert in RDKit with Python, but until someone else
> comes along, here are a few thoughts.
>
> Your reaction SMARTS specifically defines aromatic carbons joined by
> single bonds which won't match an incoming benzene ring, and it's a bit
> redundant to specify that aromatic carbons are in rings. It's possibly also
> neater to specify the ring closure if you want to include the whole ring.
> Finally, I don't think that you can specify dummy atoms in a reaction
> SMARTS as you're attempting (it certainly doesn't work for me with R and
> R1) but you certainly can specify 'placeholder' atoms using real elements.
> I've found U, V & W handy for this in the past.
>
> A small Python script with a modified SMARTS which works for me is -
> from rdkit import Chem
> from rdkit.Chem import AllChem
> rxn = 
> AllChem.ReactionFromSmarts('[c:1]1[c:2][c:3][c:4][c:5]([U:7])[c:6]1.[V:8]-[*:9]
> >> [c:1]1[c:2][c:3][c:4][c:5](-[*:9])[c:6]1')
> ps = rxn.RunReactants((Chem.MolFromSmiles('c1c1[U]'),
> Chem.MolFromSmiles('[V]Cl')))
> Chem.MolToSmiles(ps[0][0])
>
> This returns 'Clc1c1'.
>
> Unless you have some specific reason to include the complete benzene ring,
> the reaction SMARTS can be simplified a lot further, to -
> [c:1][U:2].[V:3]-[*:4] >> [c:1]-[*:4]
> This will carry out your virtual reaction on any aromatic carbon carrying
> the appropriate placeholder element.
>
> I'd be interested to learn if there is an approved way to mark
> substitution positions in reactions like this without having to resort to
> the cheat of using obscure elements.
>
> Regards,
> Chris
>
> On 15 March 2017 at 23:20, Bennion, Brian  wrote:
>
>> Hello All,
>>
>>
>>
>> I have looked around the email list and studied the daylight guide as
>> well as the opensmiles website in hopes of solving my problem.  External
>> r-groups is a proposed extension but that is all I could find.
>>
>>
>>
>> It is possible that I have made it too complicated though.
>>
>>
>>
>> In discussions with my synthetic chemist we came up with 27 substituents
>> two place around our molecule scaffold.
>>
>> I labeled the scaffolds with R1 in the position that the substitution
>> will occur and attached a dummy label to the substituents.  I thought that
>> I could do a simple replacement rxn.  However, I have not been successful.
>>
>>
>>
>> The smarts string so far is listed below.
>>
>>
>>
>> AllChem.ReactionFromSmarts('[c;R:1]-[c;R:2]-[c;R:3]-[c;R:4]-
>> [c;R:5]([R1;!R:7])-[c;R:6].[R:8]-[*:9] >> [c;R:1]-[c;R:2]-[c;R:3]-[c;R:4
>> ]-[c;R:5]([*:9])-[c;R:6]')
>>
>>
>>
>> Basically just adding a group to positions along a benzene ring
>>
>>
>>
>> Thanks
>>
>> Brian Bennion
>>
>>
>>
>>
>>
>>
>>
>> 
>> --
>> Check out the vibrant tech community on one of the world's most
>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>> ___
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>>
>
> 
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] mass replacement of External R-groups with many substituents

2017-03-16 Thread Chris Earnshaw
Hi Brian

I'm by no means an expert in RDKit with Python, but until someone else
comes along, here are a few thoughts.

Your reaction SMARTS specifically defines aromatic carbons joined by single
bonds which won't match an incoming benzene ring, and it's a bit redundant
to specify that aromatic carbons are in rings. It's possibly also neater to
specify the ring closure if you want to include the whole ring. Finally, I
don't think that you can specify dummy atoms in a reaction SMARTS as you're
attempting (it certainly doesn't work for me with R and R1) but you
certainly can specify 'placeholder' atoms using real elements. I've found
U, V & W handy for this in the past.

A small Python script with a modified SMARTS which works for me is -
from rdkit import Chem
from rdkit.Chem import AllChem
rxn =
AllChem.ReactionFromSmarts('[c:1]1[c:2][c:3][c:4][c:5]([U:7])[c:6]1.[V:8]-[*:9]
>> [c:1]1[c:2][c:3][c:4][c:5](-[*:9])[c:6]1')
ps =
rxn.RunReactants((Chem.MolFromSmiles('c1c1[U]'),Chem.MolFromSmiles('[V]Cl')))
Chem.MolToSmiles(ps[0][0])

This returns 'Clc1c1'.

Unless you have some specific reason to include the complete benzene ring,
the reaction SMARTS can be simplified a lot further, to -
[c:1][U:2].[V:3]-[*:4] >> [c:1]-[*:4]
This will carry out your virtual reaction on any aromatic carbon carrying
the appropriate placeholder element.

I'd be interested to learn if there is an approved way to mark substitution
positions in reactions like this without having to resort to the cheat of
using obscure elements.

Regards,
Chris

On 15 March 2017 at 23:20, Bennion, Brian  wrote:

> Hello All,
>
>
>
> I have looked around the email list and studied the daylight guide as well
> as the opensmiles website in hopes of solving my problem.  External
> r-groups is a proposed extension but that is all I could find.
>
>
>
> It is possible that I have made it too complicated though.
>
>
>
> In discussions with my synthetic chemist we came up with 27 substituents
> two place around our molecule scaffold.
>
> I labeled the scaffolds with R1 in the position that the substitution will
> occur and attached a dummy label to the substituents.  I thought that I
> could do a simple replacement rxn.  However, I have not been successful.
>
>
>
> The smarts string so far is listed below.
>
>
>
> AllChem.ReactionFromSmarts('[c;R:1]-[c;R:2]-[c;R:3]-[c;R:4]
> -[c;R:5]([R1;!R:7])-[c;R:6].[R:8]-[*:9] >> [c;R:1]-[c;R:2]-[c;R:3]-[c;R:
> 4]-[c;R:5]([*:9])-[c;R:6]')
>
>
>
> Basically just adding a group to positions along a benzene ring
>
>
>
> Thanks
>
> Brian Bennion
>
>
>
>
>
>
>
> 
> --
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> ___
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>
--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss