Dear Om,

Sorry for the slow reply; this ended up getting lost in my inbox.

On Mon, Feb 18, 2013 at 7:58 PM, Om <[email protected]> wrote:
>
> I am trying to re-shuffle the side-chain(s) identified after replacing core
> (MCS between two mols) from molecules. The two molecules I am considering
> are BMS-193884 and irbesartan. Both molecules share biphenyl framework and
> act on different target (antagonist of ETA and AT1 respectively). The
> approach I have taken is to replace side-chain(s) of BMS-193884 with the
> side-chain(s) of irbesartan and vice-versa using ReplaceSubstructs to
> generate new molecules keeping MCS as a core part.
>
> I have pasted minimal code below this mail. I do not know whether this would
> be correct approach or not?

As long as you don't have any "extra" dummy atoms in your molecule,
this should work fine. You should be sure you understand why it's
working though, because it's somewhat fragile: ReplaceSubstructs()
takes the bonds that connected the first atom of the matched
substructure to the rest of the molecule and re-forms them to the
first atom of the replacement.

A small example:
In [1]: from rdkit import Chem

In [2]: m = Chem.MolFromSmiles('c1ccccc1NCC')

In [3]: p = Chem.MolFromSmarts('NCC')

In [4]: r = Chem.MolFromSmiles('CCO')

In [5]: Chem.MolToSmiles(Chem.ReplaceSubstructs(m,p,r)[0],True)
Out[5]: 'OCCc1ccccc1'

In [6]: r = Chem.MolFromSmiles('OCC')

In [7]: Chem.MolToSmiles(Chem.ReplaceSubstructs(m,p,r)[0],True)
Out[7]: 'CCOc1ccccc1'

Your example is working because the canonical SMILES algorithm happens
to put the dummy atoms first in the SMILES, so when you remove them
from the SMILES the attachment point ends up being the first atom.
That is somewhat fragile, but it ought to work.
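
To make the ordering issue concrete, here is a minimal sketch that does
not depend on where the dummy atom lands in the SMILES. The helper
names attachment_first() and swap_in() are hypothetical (they are not
part of the RDKit), and the sketch assumes each side chain carries
exactly one dummy atom with a single neighbor. It renumbers the
fragment so that the attachment atom becomes atom 0 before the dummy
is removed. With the side chain written as 'CCO[*]', stripping the
'[*]' from the string should reproduce Out[5] above, while the
renumbered version should reproduce Out[7].

```python
from rdkit import Chem


def attachment_first(frag_smiles):
    """Hypothetical helper: remove the single dummy atom from a fragment
    and renumber so the atom it was bonded to becomes atom 0."""
    frag = Chem.MolFromSmiles(frag_smiles)
    if frag is None:
        raise ValueError("could not parse %r" % frag_smiles)
    dummies = [a for a in frag.GetAtoms() if a.GetAtomicNum() == 0]
    if len(dummies) != 1 or dummies[0].GetDegree() != 1:
        raise ValueError("expected exactly one dummy atom with one neighbor")
    dummy_idx = dummies[0].GetIdx()
    attach_idx = dummies[0].GetNeighbors()[0].GetIdx()
    # New order: attachment atom first, dummy last, everything else between.
    others = [i for i in range(frag.GetNumAtoms())
              if i not in (attach_idx, dummy_idx)]
    reordered = Chem.RenumberAtoms(frag, [attach_idx] + others + [dummy_idx])
    rw = Chem.RWMol(reordered)
    rw.RemoveAtom(rw.GetNumAtoms() - 1)  # the dummy is now the last atom
    side_chain = rw.GetMol()
    Chem.SanitizeMol(side_chain)
    return side_chain


def swap_in(mol_smiles, pattern_smarts, replacement):
    """Hypothetical helper: replace the first match and return canonical SMILES."""
    m = Chem.MolFromSmiles(mol_smiles)
    p = Chem.MolFromSmarts(pattern_smarts)
    product = Chem.ReplaceSubstructs(m, p, replacement)[0]
    Chem.SanitizeMol(product)
    return Chem.MolToSmiles(product)


# Naive route: delete '[*]' from the string; the first atom is then a carbon.
naive = swap_in('c1ccccc1NCC', 'NCC',
                Chem.MolFromSmiles('CCO[*]'.replace('[*]', '')))
# Renumbered route: the oxygen next to the dummy becomes atom 0.
robust = swap_in('c1ccccc1NCC', 'NCC', attachment_first('CCO[*]'))
print(naive)
print(robust)
```

The design choice here is to fix the atom order on the molecule itself
rather than rely on how the SMILES happens to be written.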

There is a more robust version of the template expansion in the RDKit
distribution, in the module
$RDBASE/rdkit/Chem/ChemUtils/TemplateExpand.py, that you might want to
take a look at. That's a command-line tool, but there are useful
functions inside for what you want to do. This is a common and useful
pattern, so I probably should add a function to make this easy and
then some documentation about how to use it.

> Moreover, I am getting dummy atoms in some of the output molecules.
>

<snip>

>
> Output:
>
> CCCCC1=NC2(CCCC2)C(=O)N1c1ccc(-c2ccccc2-c2nn[nH]n2)cc1
> Cc1noc(NS(=O)(=O)c2ccccc2-c2ccc(-c3nn[nH]n3)cc2)c1C
> CCCCC1=NC2(CCCC2)C(=O)N1c1ccc(-c2ccccc2-c2nn[nH]n2)cc1
> CCCCC1=NC2(CCCC2)C(=O)N1[*]c1ccc(-c2ccccc2S(=O)(=O)Nc2onc(C)c2C)cc1
> CCCCC1=NC2(CCCC2)C(=O)N1c1ccc(-c2ccccc2-c2ncco2)cc1
> CCCCC1=NC2(CCCC2)C(=O)N1[*]c1ccccc1-c1ccc(-c2ncco2)cc1
> CCCCC1=NC2(CCCC2)C(=O)N1c1ccc(-c2ccccc2S(=O)(=O)Nc2onc(C)c2C)cc1
> c1coc(-c2ccc(-c3ccccc3-c3nn[nH]n3)cc2)n1
>

I am not able to reproduce this. When I run your example I get the following:

c1coc(-c2ccc(-c3ccccc3-c3nn[nH]n3)cc2)n1
CCCCC1=NC2(CCCC2)C(=O)N1c1ccc(-c2ccccc2-c2ncco2)cc1
CCCCC1=NC2(CCCC2)C(=O)N1c1ccc(-c2ccccc2S(=O)(=O)Nc2onc(C)c2C)cc1
Cc1noc(NS(=O)(=O)c2ccc(-c3ccccc3-c3nn[nH]n3)cc2)c1C
CCCCC1=NC2(CCCC2)C(=O)N1c1ccccc1-c1ccc(-c2ncco2)cc1
CCCCC1=NC2(CCCC2)C(=O)N1c1ccc(-c2ccccc2S(=O)(=O)Nc2onc(C)c2C)cc1
Cc1noc(NS(=O)(=O)c2ccccc2-c2ccc(-c3nn[nH]n3)cc2)c1C
c1coc(-c2ccc(-c3ccccc3-c3nn[nH]n3)cc2)n1

which seems right to me.
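
If it helps when comparing runs, leftover dummy atoms can be flagged
automatically instead of by eye. This is only a sketch: count_dummies()
is a hypothetical helper name, and the two SMILES are copied from the
outputs quoted in this thread.

```python
from rdkit import Chem


def count_dummies(smiles):
    """Hypothetical helper: number of dummy atoms (atomic number 0) in a SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse %r" % smiles)
    return sum(1 for atom in mol.GetAtoms() if atom.GetAtomicNum() == 0)


outputs = [
    'c1coc(-c2ccc(-c3ccccc3-c3nn[nH]n3)cc2)n1',
    'CCCCC1=NC2(CCCC2)C(=O)N1[*]c1ccccc1-c1ccc(-c2ncco2)cc1',
]
for smi in outputs:
    # Any count above zero means a dummy atom survived the replacement.
    print(count_dummies(smi), smi)
```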

Which version of the RDKit are you using?
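
One way to look the version up from Python, as a small sketch that
assumes the RDKit is importable in the same environment you ran the
script in:

```python
from rdkit import rdBase

# rdkitVersion is the version string of the installed RDKit build.
version = rdBase.rdkitVersion
print(version)
```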

-greg

_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
