Dear Om,

Sorry for the slow reply; this ended up getting lost in my inbox.

On Mon, Feb 18, 2013 at 7:58 PM, Om <[email protected]> wrote:
>
> I am trying to re-shuffle the side-chain(s) identified after replacing core
> (MCS between two mols) from molecules. The two molecules I am considering
> are BMS-193884 and irbesartan. Both molecules share biphenyl framework and
> act on different target (antagonist of ETA and AT1 respectively). The
> approach I have taken is to replace side-chain(s) of BMS-193884 with the
> side-chain(s) of irbesartan and vice-versa using ReplaceSubstructs to
> generate new molecules keeping MCS as a core part.
>
> I have pasted minimal code below this mail. I do not know whether this would
> be correct approach or not?

As long as you don't have any "extra" dummy atoms in your molecule, this should work fine. You should be sure you understand why it's working, though, because it's somewhat fragile: ReplaceSubstructs() attaches the first atom of the replacement to whatever the first atom of the matched substructure was bonded to. A small example:

In [2]: m = Chem.MolFromSmiles('c1ccccc1NCC')
In [3]: p = Chem.MolFromSmarts('NCC')
In [4]: r = Chem.MolFromSmiles('CCO')
In [5]: Chem.MolToSmiles(Chem.ReplaceSubstructs(m,p,r)[0],True)
Out[5]: 'OCCc1ccccc1'
In [6]: r = Chem.MolFromSmiles('OCC')
In [7]: Chem.MolToSmiles(Chem.ReplaceSubstructs(m,p,r)[0],True)
Out[7]: 'CCOc1ccccc1'

Your example works because the canonical SMILES algorithm puts the dummy atoms first in the SMILES, so when you remove them from the SMILES the attachment point ends up being the first atom. Somewhat fragile, but it ought to work.

There is a more robust version of the template expansion in the RDKit distribution, in $RDBASE/rdkit/Chem/ChemUtils/TemplateExpand.py, that you might want to take a look at. That's a command-line tool, but there are useful functions inside for what you want to do.

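If you stay with the SMILES-editing approach, it may be worth making the "dummy atom comes first" assumption explicit instead of relying on it silently. Here is a minimal sketch of such a guard. It is plain Python string handling, not an RDKit function; the name strip_leading_dummy is made up for illustration, and I'm assuming each side chain is a SMILES string with a single dummy atom marking the attachment point.

```python
import re

# A dummy atom as it may appear in SMILES: "*", "[*]", a labelled "[1*]",
# or an atom-mapped "[*:1]". (Assumed notations; adjust to your own output.)
_DUMMY = re.compile(r"\[\d*\*(?::\d+)?\]|\*")


def strip_leading_dummy(smiles):
    """Return the side-chain SMILES with its single leading dummy atom removed.

    Raises ValueError unless there is exactly one dummy atom and it comes
    first, because only then is the attachment atom the first atom of the
    returned string.
    """
    dummies = _DUMMY.findall(smiles)
    if len(dummies) != 1:
        raise ValueError(
            "expected exactly one dummy atom in %r, found %d" % (smiles, len(dummies))
        )
    match = _DUMMY.match(smiles)  # match() only succeeds at position 0
    if match is None:
        raise ValueError("dummy atom is not the first atom in %r" % smiles)
    rest = smiles[match.end():]
    # If a bond symbol, ring-closure digit or branch follows the dummy,
    # deleting the dummy alone would not leave a usable fragment.
    if not rest or not (rest[0].isalpha() or rest[0] == "["):
        raise ValueError("cannot simply drop the dummy atom from %r" % smiles)
    return rest
```

The returned string can then be parsed and used as the replacement, knowing that its first atom is the intended attachment point. Anything that raises is a case where the implicit ordering assumption no longer holds and the fragment needs a closer look.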
This is a common and useful pattern, so I should probably add a function to make it easy, along with some documentation on how to use it.

> Moreover, I am getting dummy atoms in some of the output molecules.
> <snip>
>
> Output:
>
> CCCCC1=NC2(CCCC2)C(=O)N1c1ccc(-c2ccccc2-c2nn[nH]n2)cc1
> Cc1noc(NS(=O)(=O)c2ccccc2-c2ccc(-c3nn[nH]n3)cc2)c1C
> CCCCC1=NC2(CCCC2)C(=O)N1c1ccc(-c2ccccc2-c2nn[nH]n2)cc1
> CCCCC1=NC2(CCCC2)C(=O)N1[*]c1ccc(-c2ccccc2S(=O)(=O)Nc2onc(C)c2C)cc1
> CCCCC1=NC2(CCCC2)C(=O)N1c1ccc(-c2ccccc2-c2ncco2)cc1
> CCCCC1=NC2(CCCC2)C(=O)N1[*]c1ccccc1-c1ccc(-c2ncco2)cc1
> CCCCC1=NC2(CCCC2)C(=O)N1c1ccc(-c2ccccc2S(=O)(=O)Nc2onc(C)c2C)cc1
> c1coc(-c2ccc(-c3ccccc3-c3nn[nH]n3)cc2)n1
>

I am not able to reproduce this. When I run your example I get the following:

c1coc(-c2ccc(-c3ccccc3-c3nn[nH]n3)cc2)n1
CCCCC1=NC2(CCCC2)C(=O)N1c1ccc(-c2ccccc2-c2ncco2)cc1
CCCCC1=NC2(CCCC2)C(=O)N1c1ccc(-c2ccccc2S(=O)(=O)Nc2onc(C)c2C)cc1
Cc1noc(NS(=O)(=O)c2ccc(-c3ccccc3-c3nn[nH]n3)cc2)c1C
CCCCC1=NC2(CCCC2)C(=O)N1c1ccccc1-c1ccc(-c2ncco2)cc1
CCCCC1=NC2(CCCC2)C(=O)N1c1ccc(-c2ccccc2S(=O)(=O)Nc2onc(C)c2C)cc1
Cc1noc(NS(=O)(=O)c2ccccc2-c2ccc(-c3nn[nH]n3)cc2)c1C
c1coc(-c2ccc(-c3ccccc3-c3nn[nH]n3)cc2)n1

which seems right to me. Which version of the RDKit are you using?

-greg

_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

