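Hi Carsten,

Your code has two problems:

1. [#4] does not mean "four substitution sites". In SMARTS, [#n] matches an atom by its atomic number, so [#4] searches for beryllium and finds nothing in your core. The attachment points shown as * in SMILES are dummy atoms with atomic number 0, which is what [#0] matches.
2. You didn't set the keyword replaceAll to True. By default, each returned molecule has only one match replaced; with replaceAll=True, all matches found are replaced in a single product.

Therefore, for your code to work properly, replace:

    product_mol = Chem.ReplaceSubstructs(core,Chem.MolFromSmarts('[#4]'),chainMol)

by:

    product_mol = Chem.ReplaceSubstructs(core,Chem.MolFromSmarts('[#0]'),chainMol, replaceAll=True)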
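For reference, here is a minimal sketch of the whole loop with both fixes applied, using the core and chains from your message. The names dummy, n_be, n_dummy and chain_mol are only for illustration, and the SanitizeMol call is an extra precaution I added, not something the Cookbook recipe requires. The two match counts at the top show why '[#4]' left the core unchanged.

```python
from rdkit import Chem

core = Chem.MolFromSmiles(
    '[*]C(C=C1)=CC=C1C(C2=CC=C([*])C=C2)C(C3=CC=C([*])C=C3)C4=CC=C([*])C=C4')
chains = ['C', 'CC', 'CCC', 'CCCC', 'CCCCC', 'CCCCCC']

# [#4] matches beryllium atoms; [#0] matches the dummy atoms written as *.
dummy = Chem.MolFromSmarts('[#0]')
n_be = len(core.GetSubstructMatches(Chem.MolFromSmarts('[#4]')))
n_dummy = len(core.GetSubstructMatches(dummy))
print(n_be, n_dummy)

product_smi = []
for chain in chains:
    chain_mol = Chem.MolFromSmiles(chain)
    # replaceAll=True replaces every dummy atom in one product molecule,
    # so the same R group ends up at all four positions.
    products = Chem.ReplaceSubstructs(core, dummy, chain_mol, replaceAll=True)
    product = products[0]
    # Assumption: sanitize before writing SMILES, as a precaution.
    Chem.SanitizeMol(product)
    product_smi.append(Chem.MolToSmiles(product))
print(product_smi)
```

Each chain is attached through its first atom, because replacementConnectionPoint defaults to 0. For substituents that should connect through another atom, either reorder the SMILES or pass that atom's index as replacementConnectionPoint.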
On Mon, Jul 4, 2022 at 5:58 PM Carsten Bauer <carsten.ba...@bluewin.ch> wrote:

> Hello
>
> I want to enumerate a simple molecule having 4 substituents R with a list
> of ca. 100 SMILES.
> For reasons of simple synthesis, in each enumeration of R, the R should be
> the same in all four positions (no cross permutation).
> There is no reaction that covers all 100 SMILES.
>
> I followed
> https://www.rdkit.org/docs/Cookbook.html#sidechain-core-enumeration and
> modified the code proposed by Earnshaw et al. accordingly:
>
> core = Chem.MolFromSmiles(
> '[*]C(C=C1)=CC=C1C(C2=CC=C([*])C=C2)C(C3=CC=C([*])C=C3)C4=CC=C([*])C=C4')
> chains = ['C','CC','CCC','CCCC','CCCCC','CCCCCC']
> chainMols = [Chem.MolFromSmiles(chain) for chain in chains]
>
> product_smi = []
> for chainMol in chainMols:
>     product_mol = Chem.ReplaceSubstructs(core,Chem.MolFromSmarts('[#4]'),
>         chainMol)
>     product_smi.append(Chem.MolToSmiles(product_mol[0]))
> print(product_smi)
>
> which results in
> ['*c1ccc(C(c2ccc(*)cc2)C(c2ccc(*)cc2)c2ccc(*)cc2)cc1',
> '*c1ccc(C(c2ccc(*)cc2)C(c2ccc(*)cc2)c2ccc(*)cc2)cc1',
> '*c1ccc(C(c2ccc(*)cc2)C(c2ccc(*)cc2)c2ccc(*)cc2)cc1',
> '*c1ccc(C(c2ccc(*)cc2)C(c2ccc(*)cc2)c2ccc(*)cc2)cc1',
> '*c1ccc(C(c2ccc(*)cc2)C(c2ccc(*)cc2)c2ccc(*)cc2)cc1',
> '*c1ccc(C(c2ccc(*)cc2)C(c2ccc(*)cc2)c2ccc(*)cc2)cc1']
>
> This is six times the same compound with no enumeration.
>
> Python beginner here. Can anybody tell me what the mistake is or where I
> can find an example in the literature, please?
>
> Many thanks
> C.
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss

--
Rafael da Fonseca Lameiro
PhD Student - Medicinal and Biological Chemistry Group (NEQUIMED)
São Carlos Institute of Chemistry - University of São Paulo - Brazil
https://orcid.org/0000-0003-4466-2682