Thanks, Rafael.

Best,
Ming

On Mon, May 16, 2022 at 2:27 AM Rafael L <rafael.lame...@usp.br> wrote:

> To find all compounds that match the ester substructure, you can use
> GetSubstructMatches. I would do the following (I'm assuming you have all
> your structures in a pandas DataFrame and have already converted the
> SMILES to RDKit Mol objects):
>
> from rdkit import Chem
>
> ester_pattern = Chem.MolFromSmarts("COC(C)=O")
>
> # df is a pandas DataFrame with a column containing your structures as
> # RDKit Mol objects
> df["is_ester"] = df["rdkit_mol"].apply(
>     lambda x: bool(x.GetSubstructMatches(ester_pattern)))
>
> This will give you a column of True/False values that you can use as a
> mask. Of course, there are other ways to do this, like using a for loop.
>
> Now, to replace the esters with the disulfide group, if you can't manage
> to work with reaction SMARTS, you could try using the Python string
> replace() method on the SMILES. I believe an ester can be written in two
> ways (left to right and right to left), so keep that in mind. You can
> always use GetSubstructMatches afterwards to check whether any ester was
> left behind.
>
> Regards.
>
> On Sun, May 15, 2022 at 1:33 AM Ming Hao <haom.ni...@gmail.com> wrote:
>
>> Hi All,
>>
>> I want to replace the ester structure ('COC(C)=O') with a disulfide
>> ('CSSC').
>>
>> [image: image.png]
>>
>> Here is what I did, but it does not work. It seems that a specific
>> method is needed to replace the original structure with the new one,
>> rather than just supplying the individual SMILES.
>>
>> ##############################################################
>> from rdkit import Chem
>> from rdkit.Chem import AllChem, Draw
>> from rdkit.Chem.Draw import IPythonConsole
>>
>> orgsmi = 'CCOC(=O)CCCCCN(CC)CCCCCCCC(=O)OC(C)CC'
>> m = Chem.MolFromSmiles(orgsmi)
>> m
>>
>> pat = Chem.MolFromSmiles('COC(C)=O')
>> pat
>>
>> rep = Chem.MolFromSmiles('CSSC')
>> rep
>>
>> new = AllChem.ReplaceSubstructs(m, pat, rep)
>> new[0]  # The structure was separated
>> new[1]  # The structure was separated
>> len(new)
>> #################################################################
>>
>> Can you help me with this? By the way, I have 10K structures. First I
>> need to find the compounds that contain the pattern (ester, COC(C)=O),
>> and then replace it with the disulfide ('CSSC'). What is a good way to
>> do this?
>>
>> Thanks.
>> Ming
>> _______________________________________________
>> Rdkit-discuss mailing list
>> Rdkit-discuss@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>>
>
>
> --
> *Rafael da Fonseca Lameiro*
> PhD Student - Medicinal and Biological Chemistry Group (NEQUIMED)
> São Carlos Institute of Chemistry - University of São Paulo - Brazil
> https://orcid.org/0000-0003-4466-2682
>
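
For reference, the reaction SMARTS route mentioned above could look roughly
like the sketch below. It is not taken from the thread, only one possible way
to do it. Assumptions: "ester" means exactly the C-O-C(=O)-C motif from the
original question with aliphatic carbons on both sides; the names
replace_esters, convert_smiles and max_steps are made up for illustration. As
far as I understand, ReplaceSubstructs reattaches the replacement through a
single connection point, which would explain why a pattern bonded to the rest
of the molecule at two places comes out fragmented. A reaction keeps both
flanking carbons (the mapped atoms) and rebuilds what sits between them.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Assumed ester definition: aliphatic C-O-C(=O)-C, as in the original question.
ESTER = Chem.MolFromSmarts("[C:1]-O-C(=O)-[C:2]")

# Mapped atoms (:1 and :2) are kept. The unmapped O, C and =O of the ester are
# deleted, and two new S atoms are inserted between the flanking carbons.
ESTER_TO_DISULFIDE = AllChem.ReactionFromSmarts(
    "[C:1]-O-C(=O)-[C:2]>>[C:1]-S-S-[C:2]"
)


def replace_esters(mol, max_steps=20):
    """Return a copy of mol with each matched ester rewritten as a disulfide.

    RunReactants transforms one match per product, so the reaction is
    re-applied until no ester is left (max_steps is only a safety cap).
    """
    mol = Chem.Mol(mol)  # work on a copy, leave the input untouched
    for _ in range(max_steps):
        if not mol.HasSubstructMatch(ESTER):
            break
        products = ESTER_TO_DISULFIDE.RunReactants((mol,))
        if not products:
            break
        mol = products[0][0]    # first product set, single product
        Chem.SanitizeMol(mol)   # reaction products come back unsanitized
    return mol


def convert_smiles(smiles_list):
    """Plain for-loop version for a list of SMILES; None marks parse failures."""
    converted = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            converted.append(None)
            continue
        converted.append(Chem.MolToSmiles(replace_esters(mol)))
    return converted


orgsmi = "CCOC(=O)CCCCCN(CC)CCCCCCCC(=O)OC(C)CC"
new_mol = replace_esters(Chem.MolFromSmiles(orgsmi))
print(Chem.MolToSmiles(new_mol))
```

For the molecule from the original question, both esters should be converted
and the result should stay in one piece, corresponding to
CCSSCCCCCN(CC)CCCCCCCSSC(C)CC. Molecules without an ester pass through
unchanged (apart from SMILES canonicalization), so the same loop can be run
over all 10K structures without filtering first. It is still worth checking
with HasSubstructMatch afterwards that no ester was left behind.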