Hello everyone,

I have been using RDKit for a while now. My focus is on transforming molecules into simplified, reduced forms. Using SMARTS, I specify molecular substructures and patterns and replace these parts with pseudoatoms. Afterwards I would like to compute the maximum common substructure (MCS) of the reduced graphs. This is done to compare molecules and their MCS in small datasets of molecules.

SMILES string: CCC(C)C(C(=O)O)N
Isoleucine reduced form: [Zn][Zn][Zn]([Zn])[Zn]([Nb])[Mo]

SMILES string: CC(C)CC(C(=O)O)N
Leucine reduced form: [Zn][Zn]([Zn])[Zn][Zn]([Nb])[Mo]

I would like to know how to group a chain Zn-Zn-Zn-... together as a single -Zn- atom in the reduced graph, because these linker atoms (Zn) are just carbon atoms that can be compressed together. The number of linkers plays no role in the reduced form, and it gives false results when reduced graphs are compared with one another and they have different numbers of adjacent carbon atoms.

I started simplifying my molecules into reduced graphs with the following code:

#---------------------------------------------------
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem import Draw

print("\nModules imported successfully\n")

def molecule(smiles):
    mol = Chem.MolFromSmiles(smiles)
    Draw.MolToFile(mol, 'pictures/molecule.png')

    # One SMARTS pattern per line
    with open('pseudo_negative_ionizable', 'r') as smarts_file:
        lines = smarts_file.readlines()

    for line in lines:
        repl = Chem.MolFromSmarts(line.strip())
        pseudo = Chem.MolFromSmarts('[Mo]')
        # Replace all matches of the pattern with the pseudoatom
        mol_new = AllChem.ReplaceSubstructs(mol, repl, pseudo, True)
        mol_new_smi = Chem.MolToSmiles(mol_new[0])
        print(mol_new_smi)
        mol = Chem.MolFromSmiles(mol_new_smi)

    # ..... definition of every pseudoatom in SMARTS semantics
#---------------------------------------------------

Now, instead of many -Zn-Zn- atoms, I want only a single -Zn- atom in my result.

Another problem occurs when I transform bigger molecules that contain ring systems into reduced forms: the RDKit functions split these molecules, and the output is a molecule separated by dots. Is there any way to avoid this and keep the molecule (reduced form) together?
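
One way the Zn merging described above could be approached is as a graph contraction: every group of directly bonded Zn atoms is fused into one node, and each bond leaving the group is re-attached to that node. The following is only a minimal sketch in plain Python, not an RDKit function. The name collapse_linkers and the (symbols, bonds) input format are assumptions made for illustration; the symbols and index pairs could be read from an RDKit molecule with atom.GetSymbol(), bond.GetBeginAtomIdx() and bond.GetEndAtomIdx().

```python
def collapse_linkers(symbols, bonds, linker="Zn"):
    """Merge every connected group of `linker` atoms into a single node.

    symbols: list of element symbols, indexed by atom index (assumed format)
    bonds:   list of (i, j) atom-index pairs (assumed format)
    Returns (new_symbols, new_bonds) for the contracted graph.
    """
    parent = list(range(len(symbols)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Join the two ends of every linker-linker bond into one group.
    for i, j in bonds:
        if symbols[i] == linker and symbols[j] == linker:
            parent[find(i)] = find(j)

    # Assign one new index per group (non-linker atoms are their own group).
    new_index = {}
    new_symbols = []
    for i, sym in enumerate(symbols):
        root = find(i)
        if root not in new_index:
            new_index[root] = len(new_symbols)
            new_symbols.append(sym)

    # Re-map the bonds, dropping bonds inside a group and duplicates.
    new_bonds = set()
    for i, j in bonds:
        a, b = new_index[find(i)], new_index[find(j)]
        if a != b:
            new_bonds.add((min(a, b), max(a, b)))
    return new_symbols, sorted(new_bonds)


# Reduced isoleucine: [Zn][Zn][Zn]([Zn])[Zn]([Nb])[Mo]
iso = collapse_linkers(["Zn"] * 5 + ["Nb", "Mo"],
                       [(0, 1), (1, 2), (2, 3), (2, 4), (4, 5), (4, 6)])
# Reduced leucine: [Zn][Zn]([Zn])[Zn][Zn]([Nb])[Mo]
leu = collapse_linkers(["Zn"] * 5 + ["Nb", "Mo"],
                       [(0, 1), (1, 2), (1, 3), (3, 4), (4, 5), (4, 6)])
print(iso)  # (['Zn', 'Nb', 'Mo'], [(0, 1), (0, 2)])
print(leu)  # (['Zn', 'Nb', 'Mo'], [(0, 1), (0, 2)])
```

In this sketch both reduced graphs end up as a single Zn bonded to Nb and to Mo, so the linker length no longer matters. Bond orders are ignored, and a ring made only of Zn atoms would also shrink to one node. The same idea may be applicable to ring systems, provided all bonds leaving the matched atoms are re-attached to the pseudoatom.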

Example of a split result: O=C(Nc1ccc(-n2ccccc2=O)cn1)C1CC(O)(c2ccccc2Cl)CN1C(=O)Nc1ccc(Cl)cc1 --> reduced form: [Sc].[V].[Co].[Zn].[Zn].[Sc][Co][Ni].[V][Co][Ni][Hf]

I am very much looking forward to your help.

Thanks & regards,
Jessica

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss