Hi Jeff,
There are a lot of people here with way more experience than me, so this may not be the optimal solution... But here is what I would do in this case:

from rdkit import Chem, DataStructs
from rdkit.Chem import Draw, PandasTools, Descriptors, rdMolDescriptors
from IPython.display import HTML

def load_sdf_file(file, source, id_column):
    """
    Reads molecules from an SDF file, keeping only molecules with valid
    SMILES, and assigns a source field.
    """
    df = PandasTools.LoadSDF(file)
    df['Source'] = source
    df['ID'] = df[id_column]
    df['SMILES'] = df['ROMol'].apply(Chem.MolToSmiles)
    df['LogP'] = df['ROMol'].apply(Descriptors.MolLogP)
    df['MolWt'] = df['ROMol'].apply(Descriptors.MolWt)
    df['LipinskyHBA'] = df['ROMol'].apply(rdMolDescriptors.CalcNumLipinskiHBA)
    df['LipinskyHBD'] = df['ROMol'].apply(rdMolDescriptors.CalcNumLipinskiHBD)
    # Keep only the columns of interest, in a fixed order
    df = df[['Source', 'ID', 'SMILES', 'LogP', 'MolWt', 'LipinskyHBA', 'LipinskyHBD', 'ROMol']]
    return df

df = load_sdf_file("chembl-26_phase-1.sdf", "ChEMBL_Phase-1", "ID")
df.head()  # Should show the top of the DataFrame, with the properties and the structures.

All the best,
--
Gustavo Seabra

-----Original Message-----
From: Jeff Saxon <jmsstarli...@gmail.com>
Sent: Tuesday, December 1, 2020 7:35 AM
To: rdkit-discuss@lists.sourceforge.net
Subject: [Rdkit-discuss] Applying Lipinsky filter on ligand data set

Dear All,

I've just started working with RDKit, focusing on applying the Lipinski rule to my set of ligands.
Basically, I take the 3D coordinates of each ligand file (in SDF format) and then calculate the 4 required properties for it. Here is my code:

# make a list of all .sdf files present in the data folder:
dirlist = [os.path.basename(p) for p in glob.glob('data' + '/*.sdf')]

# create an empty DataFrame with 5 columns:
# name of the file, value of variable p, value of ac, value of don, value of wt
df = pd.DataFrame(columns=["key", "p", "ac", "don", "wt"])

# for each sdf file get its name and calculate 4 different properties: p, ac, don, wt
for sdf in dirlist:
    sdf_name = sdf.rsplit(".", 1)[0]
    key = f'{sdf_name}'
    mol = open(sdf, 'rb')
    m = Chem.ForwardSDMolSupplier(mol)
    for conf in m:
        if conf is None: continue
        p = MolLogP(conf)  # coeff conc-perm
        ac = CalcNumLipinskiHBA(conf)
        don = CalcNumLipinskiHBD(conf)
        wt = MolWt(conf)
        #two=AllChem.Compute2DCoords(conf)
        Draw.MolToFile(conf, results + f'/{key}.png')
        #df[key] = [p, ac, don, wt]

Could you suggest how I can summarize the calculation for each ligand in a pandas-like DataFrame and then apply the Lipinski filter to it? Is it possible to convert the 3D coordinates to 2D so that I can draw it (presently it makes a sketch based on the 3D coordinates directly from the SDF)?
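For the filtering step asked about in the quoted message, here is a minimal sketch rather than a definitive implementation. It assumes the column names produced by load_sdf_file in the reply (LogP, MolWt, LipinskyHBA, LipinskyHBD) and the commonly cited rule-of-five thresholds (MolWt <= 500, LogP <= 5, acceptors <= 10, donors <= 5). The helper names lipinski_violations and passes_lipinski, the max_violations parameter, and the example rows are hypothetical; the numbers are made up for illustration, not computed from real ligands.

```python
def lipinski_violations(logp, mol_wt, hba, hbd):
    """Count how many of the four rule-of-five criteria are violated."""
    checks = [
        mol_wt > 500,  # molecular weight above 500 Da
        logp > 5,      # logP above 5
        hba > 10,      # more than 10 hydrogen-bond acceptors
        hbd > 5,       # more than 5 hydrogen-bond donors
    ]
    return sum(bool(c) for c in checks)


def passes_lipinski(row, max_violations=0):
    """Return True if a row (dict or pandas Series) has at most max_violations.

    max_violations is a hypothetical knob: 0 is strict, 1 tolerates one miss.
    """
    n = lipinski_violations(
        row["LogP"], row["MolWt"], row["LipinskyHBA"], row["LipinskyHBD"]
    )
    return n <= max_violations


# Hypothetical rows with made-up descriptor values, for illustration only.
rows = [
    {"ID": "lig_a", "LogP": 2.1, "MolWt": 310.4, "LipinskyHBA": 5, "LipinskyHBD": 2},
    {"ID": "lig_b", "LogP": 6.3, "MolWt": 612.8, "LipinskyHBA": 12, "LipinskyHBD": 3},
    {"ID": "lig_c", "LogP": 5.4, "MolWt": 480.0, "LipinskyHBA": 7, "LipinskyHBD": 1},
]

strict = [r["ID"] for r in rows if passes_lipinski(r)]
lenient = [r["ID"] for r in rows if passes_lipinski(r, max_violations=1)]
print(strict)
print(lenient)
```

With the DataFrame returned by load_sdf_file, df[df.apply(passes_lipinski, axis=1)] should keep only the passing ligands, since each row can be indexed by column name just like the dicts here. On the 2D question: AllChem.Compute2DCoords(mol) modifies the molecule in place and returns a conformer id rather than a new molecule, so calling it on its own line before Draw.MolToFile should give a flat depiction.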
_______________________________________________ Rdkit-discuss mailing list Rdkit-discuss@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/rdkit-discuss