Hi Jeff,

 

There are a lot of people here with way more experience than me, so this may
not be the optimal solution... But here is what I would do in this case:

 

from rdkit import Chem, DataStructs
from rdkit.Chem import Draw, PandasTools, Descriptors, rdMolDescriptors
from IPython.display import HTML

def load_sdf_file(file, source, id_column):
    """
    Reads molecules from an SDF file, keeping only the molecules
    that RDKit can parse, and assigns a Source field.
    """
    df = PandasTools.LoadSDF(file)
    df['Source'] = source
    df['ID'] = df[id_column]
    df['SMILES'] = df['ROMol'].apply(Chem.MolToSmiles)
    df['LogP'] = df['ROMol'].apply(Descriptors.MolLogP)
    df['MolWt'] = df['ROMol'].apply(Descriptors.MolWt)
    df['LipinskyHBA'] = df['ROMol'].apply(rdMolDescriptors.CalcNumLipinskiHBA)
    df['LipinskyHBD'] = df['ROMol'].apply(rdMolDescriptors.CalcNumLipinskiHBD)

    df = df[['Source', 'ID', 'SMILES', 'LogP', 'MolWt',
             'LipinskyHBA', 'LipinskyHBD', 'ROMol']]
    return df

df = load_sdf_file("chembl-26_phase-1.sdf", "ChEMBL_Phase-1", "ID")
df.head()  # Should show the top of the DataFrame, with the properties and the structures.
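
To then apply the filter itself, something along these lines might work. This is only a sketch: it assumes the column names produced by load_sdf_file and the usual rule-of-five cutoffs (MolWt <= 500, LogP <= 5, HBA <= 10, HBD <= 5). The helper names and the max_violations default of 1 are my own illustration, not part of RDKit.

```python
def count_lipinski_violations(logp, mol_wt, hba, hbd):
    """Count how many of the four rule-of-five limits are exceeded."""
    return sum([
        bool(mol_wt > 500),  # molecular weight above 500
        bool(logp > 5),      # calculated logP above 5
        bool(hba > 10),      # more than 10 H-bond acceptors
        bool(hbd > 5),       # more than 5 H-bond donors
    ])


def lipinski_filter(df, max_violations=1):
    """Return the rows of df with at most max_violations violations.

    Assumes the columns 'LogP', 'MolWt', 'LipinskyHBA' and 'LipinskyHBD'.
    """
    keep = [
        count_lipinski_violations(logp, mol_wt, hba, hbd) <= max_violations
        for logp, mol_wt, hba, hbd in zip(
            df['LogP'], df['MolWt'], df['LipinskyHBA'], df['LipinskyHBD'])
    ]
    return df.loc[keep]  # boolean mask, one entry per row


# Usage with the DataFrame built above:
# df_ok = lipinski_filter(df)                        # allow one violation
# df_strict = lipinski_filter(df, max_violations=0)  # no violations allowed
```

Whether to tolerate one violation or none is a judgment call, so the threshold is left as a parameter.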

 

 

All the best,

--

Gustavo Seabra

 

-----Original Message-----
From: Jeff Saxon <jmsstarli...@gmail.com> 
Sent: Tuesday, December 1, 2020 7:35 AM
To: rdkit-discuss@lists.sourceforge.net
Subject: [Rdkit-discuss] Applying Lipinsky filter on ligand data set

 

Dear All,

 

I've just started working with RDKit, focusing on applying the Lipinski rule
to my set of ligands. Basically, I take the 3D coordinates of each ligand file
(in SDF format) and then calculate the 4 required properties for it. Here is
my code:

import glob
import os

import pandas as pd
from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem.Descriptors import MolLogP, MolWt
from rdkit.Chem.rdMolDescriptors import CalcNumLipinskiHBA, CalcNumLipinskiHBD

# make a list of all .sdf files present in the data folder (full paths):
dirlist = glob.glob(os.path.join('data', '*.sdf'))

# create an empty DataFrame with 5 columns:
# name of the file, value of p, value of ac, value of don, value of wt
df = pd.DataFrame(columns=["key", "p", "ac", "don", "wt"])

# for each sdf file get its name and calculate 4 different properties: p, ac, don, wt
for sdf in dirlist:
    key = os.path.splitext(os.path.basename(sdf))[0]
    with open(sdf, 'rb') as mol:
        m = Chem.ForwardSDMolSupplier(mol)
        for conf in m:
            if conf is None:
                continue
            p = MolLogP(conf)  # logP (partition coefficient)
            ac = CalcNumLipinskiHBA(conf)  # H-bond acceptors
            don = CalcNumLipinskiHBD(conf)  # H-bond donors
            wt = MolWt(conf)  # molecular weight
            #two=AllChem.Compute2DCoords(conf)
            # `results` is the output folder; its definition is not shown here
            Draw.MolToFile(conf, results + f'/{key}.png')
            #df[key] = [p, ac, don, wt]

 

Could you suggest how I can summarize the calculations for each ligand in a
pandas DataFrame and then apply the Lipinski filter to it?

Is it possible to convert the 3D coordinates to 2D so that I can draw the
structure (at present it makes a sketch based on the 3D coordinates directly
from the SDF)?
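
On the 2D question, a minimal sketch, assuming RDKit's AllChem.Compute2DCoords is what is wanted here: it lays the molecule out in place and returns a conformer ID rather than a new molecule, so one common pattern is to call it on a copy and draw the copy. The helper name to_2d_copy and the output file name are hypothetical.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, Draw


def to_2d_copy(mol):
    """Return a copy of mol with 2D depiction coordinates.

    The input molecule keeps its original (3D) conformer.
    """
    flat = Chem.Mol(mol)           # copy, so the original is left untouched
    AllChem.Compute2DCoords(flat)  # replaces the copy's conformers with a 2D layout
    return flat


# Usage inside the loop over the SDF supplier (hypothetical file name):
# Draw.MolToFile(to_2d_copy(conf), "ligand.png")
```

Working on a copy keeps the 3D coordinates available in case they are needed later in the same loop.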

 

 

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
