Hi Pat,

Do you have a small example file I can work with, or can I use esol.csv as an example?
Thanks,
Guillaume

From: Patrick Walters <wpwalt...@gmail.com>
Date: Monday, 22 March 2021 at 13:51
To: rdkit-discuss <rdkit-discuss@lists.sourceforge.net>
Subject: [*External*] Re: [Rdkit-discuss] Using the RDKit with Dask

Apologies, there was a bug in the code I sent in my previous message. The problem is the same. Here is the corrected code in a gist.
https://gist.github.com/PatWalters/ca41289a6990ebf7af1e5c44e188fccd

On Mon, Mar 22, 2021 at 8:16 AM Patrick Walters <wpwalt...@gmail.com> wrote:

Hi All,

I've been trying to calculate BCUT2D descriptors in parallel with Dask and get this error with the code below.

TypeError: cannot pickle 'Boost.Python.function' object

Everything works if I call mw_df, which calculates molecular weight, but I get the error above if I call bcut_df. Does anyone have a workaround?

Thanks,
Pat

#!/usr/bin/env python

import sys
import dask.dataframe as dd
import pandas as pd
from rdkit import Chem
from rdkit.Chem.Descriptors import MolWt
from rdkit.Chem.rdMolDescriptors import BCUT2D
import time

# -- molecular weight functions
def calc_mw(smi):
    mol = Chem.MolFromSmiles(smi)
    return MolWt(mol)

def mw_df(df):
    return df.SMILES.apply(calc_mw)

# -- bcut functions
def bcut_df(df):
    return df.apply(calc_bcut)

def calc_bcut(smi):
    mol = Chem.MolFromSmiles(smi)
    return BCUT2D(mol)

def main():
    start = time.time()
    df = pd.read_csv(sys.argv[1],sep=" ",names=["SMILES","Name"])
    ddf = dd.from_pandas(df,npartitions=16)
    ddf['MW'] = ddf.map_partitions(mw_df,meta='float').compute(scheduler='processes')
    ddf['BCUT'] = ddf.map_partitions(bcut_df,meta='float').compute(scheduler='processes')
    print(time.time()-start)
    print(ddf.head())

if __name__ == "__main__":
    main()
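
One possible workaround for the pickling error quoted above, offered as an untested sketch and not as a confirmed fix: it assumes the error arises because the functions defined in the script are serialized together with the module-level names they reference, which would include the Boost.Python-wrapped BCUT2D. Importing BCUT2D inside the function keeps it out of the function's globals. The None check for unparsable SMILES and the use of df.SMILES.apply (mirroring mw_df) are additions that are not in the original script.

```python
def calc_bcut(smi):
    # Deferred imports: resolved inside the worker process when the function
    # runs, so no Boost.Python object sits in this function's globals.
    # (Assumption: this is what triggers the pickling error.)
    from rdkit import Chem
    from rdkit.Chem.rdMolDescriptors import BCUT2D

    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        # Unparsable SMILES: return None instead of raising (added here).
        return None
    return BCUT2D(mol)


def bcut_df(df):
    # Apply to the SMILES column only, mirroring mw_df in the original script.
    return df.SMILES.apply(calc_bcut)
```

The map_partitions call from the original script should be usable unchanged with these two functions. Since BCUT2D returns a list of values per molecule and not a single float, meta='float' may need adjusting (for example to an object dtype).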