Hello,

This might be a trivial Python question, but I am stuck on computing mols with Chem.MolFromSmiles for a whole column at once:
My SMILES strings are in a pandas DataFrame (df) in a SMILES column (df.SMILES), and I have been calculating the mols (df_mol) with the pandas apply function as below:

    df_mol = df.SMILES.apply(lambda x: Chem.MolFromSmiles(x))

While this works just fine, I would like to speed up the calculation by vectorizing it, either in pandas or in NumPy, as below:

    df_mol = Chem.MolFromSmiles(df.SMILES)
    df_mol = Chem.MolFromSmiles(df.SMILES.values)

However, I am getting the following TypeError:

    TypeError: No registered converter was able to produce a C++ rvalue of type std::__1::basic_string<wchar_t, std::__1::char_traits<wchar_t>, std::__1::allocator<wchar_t> > from this Python object of type Series

For further context, the goal is to replace the apply call in the snippet below with a vectorized version:

    allDescp = [name[0] for name in Descriptors._descList]
    for name in allDescp:
        _ = MoleculeDescriptors.MolecularDescriptorCalculator([name])
        df[name] = df[smiles_column].apply(lambda x: _.CalcDescriptors(Chem.MolFromSmiles(x))[0])

Thanks,
Ali
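
Note on the error: the message says the wrapper could not convert a Series into a single C++ string, which suggests Chem.MolFromSmiles takes one SMILES string per call and has to be applied element by element. Below is a minimal sketch of that element-wise pattern, assuming only that the parser is a callable taking one string. The helper name parse_each and the stand-in fake_parse are hypothetical and are not part of RDKit or pandas. The sketch does not vectorize anything: the loop still runs in Python, so any saving comes only from skipping repeated strings.

```python
from typing import Callable, Dict, Iterable, List, TypeVar

T = TypeVar("T")


def parse_each(smiles: Iterable[str], parse: Callable[[str], T]) -> List[T]:
    """Call a one-string-at-a-time parser on every SMILES, in order.

    Each distinct string is parsed only once. Repeated strings therefore
    share the same result object, which matters if results are mutated later.
    """
    cache: Dict[str, T] = {}
    out: List[T] = []
    for s in smiles:
        if s not in cache:
            cache[s] = parse(s)  # one call per distinct string
        out.append(cache[s])
    return out


# Stand-in parser (hypothetical) so the sketch runs without RDKit installed.
calls: List[str] = []


def fake_parse(s: str):
    calls.append(s)
    return ("mol", s)


result = parse_each(["CCO", "c1ccccc1", "CCO"], fake_parse)

# With RDKit and pandas available, the intended use would presumably be:
#     df_mol = parse_each(df.SMILES, Chem.MolFromSmiles)
```

Passing the parser in as an argument keeps the helper independent of RDKit. Whether the caching helps depends on how many duplicate SMILES the column contains.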
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss