Hello,

This might be a trivial Python question, but I am stuck on applying
Chem.MolFromSmiles to a whole column at once:

My SMILES strings are in a pandas DataFrame (df), in the SMILES column
(df.SMILES), and I have been calculating the mols (df_mol) with the pandas
apply function as below:
df_mol = df.SMILES.apply( lambda x: Chem.MolFromSmiles(x) )

While this works just fine, I would like to speed up the calculation by
vectorization either in pandas or in numpy as below:
df_mol = Chem.MolFromSmiles(df.SMILES)
df_mol = Chem.MolFromSmiles(df.SMILES.values)

However, I get the following TypeError:
TypeError: No registered converter was able to produce a C++ rvalue of type
std::__1::basic_string<wchar_t, std::__1::char_traits<wchar_t>,
std::__1::allocator<wchar_t> > from this Python object of type Series
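
As far as I can tell, the error means that Chem.MolFromSmiles accepts a
single string per call, so a Series (or array) cannot be passed in directly.
Below is a minimal sketch of an element-wise alternative. The sample SMILES
are made up for illustration, and this still loops in Python, so it is not
true vectorization:

```python
import pandas as pd
from rdkit import Chem

# Hypothetical sample data; the real df comes from elsewhere.
df = pd.DataFrame({"SMILES": ["CCO", "c1ccccc1", "CC(=O)O"]})

# MolFromSmiles takes a single str, so call it once per element.
df_mol = pd.Series([Chem.MolFromSmiles(s) for s in df.SMILES], index=df.index)

# Invalid SMILES come back as None, so it is worth counting them.
n_failed = df_mol.isna().sum()
```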

For context, the goal is to replace the apply call in the snippet below
with a vectorized version:

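from rdkit import Chem
from rdkit.Chem import Descriptors
from rdkit.ML.Descriptors import MoleculeDescriptors
allDescp = [name[0] for name in Descriptors._descList]
for name in allDescp:
    # one single-descriptor calculator per descriptor name
    calc = MoleculeDescriptors.MolecularDescriptorCalculator([name])
    # parse each SMILES and keep the only value in the result tuple
    df[name] = df[smiles_column].apply(
        lambda x: calc.CalcDescriptors(Chem.MolFromSmiles(x))[0])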
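
One thing I noticed is that the loop above re-parses every SMILES once per
descriptor. Here is a rough sketch of a variant that parses each molecule
once and uses one calculator for all descriptor names. The sample data and
column name are made up for illustration, and it assumes every SMILES is
valid:

```python
import pandas as pd
from rdkit import Chem
from rdkit.Chem import Descriptors
from rdkit.ML.Descriptors import MoleculeDescriptors

# Hypothetical sample data and column name, for illustration only.
smiles_column = "SMILES"
df = pd.DataFrame({smiles_column: ["CCO", "c1ccccc1", "CC(=O)O"]})

# Parse each SMILES exactly once (assumes every string is valid).
mols = [Chem.MolFromSmiles(s) for s in df[smiles_column]]

# One calculator for all descriptor names instead of one per name.
names = [name for name, _ in Descriptors._descList]
calc = MoleculeDescriptors.MolecularDescriptorCalculator(names)

# CalcDescriptors returns one tuple of values per molecule, in `names` order.
desc = pd.DataFrame([calc.CalcDescriptors(m) for m in mols],
                    columns=names, index=df.index)
df = pd.concat([df, desc], axis=1)
```

This is still a Python-level loop over molecules, but the parsing cost is
paid once per molecule rather than once per molecule per descriptor.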

Thanks,
Ali