Re: [Rdkit-discuss] back tracking descriptor names from RandomForest feature_importance

2018-08-22 Thread Ali Eftekhari
Hi Shojiro,

This might not be the most elegant and efficient way, but it worked for what I wanted to do. I changed my apply function as below:

allDescp=[name[0] for name in Descriptors._descList]
for name in allDescp:
    temp=MoleculeDescriptors.MolecularDescriptorCalculator([name])
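For readers following the thread, here is a minimal sketch of one way to keep descriptor names attached to their values. It is a variant, not necessarily what Ali's apply function did: it builds a single calculator over all names rather than one calculator per name. The helper `describe` and the example SMILES are assumptions made for illustration.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors
from rdkit.ML.Descriptors import MoleculeDescriptors

# Descriptor names in the order RDKit lists them; this order defines the feature columns.
desc_names = [name for name, _ in Descriptors._descList]
calc = MoleculeDescriptors.MolecularDescriptorCalculator(desc_names)


def describe(smiles):
    """Hypothetical helper: return {descriptor name: value}, or None if the SMILES does not parse."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    # CalcDescriptors returns values in the same order as desc_names.
    return dict(zip(desc_names, calc.CalcDescriptors(mol)))


row = describe("CCO")  # ethanol, used only as an illustrative input
```

Because every value is keyed by its descriptor name, a DataFrame built from such rows carries the names as column labels, which is what makes it possible to trace an importance score back to a descriptor later.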

Re: [Rdkit-discuss] back tracking descriptor names from RandomForest feature_importance

2018-08-21 Thread Ali Eftekhari
Hi Shojiro,

Thanks for your response, but print(np.argsort(rfregress.feature_importances_)[::-1]) returns the row indices, whereas what I want is the column names, so that I can tell which features are important.

On Mon, Aug 20, 2018 at 9:31 PM Shojiro Shibayama wrote:
> Dear Ali,
>
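A minimal sketch of getting names rather than positions: pair each column name with its importance and sort the pairs. The names and numbers below are hypothetical stand-ins, and the approach assumes the name list is in the same order as the columns the model was trained on.

```python
# Hypothetical column names and importances (illustrative values, not from the thread).
feature_names = ["MolWt", "MolLogP", "TPSA", "NumHDonors"]
importances = [0.10, 0.55, 0.05, 0.30]  # stands in for rfregress.feature_importances_

# Pair each name with its importance, then sort by importance, largest first.
ranked = sorted(zip(feature_names, importances), key=lambda pair: pair[1], reverse=True)

for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

If the training matrix was built from a DataFrame, its column labels (minus the non-feature columns) could serve as `feature_names`, provided the column order was not changed before fitting.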

Re: [Rdkit-discuss] back tracking descriptor names from RandomForest feature_importance

2018-08-20 Thread Shojiro Shibayama
Dear Ali,

Please first run the following code, which may help you:

```python
import numpy as np
np.argsort(rfregress.feature_importances_)[::-1]
```

`argsort` returns the indexes of the features sorted by importance in ascending order, and `[::-1]` reverses that order so the most important feature comes first. The indexes for feature
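To make the behaviour of `argsort` and `[::-1]` concrete, here is a small sketch on a made-up importance array. The values are assumptions chosen for illustration and do not come from any model in the thread.

```python
import numpy as np

# Hypothetical importances for four features (illustrative values only).
importances = np.array([0.10, 0.55, 0.05, 0.30])

ascending = np.argsort(importances)  # positions from least to most important
descending = ascending[::-1]         # reversed: most important first
top_two = descending[:2]             # positions of the two most important features
```

The results are positions into the feature axis of the training matrix, so they only become meaningful once they are matched against a list of names kept in the same order.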

[Rdkit-discuss] back tracking descriptor names from RandomForest feature_importance

2018-08-20 Thread Ali Eftekhari
Hello rdkit,

This might be trivial, but I am a beginner and don't know how to do it. I am building a simple model to predict a target property. I have a pandas DataFrame (df) whose columns are 'SMILES' and 'Target'.

#calculating the descriptors as below:
llDescp=[name[0] for name in