Dear Ali,

Please try running the following code first; it may help:

```python
import numpy as np
np.argsort(rfregress.feature_importances_)[::-1]
```

`argsort` returns the indices that would sort the importances in
ascending order, and `[::-1]` reverses them so that the most important
feature comes first.
These indices correspond to the column order of your feature matrix
(that is, the order of the names in `allDescp` in your code), so you can
use them to look up the descriptor names and get the information you want.
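
As a rough sketch of that lookup: the importance values and the four descriptor names below are made-up stand-ins, chosen only so the snippet runs on its own. In your code you would use `rfregress.feature_importances_` and `allDescp` instead.

```python
import numpy as np

# Hypothetical stand-ins for rfregress.feature_importances_ and allDescp.
importances = np.array([0.10, 0.55, 0.05, 0.30])
descriptor_names = ["MolWt", "MolLogP", "NumHDonors", "TPSA"]

# Indices of the features, most important first.
order = np.argsort(importances)[::-1]

# Pair each descriptor name with its importance, in ranked order.
ranked = [(descriptor_names[i], float(importances[i])) for i in order]

for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

This only works if the names are in the same order as the columns of the array that the model was trained on.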

Sincerely yours,
Shojiro


On Tue, 21 Aug 2018 at 10:34, Ali Eftekhari <a.b.eftekh...@gmail.com> wrote:

> Hello rdkit,
>
> This might be trivial, but I am a beginner and don't know how to do it.
>
> I am building a simple model to predict a target property.  I have a pandas
> dataframe (df) whose columns are 'SMILES' and 'Target'.
>
> #calculating the descriptors as below:
> allDescp=[name[0] for name in Descriptors._descList]
> calc=MoleculeDescriptors.MolecularDescriptorCalculator(allDescp)
> df ['fp']=df['SMILES'].apply(lambda x:
> calc.CalcDescriptors(Chem.MolFromSmiles(x)))
>
> #converting  the fingerprint to numpy array
> y=df['Target'].values
> X=np.array(list(df['fp']))
>
> #preprocessing
> X_train, X_test, y_train, y_test=train_test_split(X, y, test_size=0.25,
> random_state=42)
> st=StandardScaler()
> X=st.fit_transform(X)
>
> #random forest model
> model=RandomForestRegressor(n_estimators=10)
> model.fit(X_train, y_train)
>
> My problem is that I don't know how to get meaningful
> feature importances.  The following returns the importance values,
> but there are no labels, so I don't know how to figure out which features
> are important.
>
> print (sorted (rfregress.feature_importances_))
>
> Thanks for your help!
>
>
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>


-- 
----
The University of Tokyo
2nd year Ph.D. candidate
  Shojiro Shibayama
----