I have a set of feature vectors, each with about 40,000 features,
associated with binary class labels. I can train a random forest
classifier in sklearn, and it works well. I would, however, like to
see which features are most important.

I tried simply printing out forest.feature_importances_, but this takes
about 1 second per feature, or about 40,000 seconds overall. That is
much longer than the time needed to train the classifier in the first
place.

Is there a more efficient way to find out which features are most important?

Raphael
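
One possible explanation, offered as a guess because the original loop is
not shown: feature_importances_ is a computed property, so every access
re-averages the importances over all trees. Indexing it inside a
per-feature loop would then repeat that work 40,000 times. A minimal
sketch of reading it once and ranking with NumPy follows; the synthetic
data, its sizes, and the names importances, top_k and top_idx are
illustrative assumptions, not taken from the original post.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the real data (sizes are illustrative assumptions).
X, y = make_classification(
    n_samples=200, n_features=500, n_informative=10, random_state=0
)

forest = RandomForestClassifier(n_estimators=20, random_state=0)
forest.fit(X, y)

# Read the property ONCE and keep the resulting array; accessing
# forest.feature_importances_ again would recompute it over all trees.
importances = forest.feature_importances_

# Indices of the top_k features, most important first.
top_k = 10
top_idx = np.argsort(importances)[::-1][:top_k]

for i in top_idx:
    print(f"feature {i}: {importances[i]:.5f}")
```

With the array cached, the ranking itself is a single sort over 40,000
values, which should be negligible next to training time.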

On 21 July 2016 at 15:58, Nelson Liu <[email protected]> wrote:
> Hi,
> If I remember correctly, scikit-learn.org is hosted on GitHub Pages (so the
> maintainers don't have control over downtime and issues like the one you're
> having). Can you connect to GitHub, or any site on GitHub Pages?
>
> Thanks
> Nelson
>
> On Thu, Jul 21, 2016, 07:52 Rahul Ahuja <[email protected]> wrote:
>>
>> Hi there,
>>
>>
>> Sklearn website has been down for couple of days. Please look into it.
>>
>>
>> I reside in Pakistan, Karachi city.
>>
>> Kind regards,
>> Rahul Ahuja
>> _______________________________________________
>> scikit-learn mailing list
>> [email protected]
>> https://mail.python.org/mailman/listinfo/scikit-learn
