I have a set of feature vectors, each with about 40,000 features, associated with binary class labels. I can train a random forest classifier in sklearn, and it works well. However, I would like to see which features are the most important.
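A minimal sketch of one way to rank features by importance, on a small synthetic dataset rather than the real 40,000-feature data. The dataset, the sizes, and the names `importances`, `top_k` and `top_idx` are placeholders chosen for illustration. It assumes that feature_importances_ is computed from the fitted trees each time it is accessed, so the sketch reads it once into a local array instead of indexing it repeatedly:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the real data (sizes are placeholders).
X, y = make_classification(
    n_samples=200, n_features=50, n_informative=5, random_state=0
)

forest = RandomForestClassifier(n_estimators=20, random_state=0)
forest.fit(X, y)

# Read the attribute once and keep the resulting array.
# Assumption: each access recomputes the values from the trees,
# so forest.feature_importances_[i] inside a loop would repeat that work.
importances = forest.feature_importances_

# Indices of the top_k features, most important first.
top_k = 10
top_idx = np.argsort(importances)[::-1][:top_k]

for idx in top_idx:
    print(f"feature {idx}: {importances[idx]:.4f}")
```

If the slowdown comes from repeated attribute access, storing the array first should make the ranking step cheap relative to training. If it does not, the cost lies elsewhere and this sketch will not help.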
I tried simply printing out forest.feature_importances_, but this takes about 1 second per feature, or about 40,000 seconds overall. That is much, much longer than the time needed to train the classifier in the first place. Is there a more efficient way to find out which features are most important?

Raphael

On 21 July 2016 at 15:58, Nelson Liu <[email protected]> wrote:
> Hi,
> If I remember correctly, scikit-learn.org is hosted on GitHub Pages (so the
> maintainers don't have control over downtime and issues like the one you're
> having). Can you connect to GitHub, or any site on GitHub Pages?
>
> Thanks
> Nelson
>
> On Thu, Jul 21, 2016, 07:52 Rahul Ahuja <[email protected]> wrote:
>>
>> Hi there,
>>
>> Sklearn website has been down for couple of days. Please look into it.
>>
>> I reside in Pakistan, Karachi city.
>>
>> Kind regards,
>> Rahul Ahuja

_______________________________________________
scikit-learn mailing list
[email protected]
https://mail.python.org/mailman/listinfo/scikit-learn
