The problem was that I had a loop like

    for i in xrange(len(clf.feature_importances_)):
        print clf.feature_importances_[i]

which recomputes the feature importance array in every step. Obvious in hindsight.

Raphael

On 21 July 2016 at 16:22, Raphael C <drr...@gmail.com> wrote:
> I have a set of feature vectors associated with binary class labels,
> each of which has about 40,000 features. I can train a random forest
> classifier in sklearn which works well. I would however like to see
> the most important features.
>
> I tried simply printing out forest.feature_importances_ but this takes
> about 1 second per feature making about 40,000 seconds overall. This
> is much much longer than the time needed to train the classifier in
> the first place?
>
> Is there a more efficient way to find out which features are most important?
>
> Raphael
>
> On 21 July 2016 at 15:58, Nelson Liu <nf...@uw.edu> wrote:
>> Hi,
>> If I remember correctly, scikit-learn.org is hosted on GitHub Pages (so the
>> maintainers don't have control over downtime and issues like the one you're
>> having). Can you connect to GitHub, or any site on GitHub Pages?
>>
>> Thanks
>> Nelson
>>
>> On Thu, Jul 21, 2016, 07:52 Rahul Ahuja <rahul.ah...@live.com> wrote:
>>>
>>> Hi there,
>>>
>>> Sklearn website has been down for couple of days. Please look into it.
>>>
>>> I reside in Pakistan, Karachi city.
>>>
>>> Kind regards,
>>> Rahul Ahuja
>>> _______________________________________________
>>> scikit-learn mailing list
>>> scikit-learn@python.org
>>> https://mail.python.org/mailman/listinfo/scikit-learn
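
For anyone hitting the same slowdown, the effect described in the reply at the top of this message can be reproduced in miniature. The sketch below is written for Python 3 and does not use scikit-learn: ToyForest is a hypothetical stand-in, and the names n_computations, slow_calls and fast_calls are made up for illustration. It assumes only that feature_importances_ behaves like a property that is recomputed on each access, as reported above.

```python
class ToyForest:
    """Hypothetical stand-in for a fitted forest; not scikit-learn code."""

    def __init__(self, per_tree_importances):
        self._per_tree = per_tree_importances  # one list of importances per tree
        self.n_computations = 0                # counts full recomputations

    @property
    def feature_importances_(self):
        # Every attribute access lands here and averages over all trees again.
        self.n_computations += 1
        n_trees = len(self._per_tree)
        return [sum(col) / n_trees for col in zip(*self._per_tree)]


clf = ToyForest([[0.1, 0.6, 0.3], [0.1, 0.4, 0.5]])

# Slow pattern: the property is evaluated once for len(...) and once per iteration.
slow = [clf.feature_importances_[i] for i in range(len(clf.feature_importances_))]
slow_calls = clf.n_computations

# Fast pattern: evaluate the property once and reuse the stored result.
clf.n_computations = 0
importances = clf.feature_importances_
fast = [importances[i] for i in range(len(importances))]
fast_calls = clf.n_computations

# Feature indices ordered from most to least important.
ranking = sorted(range(len(importances)), key=lambda i: importances[i], reverse=True)
```

With a real fitted forest the same fix would presumably be to assign clf.feature_importances_ to a local variable once and loop over (or sort) that variable, so the cost is paid a single time rather than once per feature.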