Is your aim to use this information for feature selection, or do you
actually want to see which features carry the most weight? There's an
SO question that addresses the latter use:
http://stackoverflow.com/questions/6697/how-to-get-most-informative-features-for-scikit-learn-classifiers
Some estimators (several of the tree-based ones, e.g. RandomForestClassifier
and GradientBoostingRegressor) have a clf.feature_importances_ attribute, which
can be plotted to show the relative importance of each feature. sklearn.ensemble
also has a module called partial_dependence, which has a function
plot_partial_depende
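
As a rough illustration of the feature_importances_ attribute mentioned
above, here is a minimal sketch. The synthetic dataset, the choice of
RandomForestClassifier, and all parameter values are assumptions made for
the example, not something taken from this thread.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption for illustration): 8 features, 3 informative.
# With shuffle=False the informative features are columns 0, 1 and 2.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=3, n_redundant=0,
    shuffle=False, random_state=0,
)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# feature_importances_ is normalized to sum to 1; higher means more important.
importances = clf.feature_importances_

# Feature indices sorted from most to least important.
ranking = sorted(range(len(importances)),
                 key=lambda i: importances[i], reverse=True)
for i in ranking:
    print("feature %d: %.3f" % (i, importances[i]))
```

The same values can be passed to a bar plot instead of being printed, if a
visual comparison is preferred.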
Dear all,
I was wondering whether there is a sensitivity analyzer inside scikit-learn? I
saw recursive feature elimination, but I would like to see which features are
the most important for classification in my data.
All the best,
Arman
--
2014/1/23 Felipe Eltermann :
I'm testing different classifiers for a BoW problem and last week I got
disappointed that I couldn't use scikit's DecisionTree.
However, using NaiveBayes was awesome! Thanks for this great piece of
software.
So, if you are planning to add the support for scipy sparse matrix on
DecisionTree, I'd lik
> How much code in our current implementation depends on the data
> representation?
Not much, actually. It now basically boils down to writing a new
splitter object; everything else remains the same. I would
say that it amounts to ~300 lines of Cython (out of the 2300 lines in our
> I will try using sparse data on 20newsgroups data and let you know the
> results.
What I was suggesting is to densify the News20 dataset (using a subset of
the features so that it fits in memory) and try it on our current
implementation. Of course it will be really slow but the goal is to
evaluate
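
A minimal sketch of the densify-a-subset idea described above, under stated
assumptions: a random sparse matrix stands in for News20, and chi2-based
SelectKBest is just one possible way to pick the feature subset (the message
does not say which features to keep). The sizes and k=100 are arbitrary.

```python
import numpy as np
from scipy import sparse
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)

# Stand-in for a bag-of-words matrix (an assumption; not News20 itself):
# 200 documents, 2000 non-negative features, about 5% non-zero entries.
X_sparse = sparse.random(200, 2000, density=0.05, format="csr",
                         random_state=rng)
y = rng.randint(0, 2, size=200)  # random binary labels, for illustration

# Keep a subset of the features so that the dense copy fits in memory.
# chi2 is one possible scoring function and accepts sparse input.
selector = SelectKBest(chi2, k=100)
X_small = selector.fit_transform(X_sparse, y)

# Densify only the reduced matrix, then fit the tree on the dense array.
X_dense = X_small.toarray()
clf = DecisionTreeClassifier(random_state=0).fit(X_dense, y)

print(X_sparse.shape, "->", X_dense.shape)
```

With real data, the accuracy obtained this way could then be compared with
the other classifiers before deciding whether a sparse splitter is worth it.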
> Date: Wed, 22 Jan 2014 17:17:38 +0100
> From: Lars Buitinck
> Subject: Re: [Scikit-learn-general] Strange Error Message
> To: scikit-learn-general
>> $ ./loan-minimal.py
>> Traceback (most recent call last):
>>   File "./loan-minimal.py", line 13, in <module>
>>     clf.fit(train, loss)
>>   File "
2014/1/23 Maheshakya Wijewardena :
> Hi
>
> I think that by using sparse data we can enhance the descriptiveness of the data
> while keeping it smaller than the dense data, without losing
> information.
I don't understand what you mean by "sparse data we can enhance the
descriptiveness of the