Re: [Scikit-learn-general] Sensitivity analysis

2014-01-23 Thread Fred Mailhot
Is your aim to use this information for feature selection, or do you actually want to see which features are being maximally weighted? There's a SO question that addresses the latter use: http://stackoverflow.com/questions/6697/how-to-get-most-informative-features-for-scikit-learn-classifiers

Re: [Scikit-learn-general] Sensitivity analysis

2014-01-23 Thread Kyle Kastner
Some classifiers (several of the tree-based ones, e.g. RandomForest and GradientBoostingRegressor) have a clf.feature_importances_ attribute, which can be plotted to show the relative strength of each feature. sklearn.ensemble also has a module called partial_dependence, which has a function plot_partial_depende
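As a minimal sketch of the feature_importances_ approach mentioned above: the synthetic dataset, the choice of RandomForestClassifier, and all parameter values here are illustrative assumptions, not taken from the thread.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption for illustration): 8 features, 3 informative.
X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                           n_redundant=0, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)

# One non-negative score per feature; higher means the forest relied on it more.
importances = clf.feature_importances_

# Feature indices sorted from most to least important.
ranking = np.argsort(importances)[::-1]
for idx in ranking:
    print("feature %d: %.3f" % (idx, importances[idx]))
```

The scores are relative to the fitted model, so they rank features for that particular forest rather than measuring importance in any absolute sense.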

[Scikit-learn-general] Sensitivity analysis

2014-01-23 Thread Arman Eshaghi
Dear all, I was wondering whether there is a sensitivity analyzer inside scikit-learn? I saw recursive feature elimination, but I would like to see which features are the most important for classification in my data. All the best, Arman

Re: [Scikit-learn-general] Sparse matrix support for Decision tree implementation

2014-01-23 Thread Olivier Grisel
2014/1/23 Felipe Eltermann :
> I'm testing different classifiers for a BoW problem and last week I got
> disappointed that I couldn't use scikit's DecisionTree.
> However, using NaiveBayes was awesome! Thanks for this great piece of
> software.
> So, if you are planning to add the support for scipy

Re: [Scikit-learn-general] Sparse matrix support for Decision tree implementation

2014-01-23 Thread Felipe Eltermann
I'm testing different classifiers for a BoW problem and last week I got disappointed that I couldn't use scikit's DecisionTree. However, using NaiveBayes was awesome! Thanks for this great piece of software. So, if you are planning to add the support for scipy sparse matrix on DecisionTree, I'd lik

Re: [Scikit-learn-general] Sparse matrix support for Decision tree implementation

2014-01-23 Thread Gilles Louppe
> How much code in our current implementation depends on the data representation?

Not much actually. It now basically boils down to writing a new splitter object. Everything else remains the same. So I would say that it amounts to ~300 lines of Cython (out of the 2300 lines in our

Re: [Scikit-learn-general] Sparse matrix support for Decision tree implementation

2014-01-23 Thread Mathieu Blondel
> I will try using sparse data on 20newsgroups data and let you know the results.

What I was suggesting is to densify the News20 dataset (using a subset of the features so that it fits in memory) and try it on our current implementation. Of course it will be really slow but the goal is to evaluate
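The densification step suggested above could be sketched roughly as follows. This is only an illustration under stated assumptions: a random scipy sparse matrix stands in for the News20 data, the labels are random, and keeping the k most frequently non-zero columns is a hypothetical way to pick the feature subset (the thread does not say how the subset should be chosen).

```python
import numpy as np
import scipy.sparse as sp
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)

# Stand-in for a sparse bag-of-words matrix (assumption: 200 docs, 1000 terms).
X_sparse = sp.random(200, 1000, density=0.01, format="csr", random_state=rng)
y = rng.randint(0, 2, size=200)  # random labels, for illustration only

# Count stored entries per column; in CSR, .indices holds the column index
# of every stored value.
nnz_per_col = np.bincount(X_sparse.indices, minlength=X_sparse.shape[1])

# Hypothetical subset rule: keep the k columns with the most non-zero entries.
k = 50
keep = np.argsort(nnz_per_col)[::-1][:k]

# Densify only the selected columns so the result fits in memory.
X_dense = X_sparse[:, keep].toarray()

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_dense, y)
print(X_dense.shape)
```

Memory for the dense array grows as n_samples * k, so k is the knob that keeps the experiment feasible.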

Re: [Scikit-learn-general] Strange Error Message

2014-01-23 Thread Lorenzo Isella
> Date: Wed, 22 Jan 2014 17:17:38 +0100
> From: Lars Buitinck
> Subject: Re: [Scikit-learn-general] Strange Error Message
> To: scikit-learn-general
>> $ ./loan-minimal.py
>> Traceback (most recent call last):
>>   File "./loan-minimal.py", line 13, in <module>
>>     clf.fit(train, loss)
>>   File "

Re: [Scikit-learn-general] Sparse matrix support for Decision tree implementation

2014-01-23 Thread Olivier Grisel
2014/1/23 Maheshakya Wijewardena :
> Hi
>
> As I think, using sparse data we can enhance the descriptiveness of the data
> while keeping it smaller compared to the dense data without losing
> information.

I don't understand what you mean by "sparse data we can enhance the descriptiveness of the