Re: [Scikit-learn-general] select number of features to keep

2015-01-29 Thread Pagliari, Roberto
> For LinearSVC, see the docs: http://scikit-learn.org/dev/modules/generated/sklearn.svm.LinearSVC.html#sklearn.svm.LinearSVC.transform
> I don't understand the second part of your question.
> On 01/29/2015 11:55 AM, Pagliari, Roberto wrote: …

Re: [Scikit-learn-general] select number of features to keep

2015-01-29 Thread Andy
That is a common way to do it, though not the default behavior of LinearSVC, IIRC.

On 01/29/2015 12:54 PM, Sebastian Raschka wrote:
> A naive but related question: doesn't the l1 norm allow for 0-coefficients? That would be one way to get rid of the "not so useful" features.
> On Jan 29, 2015, at 11:55 AM, Pagliari, Roberto wrote: …
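To make both points concrete: LinearSVC defaults to penalty="l2", which shrinks coefficients but does not drive them to exactly zero, while penalty="l1" does. A minimal sketch on synthetic data (the 1e-5 cutoff for "zero" is an arbitrary choice, not a scikit-learn constant):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.svm import LinearSVC

    X, y = make_classification(n_samples=200, n_features=20,
                               n_informative=5, random_state=0)

    # The default penalty is "l2"; penalty="l1" requires dual=False.
    for penalty, dual in [("l2", True), ("l1", False)]:
        svc = LinearSVC(penalty=penalty, dual=dual).fit(X, y)
        n_zero = int(np.sum(np.abs(svc.coef_) < 1e-5))
        print(penalty, "-> zero coefficients:", n_zero, "of", svc.coef_.size)

The l2 fit usually leaves all coefficients nonzero, while the l1 fit zeroes out a good share of the uninformative ones.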

Re: [Scikit-learn-general] select number of features to keep

2015-01-29 Thread Sebastian Raschka
A naive but related question: doesn't the l1 norm allow for 0-coefficients? That would be one way to get rid of the "not so useful" features.

> On Jan 29, 2015, at 11:55 AM, Pagliari, Roberto wrote:
>
> When using a feature selection algorithm in a pipeline, for example
>
>     clf = Pipeline([
>         ('feature_selection', LinearSVC(penalty="l1")),
>         ('classification', RandomForestClassifier())
>     ])
>     clf.fit(X, y)
> …

Re: [Scikit-learn-general] select number of features to keep

2015-01-29 Thread Andy
For LinearSVC, see the docs: http://scikit-learn.org/dev/modules/generated/sklearn.svm.LinearSVC.html#sklearn.svm.LinearSVC.transform

I don't understand the second part of your question.

On 01/29/2015 11:55 AM, Pagliari, Roberto wrote:
> When using a feature selection algorithm in a pipeline, for example …
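In the scikit-learn of that era, LinearSVC.transform(X) dropped the columns whose coefficient magnitude fell below a threshold; for l1-penalized models the default threshold was tiny, so effectively only the features with nonzero coefficients survived. A rough sketch on synthetic data (note that penalty="l1" requires dual=False):

    from sklearn.datasets import make_classification
    from sklearn.svm import LinearSVC

    X, y = make_classification(n_samples=200, n_features=20,
                               n_informative=5, random_state=0)

    svc = LinearSVC(penalty="l1", dual=False).fit(X, y)

    # Keeps only the columns whose |coefficient| exceeds the threshold.
    X_reduced = svc.transform(X)
    print(X.shape, "->", X_reduced.shape)

This transform method was later deprecated and removed; in current scikit-learn the equivalent is sklearn.feature_selection.SelectFromModel(svc, prefit=True).transform(X).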

[Scikit-learn-general] select number of features to keep

2015-01-29 Thread Pagliari, Roberto
When using a feature selection algorithm in a pipeline, for example

    clf = Pipeline([
        ('feature_selection', LinearSVC(penalty="l1")),
        ('classification', RandomForestClassifier())
    ])
    clf.fit(X, y)

or even a random forest, for that matter, how does sklearn know how many features to keep?

Thank you.
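As the replies above spell out, there is no preset number: with penalty="l1" the first pipeline step keeps however many coefficients the fitted LinearSVC leaves nonzero, and that count is steered indirectly by the regularization strength C rather than set explicitly. A minimal sketch on synthetic data (the 1e-5 cutoff for "nonzero" is an arbitrary choice):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.svm import LinearSVC

    X, y = make_classification(n_samples=200, n_features=30,
                               n_informative=5, random_state=0)

    # Smaller C means stronger l1 regularization and hence fewer kept
    # features; penalty="l1" requires dual=False.
    for C in [0.01, 0.1, 1.0]:
        svc = LinearSVC(penalty="l1", dual=False, C=C).fit(X, y)
        n_kept = int(np.sum(np.abs(svc.coef_) > 1e-5))
        print("C =", C, "-> features kept:", n_kept)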