First, I think it's important to consider whether the combination makes sense:

E.g., I think it wouldn't make much sense to combine PCA and a kernel SVM, 
since PCA is a linear transformation technique (scikit-learn implements some 
non-linear dimensionality reduction techniques, too). Also, if the size of the 
data is not an issue, I'd prefer regularization for a linear SVM, because it is 
in some sense "supervised": the regularized fit uses the class labels to weight 
the features, whereas PCA ignores the labels entirely.

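To make the "supervised" point a bit more concrete, here is a minimal sketch. 
It assumes a recent scikit-learn release (SelectFromModel is not available in 
older versions), and the synthetic data, the value C=0.1, and all variable 
names are only illustrative choices of mine:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectFromModel
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic data (illustrative only): 20 features, 4 of them informative.
X, y = make_classification(n_samples=200, n_features=20, n_informative=4,
                           n_redundant=0, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# Supervised: the L1 penalty pushes the weights of unhelpful features to
# zero, and the fit sees the labels y. C=0.1 is an arbitrary choice.
l1_svm = LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=5000,
                   random_state=0)
selector = SelectFromModel(l1_svm)
X_selected = selector.fit_transform(X_scaled, y)
n_kept = X_selected.shape[1]

# Unsupervised: PCA never sees y; it only looks at the variance in X.
X_pca = PCA(n_components=4).fit_transform(X_scaled)

print("features kept by the L1-regularized linear SVM:", n_kept)
```

How many features survive depends heavily on C, so in practice I would tune 
it rather than trust a fixed value.
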
From a technical standpoint, I think most estimators should support RFE; the 
exceptions are estimators such as K-nearest neighbors, which expose no coef_ 
attribute to rank the features by.
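
As for pipelining with grid search, something along these lines should work. 
This is only a sketch under a few assumptions: a recent scikit-learn (where 
GridSearchCV lives in sklearn.model_selection), synthetic data, and step names 
and parameter grids that I made up for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Synthetic data (illustrative only).
X, y = make_classification(n_samples=150, n_features=12, n_informative=4,
                           n_redundant=0, random_state=0)

# RFE needs an estimator that exposes feature weights; LinearSVC has coef_.
pipe = Pipeline([
    ("rfe", RFE(estimator=LinearSVC(dual=False, random_state=0))),
    ("clf", LinearSVC(dual=False, random_state=0)),
])

# The "<step name>__<parameter>" syntax reaches into the pipeline steps.
param_grid = {
    "rfe__n_features_to_select": [2, 4, 8],
    "clf__C": [0.1, 1.0, 10.0],
}

# Feature selection is refit inside every CV split, so the held-out fold
# never influences which features are kept.
search = GridSearchCV(pipe, param_grid, cv=3, scoring="accuracy")
search.fit(X, y)

best_k = search.best_params_["rfe__n_features_to_select"]
n_selected = int(search.best_estimator_.named_steps["rfe"].support_.sum())
y_pred = search.predict(X)
print("best number of features:", best_k)
```

The fitted search object can then be used directly for prediction, since it 
refits the whole pipeline with the best parameters.
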


> On Apr 28, 2015, at 4:54 PM, Pagliari, Roberto <rpagli...@appcomsci.com> 
> wrote:
> 
> Thank you! One more question. When it comes to pipelining with grid search, 
> which estimators can I use for feature selection, apart from SVC and PCA?
> 
> Thank you, 
> 
> From: Artem [barmaley....@gmail.com]
> Sent: Tuesday, April 28, 2015 4:07 PM
> To: scikit-learn-general
> Subject: Re: [Scikit-learn-general] error with RFE and gridsearchCV
> 
> GridSearchCV is not an estimator, but a "utility" to find one. So you 
> should `fit` the grid search first in order to find the classifier that 
> performs well on the CV splits, and then use it. Like this:
> 
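>     from sklearn import grid_search
>     from sklearn.ensemble import GradientBoostingClassifier
>     from sklearn.feature_selection import RFECV
> 
>     gbr = GradientBoostingClassifier()
>     parameters = {'learning_rate': [0.1, 0.01, 0.001],
>                   'max_depth': [1, 4, 6],
>                   'min_samples_leaf': [3, 5, 9, 17],
>                   'max_features': [1.0, 0.3, 0.1]}
>     # Fit the grid search first, then hand the tuned estimator to RFECV.
>     clf = grid_search.GridSearchCV(estimator=gbr, param_grid=parameters,
>                                    n_jobs=16)
>     clf.fit(x_train, y_train)
>     rfecv = RFECV(estimator=clf.best_estimator_, step=1, cv=10,
>                   scoring='accuracy')
>     rfecv.fit(x_train, y_train)
> 
>     # prediction (rfecv.predict applies the feature mask before predicting)
>     y_predicted = rfecv.predict(x_test)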
> 
> Also, note that RFECV 
> <http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.RFECV.html>
> only supports models that have a coef_ attribute, and 
> GradientBoostingClassifier does not.
> 
> On Tue, Apr 28, 2015 at 8:44 PM, Pagliari, Roberto <rpagli...@appcomsci.com> 
> wrote:
> I'm trying to use recursive feature elimination with gradient boosting and 
> grid search as shown below
> 
> 
>     gbr = GradientBoostingClassifier()
>     parameters = {'learning_rate': [0.1, 0.01, 0.001],
>                   'max_depth': [1, 4, 6],
>                   'min_samples_leaf': [3, 5, 9, 17],
>                   'max_features': [1.0, 0.3, 0.1]}
>     clf = grid_search.GridSearchCV(estimator=gbr, param_grid=parameters,
>                                    n_jobs=16)
>     rfecv = RFECV(estimator=clf, step=1, cv=10, scoring='accuracy')
>     rfecv.fit(x_train, y_train)
> 
>     # prediction
>     y_predicted = rfecv.estimator_.predict(x_test)
> 
> However, I'm getting this error and I don't know how to fix it:
> 
> Traceback (most recent call last):
>   File "./gbr_rfe.py", line 92, in <module>
>     rfecv.fit(x_train, y_train)
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/feature_selection/rfe.py", line 376, in fit
>     ranking_ = rfe.fit(X_train, y_train).ranking_
>   File "/usr/local/lib/python2.7/dist-packages/sklearn/feature_selection/rfe.py", line 163, in fit
>     if estimator.coef_.ndim > 1:
> AttributeError: 'GridSearchCV' object has no attribute 'coef_'
> 
> 
> 
> ------------------------------------------------------------------------------
> One dashboard for servers and applications across Physical-Virtual-Cloud
> Widest out-of-the-box monitoring support with 50+ applications
> Performance metrics, stats and reports that give you Actionable Insights
> Deep dive visibility with transaction tracing using APM Insight.
> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y 
> <http://ad.doubleclick.net/ddm/clk/290420510;117567292;y>
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net 
> <mailto:Scikit-learn-general@lists.sourceforge.net>
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general 
> <https://lists.sourceforge.net/lists/listinfo/scikit-learn-general>