> I guess that could be done, but has a much higher complexity than RFE.

Oh yes, I agree -- the sequential feature selection algorithms are definitely 
more costly computationally. 

> It seems interesting. Is that really used in practice and is there any 
> literature evaluating it?


I am not sure how often it is used in practice nowadays, but I think it is one 
of the classic approaches for feature selection -- I learned about it a couple 
of years ago in a pattern classification class, and there is a relatively 
detailed article in 

Ferri, F., et al. "Comparative study of techniques for large-scale feature 
selection." Pattern Recognition in Practice IV (1994): 403-413.

The optimal solution to feature selection would be to evaluate the performance 
of all possible feature combinations, which is exponential in the number of 
features and far too costly in practice. The sequential forward and backward 
selection (SFS and SBS) algorithms are greedy, suboptimal alternatives, and 
there are some refinements, e.g., Sequential Floating Forward Selection 
(SFFS), which allows previously added features to be removed again in later 
stages.
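
To make this concrete, here is a minimal sketch of SBS as a plain function -- 
the name and interface are just illustrative, not an existing scikit-learn 
API, and it assumes a scikit-learn version where cross_val_score lives in 
sklearn.model_selection:

from itertools import combinations

import numpy as np
from sklearn.model_selection import cross_val_score

def sequential_backward_selection(estimator, X, y, k_features, cv=5):
    """Greedily drop one feature at a time, keeping the subset with the
    best mean cross-validated score, until k_features remain."""
    indices = tuple(range(X.shape[1]))
    score = cross_val_score(estimator, X[:, list(indices)], y, cv=cv).mean()
    history = [(indices, score)]
    while len(indices) > k_features:
        # Evaluate every subset that results from removing one feature.
        subsets = list(combinations(indices, len(indices) - 1))
        scores = [cross_val_score(estimator, X[:, list(s)], y, cv=cv).mean()
                  for s in subsets]
        best = int(np.argmax(scores))
        indices = subsets[best]
        history.append((indices, scores[best]))
    return history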

I have an implementation of SBS that uses k-fold cross_val_score, and it is 
actually not a bad idea to use it for KNN to reduce overfitting, as an 
alternative to dimensionality reduction. For example, here is the KNN 
cross-validation mean accuracy on the wine dataset with the features selected 
by SBS: 
http://i.imgur.com/ywDTHom.png?1
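
For reference, something along these lines reproduces that kind of curve with 
the sketch above (load_wine, the scaling step, and n_neighbors=5 are my 
assumptions; the exact settings behind the linked plot may differ):

from sklearn.datasets import load_wine
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X = StandardScaler().fit_transform(X)  # KNN is distance-based, so scale first
knn = KNeighborsClassifier(n_neighbors=5)

# Print the mean CV accuracy at each subset size, from all features down to 1.
for subset, acc in sequential_backward_selection(knn, X, y, k_features=1):
    print(len(subset), "feature(s): mean CV accuracy =", round(acc, 3))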
 
But for scikit-learn, it may be better to implement the slightly more 
sophisticated floating variants, SFBS or SFFS.
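
Roughly, the floating idea behind SFFS looks like this -- again only a sketch 
with an illustrative interface, not a definitive implementation:

from itertools import combinations

import numpy as np
from sklearn.model_selection import cross_val_score

def sffs(estimator, X, y, k_features, cv=5):
    """Forward selection with a conditional backward ("floating") step."""
    def score(subset):
        return cross_val_score(estimator, X[:, list(subset)], y, cv=cv).mean()

    subset, best_at_size = [], {}
    while len(subset) < k_features:
        # Forward step: add the single feature that helps most.
        candidates = [f for f in range(X.shape[1]) if f not in subset]
        gains = [score(subset + [f]) for f in candidates]
        subset = subset + [candidates[int(np.argmax(gains))]]
        k = len(subset)
        best_at_size[k] = max(best_at_size.get(k, -np.inf), max(gains))
        # Floating step: drop a feature again as long as that strictly
        # beats the best score recorded for the smaller subset size.
        while len(subset) > 2:
            drops = list(combinations(subset, len(subset) - 1))
            drop_scores = [score(d) for d in drops]
            best = int(np.argmax(drop_scores))
            if drop_scores[best] > best_at_size.get(len(subset) - 1, -np.inf):
                subset = list(drops[best])
                best_at_size[len(subset)] = drop_scores[best]
            else:
                break
    return subset, score(subset)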


> On Apr 27, 2015, at 6:00 PM, Andreas Mueller <t3k...@gmail.com> wrote:
> 
> That is like a one-step look-ahead feature selection?
> I guess that could be done, but has a much higher complexity than RFE.
> RFE works for anything that returns "importances", not just linear models.
> It doesn't really work for KNN, as you say. [I wouldn't say 
> non-parametric models. Trees are pretty non-parametric].
> 
> It seems interesting. Is that really used in practice and is there any 
> literature evaluating it?
> There is some discussion here 
> http://www.jmlr.org/papers/volume3/guyon03a/guyon03a.pdf in 4.2
> but there is no empirical comparison or theoretical analysis.
> 
> To be added to sklearn, you'd need to show that it is widely used and / 
> or widely useful.
> 
> 
> On 04/27/2015 02:47 PM, Sebastian Raschka wrote:
>> Hi, I was wondering if sequential feature selection algorithms are currently 
>> implemented in scikit-learn. The closest that I could find was recursive 
>> feature elimination (RFE); 
>> http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.RFE.html.
>>  However, unless the application requires a fixed number of features, I am 
>> not sure if it is necessarily worthwhile using it over regularized models. 
>> If I understand correctly, it works like this:
>> 
>> {x1, x2, x3} --> eliminate xi with smallest corresponding weight
>> 
>> {x1, x3} --> eliminate xi with smallest corresponding weight
>> 
>> {x1}
>> 
>> However, this would only work with linear, discriminative models, right?
>> 
>> Wouldn't a classic "sequential feature selection" algorithm be useful for 
>> non-regularized, nonparametric models, e.g., k-nearest neighbors, as an 
>> alternative to dimensionality reduction for applications where the original 
>> features may need to be maintained? RFE, for example, wouldn't work with 
>> KNN, and maybe the data is not linearly separable, so RFE with a linear 
>> model doesn't make sense.
>> 
>> In a nutshell, SFS algorithms simply add or remove one feature at a time 
>> based on the classifier's performance.
>> 
>> e.g., Sequential backward selection:
>> 
>> {x1, x2, x3} ---> estimate performance on {x1, x2}, {x2, x3} and {x1, x3}, 
>> and pick the subset with the best performance
>> {x1, x3} ---> estimate performance on {x1}, {x3} and pick the subset with 
>> the best performance
>> {x1}
>> 
>> where performance could be e.g., cross-val accuracy.
>> 
>> What do you think?
>> 
>> Best,
>> Sebastian


