That is like one-step look-ahead feature selection?
I guess that could be done, but it has a much higher complexity than RFE: 
every elimination step refits and evaluates the model once per remaining 
candidate feature, instead of fitting once and dropping the feature with 
the smallest importance.
RFE works for anything that returns "importances", not just linear models.
It doesn't really work for KNN, as you say. [Though I wouldn't frame it as 
"non-parametric models"; trees are pretty non-parametric.]
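
E.g. something along these lines should work with any estimator that exposes 
feature_importances_ or coef_ (a minimal sketch, untested):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

X, y = make_classification(n_samples=200, n_features=10, n_informative=3,
                           random_state=0)

# RFE only needs the fitted estimator to expose coef_ or feature_importances_,
# so a tree ensemble works just as well as a linear model here.
rfe = RFE(RandomForestClassifier(n_estimators=100, random_state=0),
          n_features_to_select=3)
rfe.fit(X, y)
print(rfe.support_)   # boolean mask of the selected features
print(rfe.ranking_)   # 1 = selected, higher = eliminated earlier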

It seems interesting. Is that really used in practice, and is there any 
literature evaluating it?
There is some discussion in section 4.2 of 
http://www.jmlr.org/papers/volume3/guyon03a/guyon03a.pdf,
but no empirical comparison or theoretical analysis.

To be added to sklearn, you'd need to show that it is widely used and / 
or widely useful.
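
For reference, the backward selection loop you describe is easy enough to 
hand-roll on top of cross_val_score. A rough sketch with KNN (illustrative 
only, not an existing scikit-learn API; the dataset and CV setup are arbitrary):

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score  # sklearn.cross_validation in older releases
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target
knn = KNeighborsClassifier(n_neighbors=5)

selected = list(range(X.shape[1]))
n_features_to_keep = 2

# Sequential backward selection: at each step, drop the feature whose
# removal hurts cross-validated accuracy the least.
while len(selected) > n_features_to_keep:
    results = []
    for f in selected:
        candidate = [i for i in selected if i != f]
        score = cross_val_score(knn, X[:, candidate], y, cv=5).mean()
        results.append((score, f))
    best_score, to_drop = max(results)
    selected.remove(to_drop)
    print("dropped feature %d, CV accuracy %.3f" % (to_drop, best_score))

print("kept features:", selected)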


On 04/27/2015 02:47 PM, Sebastian Raschka wrote:
> Hi, I was wondering if sequential feature selection algorithms are currently 
> implemented in scikit-learn. The closest that I could find was recursive 
> feature elimination (RFE); 
> http://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.RFE.html.
>  However, unless the application requires a fixed number of features, I am 
> not sure it is necessarily worthwhile using it over regularized models. 
> If I understand correctly, it works like this:
>
> {x1, x2, x3} --> eliminate xi with smallest corresponding weight
>
> {x1, x3} --> eliminate xi with smallest corresponding weight
>
> {x1}
>
> However, this would only work with linear, discriminative models, right?
>
> Wouldn't a classic "sequential feature selection" algorithm be useful for 
> non-regularized, nonparametric models, e.g., K-nearest neighbors, as an 
> alternative to dimensionality reduction for applications where the original 
> features may need to be maintained? RFE, for example, wouldn't work with 
> KNN, and maybe the data is not linearly separable, so RFE with a linear 
> model doesn't make sense.
>
> In a nutshell, SFS algorithms simply add or remove one feature at a time 
> based on the classifier's performance.
>
> e.g., Sequential backward selection:
>
> {x1, x2, x3} ---> estimate performance on {x1, x2}, {x2, x3} and {x1, x3}, 
> and pick the subset with the best performance
> {x1, x3} ---> estimate performance on {x1} and {x3}, and pick the subset 
> with the best performance
> {x1}
>
> where performance could be e.g., cross-val accuracy.
>
> What do you think?
>
> Best,
> Sebastian

