A quick question about how this is currently handled.
Say I have one of scikit-learn’s feature selectors in a pipeline, and after 
“.fit” it has selected, e.g., the features idx = [1, 12, 23]. When I then call 
“.predict” on that pipeline, wouldn’t the feature selector’s transform method 
pass only X[:, idx] (where X is the input array) to the next object in the 
pipeline, e.g., the estimator? That’s how I do it with my custom feature 
selection objects/algorithms. I have never looked under the hood at how 
scikit-learn’s feature selection implementations do it, so I am curious.
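
To make the question concrete, here is a minimal sketch of the behavior I have in mind. The toy dataset, the choice of SelectKBest with f_classif, k=3, and the step names "select" and "clf" are all just assumptions for illustration, not anything taken from Philip's setup. It checks whether the fitted selector's transform output matches X[:, idx] and whether the pipeline's predictions match the estimator fed with X[:, idx] directly.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy data (assumption): 30 features, only a few of them informative.
X, y = make_classification(
    n_samples=200, n_features=30, n_informative=5, random_state=0
)

# Hypothetical pipeline: univariate feature selection followed by a classifier.
pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=3)),
    ("clf", LogisticRegression()),
])
pipe.fit(X, y)

selector = pipe.named_steps["select"]
clf = pipe.named_steps["clf"]

# Column indices of the features kept by the fitted selector.
idx = selector.get_support(indices=True)

# Compare the selector's transform output with plain column indexing.
X_sel = selector.transform(X)
same_columns = np.array_equal(X_sel, X[:, idx])

# Compare pipeline predictions with the estimator applied to X[:, idx].
same_predictions = np.array_equal(pipe.predict(X), clf.predict(X[:, idx]))

print(idx, same_columns, same_predictions)
```

Note that this only looks at what the selector hands to the next step. It says nothing about whether steps earlier in the pipeline still compute the features that end up being dropped, which I think is Philip's actual concern.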

Best,
Sebastian


> On May 2, 2016, at 11:06 AM, Philip Tully <tu...@csc.kth.se> wrote:
> 
> Cool, thanks for feedback!
> 
> Any outstanding PRs addressing something like this or anyone on this list 
> been thinking of/working on solutions? 
> I imagine it might be implemented as a step in a pipeline (e.g., 
> FeatureRemover()) and be generally applicable / potentially benefit many 
> sklearn users. Not sure if it could be compatible with HashingVectorizer, though.
> 
> 
> On Mon, Mar 14, 2016 at 7:20 PM, Joel Nothman <joel.noth...@gmail.com> wrote:
> Currently there is no automatic mechanism for eliminating the generation of 
> features that are not selected downstream. It needs to be achieved manually.
> 
> On 15 March 2016 at 08:05, Philip Tully <tu...@csc.kth.se> wrote:
> Hi,
> 
> I'm trying to optimize the time it takes to make a prediction with my
> model(s). I realized that when I perform feature selection during the
> model fit(), the features that were not selected are likely still
> computed when I go to predict() or predict_proba(). An optimization
> would then involve eliminating the unselected features from my Pipeline
> altogether, so they are never generated, instead of just filtering them
> out after they have been computed.
> 
> Does sklearn already do this automatically? Or does this readjustment
> need to be done manually before serialization?
> 
> thanks,
> Philip
> 
> ------------------------------------------------------------------------------
> Transform Data into Opportunity.
> Accelerate data analysis in your applications with
> Intel Data Analytics Acceleration Library.
> Click to learn more.
> http://pubads.g.doubleclick.net/gampad/clk?id=278785231&iu=/4140
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
> 


_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
