Cool, thanks for the feedback!

Are there any outstanding PRs addressing something like this, or has
anyone on this list been thinking about or working on solutions?
I imagine it might be implemented as a step in a pipeline (e.g.
FeatureRemover()) and be generally applicable, potentially benefiting many
sklearners. I'm not sure whether it could be compatible with
HashingVectorizer, though.
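
To make the manual route concrete, here is a minimal sketch, assuming a
CountVectorizer followed by SelectKBest. The helper name prune_vectorizer
and the toy data are hypothetical (nothing like this exists in
scikit-learn as far as I know); it only uses the fitted vocabulary_ and
the selector's get_support() to rebuild a vectorizer that never generates
the discarded columns:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy data, purely illustrative.
docs = [
    "cheap pills buy now",
    "buy cheap watches now",
    "cheap offer buy today",
    "meeting agenda for monday",
    "monday project meeting notes",
    "notes from the project meeting",
]
labels = [1, 1, 1, 0, 0, 0]

# Fit the usual pipeline: vectorize, select k features, classify.
vec = CountVectorizer()
sel = SelectKBest(chi2, k=4)
clf = LogisticRegression()
full = make_pipeline(vec, sel, clf).fit(docs, labels)


def prune_vectorizer(fitted_vec, fitted_sel):
    """Hypothetical helper: build a CountVectorizer whose fixed vocabulary
    contains only the terms kept by the fitted selector."""
    kept = fitted_sel.get_support(indices=True)  # kept column indices, ascending
    # Terms ordered by their column index in the fitted vectorizer.
    terms = sorted(fitted_vec.vocabulary_, key=fitted_vec.vocabulary_.get)
    params = fitted_vec.get_params()
    # A list vocabulary is indexed in list order, so the pruned columns
    # line up with the selector's output columns.
    params["vocabulary"] = [terms[i] for i in kept]
    return CountVectorizer(**params)


# With a fixed vocabulary, transform() works without calling fit().
pruned_vec = prune_vectorizer(vec, sel)

# Prediction path without the selector: only the kept features are computed.
X_pruned = pruned_vec.transform(docs)
pruned_predictions = clf.predict(X_pruned)
```

The pruned vectorizer plus the already fitted classifier could then be
serialized in place of the original three-step pipeline. This trick relies
on an explicit vocabulary, so it would not carry over to HashingVectorizer
as is, since that has no stored vocabulary to restrict.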


On Mon, Mar 14, 2016 at 7:20 PM, Joel Nothman <joel.noth...@gmail.com>
wrote:

> Currently there is no automatic mechanism for eliminating the generation
> of features that are not selected downstream. It needs to be achieved
> manually.
>
> On 15 March 2016 at 08:05, Philip Tully <tu...@csc.kth.se> wrote:
>
>> Hi,
>>
>> I'm trying to optimize the time it takes to make a prediction with my
>> model(s). I realized that when I perform feature selection during
>> fit(), the features that are not selected are likely still computed
>> when I call predict() or predict_proba(). An optimization would then
>> involve eliminating the unselected features from my Pipeline
>> altogether, instead of computing them and then discarding them.
>>
>> Does sklearn already do this automatically? Or does this readjustment
>> need to be done manually before serialization?
>>
>> thanks,
>> Philip
>>
>>
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>