[Scikit-learn-general] RFC on memoized models

Joel Nothman Thu, 13 Feb 2014 01:28:31 -0800

I am seeking comments on
remember_model<https://github.com/jnothman/scikit-learn/commit/de0f86d1efd4b477bd662e15d2b2e78292ad3107#diff-9c87b35a377fa30b1fdb8b121cd46dcdR32>,
which wraps an estimator to memoize its fit() method.


I have conceived of two useful applications, though no doubt there are
others:

   - avoid repeating work when performing grid search over a pipeline, by
   memoizing transformers that are not affected by some varying parameters
   - memoize a grid-searched estimator, so that selected cross-validation
   models can be recalled and inspected without re-fitting.

remember_model uses joblib.Memory to map the estimator's parameters (as
returned by get_params(deep=True)) and its training data to the estimator's
__dict__ after fitting. (Apologies if there are estimators whose fitted
state is not entirely represented in __dict__.) Parameters that don't affect
the fitted model can be excluded from the cache key (e.g. n_jobs, verbose,
or k in SelectKBest).
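The mapping described above can be sketched roughly as follows. This is a minimal illustration, not the linked implementation: it swaps joblib.Memory for a plain dict keyed by a pickle-based hash so that it runs with the standard library alone, and the names MemoizedFit and MeanCenterer are hypothetical.

```python
import hashlib
import pickle


class MeanCenterer:
    """Toy estimator (hypothetical) following the get_params/fit convention."""

    fit_calls = 0  # class-level counter, only to show when real fitting happens

    def __init__(self, verbose=0):
        self.verbose = verbose

    def get_params(self, deep=True):
        return {"verbose": self.verbose}

    def fit(self, X, y=None):
        MeanCenterer.fit_calls += 1
        # Column means of a list-of-rows dataset.
        self.mean_ = [sum(col) / len(col) for col in zip(*X)]
        return self


class MemoizedFit:
    """Hypothetical wrapper: cache the estimator's __dict__ after fit().

    The cache key is built from the estimator's class name, its parameters
    (minus the ignored ones) and the training data. A dict stands in for
    joblib.Memory here, and pickle+sha256 for joblib's more robust hashing.
    """

    def __init__(self, estimator, cache, ignore=()):
        self.estimator = estimator
        self.cache = cache
        self.ignore = tuple(ignore)

    def _key(self, X, y):
        params = self.estimator.get_params(deep=True)
        kept = sorted((k, v) for k, v in params.items() if k not in self.ignore)
        payload = pickle.dumps((type(self.estimator).__name__, kept, X, y))
        return hashlib.sha256(payload).hexdigest()

    def fit(self, X, y=None):
        key = self._key(X, y)
        if key not in self.cache:
            # Cache miss: fit for real and store the resulting state.
            self.estimator.fit(X, y)
            self.cache[key] = pickle.dumps(self.estimator.__dict__)
        else:
            # Cache hit: restore the fitted state, but keep this instance's
            # values for the ignored parameters (assumed to be stored as
            # attributes of the same name).
            current = {k: getattr(self.estimator, k)
                       for k in self.ignore if hasattr(self.estimator, k)}
            self.estimator.__dict__.update(pickle.loads(self.cache[key]))
            self.estimator.__dict__.update(current)
        return self
```

With a shared cache, two wrappers whose estimators differ only in an ignored parameter such as verbose reuse a single fit, while any change to the training data or to a non-ignored parameter produces a new key and a fresh fit.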

The file linked above also includes other wrappers, notably
freeze_model<https://github.com/jnothman/scikit-learn/commit/de0f86d1efd4b477bd662e15d2b2e78292ad3107#diff-9c87b35a377fa30b1fdb8b121cd46dcdR93>,
which could be used for stacking or in other circumstances where a model is
pre-trained and should not follow the usual cloning rules. In all of these
cases, I have needed to patch sklearn.base.clone to handle class-specific
overrides.
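To illustrate what a class-specific clone override might look like, here is a small standard-library sketch. The clone function below is a simplified stand-in for sklearn.base.clone, not the real one, and the hook name clone_override as well as the Frozen and ConstantModel classes are assumptions made for this example rather than names taken from the linked commit.

```python
import copy


def clone(estimator):
    """Simplified stand-in for sklearn.base.clone with an override hook."""
    # Hypothetical hook: a class may define clone_override() to decide
    # for itself what cloning means.
    override = getattr(estimator, "clone_override", None)
    if callable(override):
        return override()
    # Default rule: rebuild an unfitted copy from constructor parameters.
    params = {k: copy.deepcopy(v)
              for k, v in estimator.get_params(deep=False).items()}
    return type(estimator)(**params)


class ConstantModel:
    """Toy estimator (hypothetical) that learns the mean of its input."""

    def get_params(self, deep=True):
        return {}

    def fit(self, X, y=None):
        self.value_ = sum(X) / len(X)
        return self


class Frozen:
    """Hypothetical wrapper around a pre-trained model.

    fit() is a no-op and cloning returns the same object, so the fitted
    state survives utilities that clone and refit their estimators.
    """

    def __init__(self, estimator):
        self.estimator = estimator

    def get_params(self, deep=True):
        return {"estimator": self.estimator}

    def fit(self, X, y=None):
        return self  # deliberately leave the wrapped model untouched

    def clone_override(self):
        return self  # opt out of the default "rebuild unfitted" rule
```

Under the default rule, cloning a fitted ConstantModel yields a fresh object without value_, whereas cloning a Frozen wrapper hands back the same wrapper with the pre-trained model intact.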

I seek comments on whether and how generic model memoization utilities
might fit into scikit-learn.

Thanks!

- Joel
