The problem is excessive memory consumption: sampling without replacement
from discrete parameter grids was implemented with an O(number of possible
parameter settings) approach, i.e. the complete grid is built as a list
before anything is sampled from it.
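
To give a sense of scale, here is a rough count of the settings in the grid
from the report below. This is only a sketch: the parameter lists are copied
from the quoted code, and it is written for Python 3 with the standard
library only, so it does not touch scikit-learn at all.

```python
from functools import reduce
from operator import mul

# Parameter lists copied from the quoted report.
params = {
    "rf__n_estimators": range(10, 50),
    "rf__max_depth": range(5, 10),
    "rf__max_features": range(1, 5),
    "rf__min_samples_split": range(5, 101),
    "rf__min_samples_leaf": range(20, 50),
    "rf__max_leaf_nodes": range(200, 350),
}

# The number of distinct settings is the product of the list lengths.
grid_size = reduce(mul, (len(v) for v in params.values()), 1)
print(grid_size)  # 345600000
```

That is one dict per setting if the whole grid is turned into a list, which
is the list(ParameterGrid(...)) call where the traceback below is stuck.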
The fix was merged into master only hours ago. Please feel free to work
with master, or to cherry-pick febefb0.
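
For illustration, sampling without replacement does not need the full grid
in memory. The sketch below draws distinct flat indices and decodes each one
into a parameter setting. It is not necessarily how febefb0 does it:
sample_grid is a hypothetical helper written for this message, in Python 3
with the standard library only.

```python
import random


def sample_grid(param_lists, n_iter, seed=None):
    """Yield n_iter distinct settings without building the full grid.

    Hypothetical helper for illustration, not part of scikit-learn.
    """
    keys = sorted(param_lists)
    values = [list(param_lists[k]) for k in keys]

    # Total number of settings is the product of the list lengths.
    total = 1
    for vals in values:
        total *= len(vals)

    rng = random.Random(seed)
    # Sampling from a range draws distinct indices without materializing it.
    for index in rng.sample(range(total), n_iter):
        setting = {}
        # Decode the flat index as a mixed-radix number,
        # one digit per parameter.
        for key, vals in zip(keys, values):
            index, offset = divmod(index, len(vals))
            setting[key] = vals[offset]
        yield setting


if __name__ == "__main__":
    demo = {"n_estimators": range(10, 50), "max_depth": range(5, 10)}
    for setting in sample_grid(demo, 3, seed=0):
        print(setting)
```

Memory use here grows with the sum of the list lengths plus n_iter, rather
than with their product.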
On 25 June 2015 at 16:22, Jason Sanchez <jason.sanchez.m...@statefarm.com>
wrote:
> This code that uses RandomizedSearchCV works fine in 0.15.2:
>
> import pandas as pd
> from sklearn.pipeline import Pipeline
> from sklearn.datasets import load_iris
> from sklearn.ensemble import RandomForestClassifier
> from sklearn.grid_search import RandomizedSearchCV
>
> iris = load_iris()
> X = iris.data
> y = iris.target
>
> pipeline = Pipeline([("rf", RandomForestClassifier())])
>
> params = {"rf__n_estimators": range(10, 50),
>           "rf__max_depth": range(5, 10),
>           "rf__max_features": range(1, 5),
>           "rf__min_samples_split": range(5, 101),
>           "rf__min_samples_leaf": range(20, 50),
>           "rf__max_leaf_nodes": range(200, 350)}
>
> random_search = RandomizedSearchCV(pipeline, params).fit(X, y)
>
>
> It does not work in 0.16.1. When I kill the process, here is the Traceback:
> ---------------------------------------------------------------------------
> KeyboardInterrupt Traceback (most recent call last)
> <ipython-input-108-8794e7d30469> in <module>()
>      24 random_search = RandomizedSearchCV(pipeline, params, n_iter=n_iter_search, cv=2, refit=True, n_jobs=1)
>      25
> ---> 26 random_search.fit(X_iris, y_iris)
>
> /.../lib/python2.7/site-packages/sklearn/grid_search.pyc in fit(self, X, y)
>     896 self.n_iter,
>     897 random_state=self.random_state)
> --> 898 return self._fit(X, y, sampled_params)
>
> /.../lib/python2.7/site-packages/sklearn/grid_search.pyc in _fit(self, X, y, parameter_iterable)
>     503 self.fit_params, return_parameters=True,
>     504 error_score=self.error_score)
> --> 505 for parameters in parameter_iterable
>     506 for train, test in cv)
>     507
>
> /.../lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self, iterable)
>     656 os.environ[JOBLIB_SPAWNED_PROCESS] = '1'
>     657 self._iterating = True
> --> 658 for function, args, kwargs in iterable:
>     659 self.dispatch(function, args, kwargs)
>     660
>
> /.../lib/python2.7/site-packages/sklearn/grid_search.pyc in <genexpr>(***failed resolving arguments***)
>     499 pre_dispatch=pre_dispatch
>     500 )(
> --> 501 delayed(_fit_and_score)(clone(base_estimator), X, y, self.scorer_,
>     502 train, test, self.verbose, parameters,
>     503 self.fit_params, return_parameters=True,
>
> /.../lib/python2.7/site-packages/sklearn/grid_search.pyc in __iter__(self)
>     180 if all_lists:
>     181 # get complete grid and yield from it
> --> 182 param_grid = list(ParameterGrid(self.param_distributions))
>     183 grid_size = len(param_grid)
>     184
>
> /.../lib/python2.7/site-packages/sklearn/grid_search.pyc in __iter__(self)
>     100 keys, values = zip(*items)
>     101 for v in product(*values):
> --> 102 params = dict(zip(keys, v))
>     103 yield params
>     104
>
> KeyboardInterrupt:
>
>
> Any thoughts?
>
>
> ------------------------------------------------------------------------------
> Monitor 25 network devices or servers for free with OpManager!
> OpManager is web-based network management software that monitors
> network devices and physical & virtual servers, alerts via email & sms
> for fault. Monitor 25 devices for free with no restriction. Download now
> http://ad.doubleclick.net/ddm/clk/292181274;119417398;o
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>