I have made a few attempts in these directions (for instance using a
Nelder-Mead optimizer). However, it is quite hard to get an optimizer
that does not get stuck in local minima, given that the objective is
noisy and has flat regions.
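
For the curious, here is a minimal sketch of the kind of attempt I mean
(hypothetical estimator and data; scipy.optimize.fmin is SciPy's
Nelder-Mead simplex):

    import numpy as np
    from scipy.optimize import fmin  # Nelder-Mead downhill simplex
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC
    from sklearn.cross_validation import cross_val_score

    X, y = make_classification(n_samples=200, random_state=0)

    def objective(log_params):
        # work in log space: C and gamma span orders of magnitude
        C, gamma = np.exp(log_params)
        # negate the CV score, since fmin minimises; this surface is
        # noisy and flat in places, which is what trips the simplex up
        return -cross_val_score(SVC(C=C, gamma=gamma), X, y).mean()

    best_log_params = fmin(objective, x0=np.log([1., .1]))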

James Bergstra has put a lot of intelligence into his HyperOpt. While I
do believe that we can achieve part of its features with a fraction of
the lines of code, it is certainly not a trivial task.
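
For reference, my recollection of hyperopt's interface is roughly the
following (untested, and the details may well be off):

    from hyperopt import fmin, tpe, hp

    # `cv_error` stands for a function mapping a params dict to a CV error
    space = {'C': hp.loguniform('C', -5, 5),
             'gamma': hp.loguniform('gamma', -5, 5)}
    best = fmin(cv_error, space, algo=tpe.suggest, max_evals=50)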

Anyhow, I do encourage people to develop scikit-learn-compatible
alternatives to GridSearchCV. If, after trying them out on many
problems, they prove to be good replacements, I am sure that we will be
happy to include them in scikit-learn.

G

On Mon, Apr 08, 2013 at 11:27:28AM +1000, Joel Nothman wrote:
> Currently BaseSearchCV expects a predetermined sequence of parameter settings,
> which is not ideal for some cases. SciPy opts for a callback approach. I've not
> used that interface, but I gather something like this might work:

> class MinimizeCV(BaseEstimator):
>     def __init__(self, minimiser, clf, param_init, scoring, cv,
>                  minimise_kwargs=None):
>         self.clf = clf
>         self.param_init = param_init
>         self.scoring = scoring
>         self.cv = cv
>         self.minimiser = minimiser
>         # None rather than {} to avoid a mutable default argument
>         self.minimise_kwargs = minimise_kwargs

>     def fit(self, X, y=None):
>         # fix the parameter order once, so names and values line up
>         param_names = sorted(self.param_init)

>         def objective(param_values):
>             """Aggregate CV score at one point in parameter space."""
>             params = dict(zip(param_names, param_values))
>             # TODO: parallelise fold fitting
>             # `aggregate` stands in for, e.g., a mean over the folds;
>             # negated because scores are greater-is-better
>             return -aggregate(
>                 fit_grid_point(X, y, self.clf, params, train, test,
>                                self.scoring, ...)
>                 for train, test in self.cv)

>         x0 = [self.param_init[name] for name in param_names]
>         res = self.minimiser(objective, x0,
>                              **(self.minimise_kwargs or {}))
>         # TODO: store results and perhaps search history
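
> Usage of such a sketch might look like this (hypothetical, assuming the
> TODOs are filled in; scipy.optimize.fmin, the Nelder-Mead simplex, as
> the minimiser):

>     from scipy.optimize import fmin
>     from sklearn.cross_validation import KFold
>     from sklearn.svm import SVC

>     search = MinimizeCV(fmin, SVC(), param_init={'C': 1., 'gamma': .1},
>                         scoring='accuracy', cv=KFold(len(y), 5))
>     search.fit(X, y)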

> I think a variant could be implemented that shares most of its code with the
> existing BaseSearchCV.

> I haven't looked at hyperopt's interface yet.

> - Joel

> > On Sun, Apr 7, 2013 at 6:35 PM, Roman Sinayev <[email protected]> wrote:


>     > It seems like brute-force grid search takes forever when attempting to
>     > determine the best parameters for many classifiers.  Let's say the
>     > parameter space looks something like this:
>     > http://i.imgur.com/AiBl8Wt.png .  Why not use SciPy's simulated
>     > annealing or a simple genetic algorithm instead of searching
>     > through the entire parameter space of every classifier?
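
> For concreteness, SciPy's simulated annealing could be driven the same
> way (untested; `objective` being a cross-validation objective as in the
> sketch above):

>     from scipy.optimize import anneal

>     # anneal returns the best point found first, plus status information
>     xmin = anneal(objective, x0=[0., 0.])[0]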


-- 
    Gael Varoquaux
    Researcher, INRIA Parietal
    Laboratoire de Neuro-Imagerie Assistee par Ordinateur
    NeuroSpin/CEA Saclay, Bat 145, 91191 Gif-sur-Yvette France
    Phone:  ++ 33-1-69-08-79-68
    http://gael-varoquaux.info            http://twitter.com/GaelVaroquaux
