Actually, I wanted to create exactly this myself.
I was then discouraged by the fact that scikit-learn declined a pull
request implementing Multi-Armed Bandits
<https://github.com/scikit-learn/scikit-learn/pull/906> on the grounds
that scikit-learn doesn't do reinforcement learning.
I'm new here (everywhere, not just scikit), and I'm not sure how closely
related MAB is to Bayesian optimization, but I think something along
those lines should definitely be implemented for hyperparameter tuning,
since the objective there is an expensive function almost by definition.
Great idea! I certainly hope it gets implemented as well.
On Thu, Jan 30, 2014 at 9:23 PM, James Jensen <jdjen...@eng.ucsd.edu> wrote:
> I usually hesitate to suggest a new feature in a library like this
> unless I am in a position to work on it myself. However, given the
> number of people who seem eager to find something to contribute, and
> given the recent discussion about improving the Gaussian process module,
> I thought I'd venture an idea.
>
> Bayesian optimization is an efficient method used especially for
> functions that are expensive to evaluate. The basic idea is to model
> the objective with a Gaussian process and, in each iteration, use an
> acquisition (surrogate) function to decide where to evaluate next. The
> acquisition function strikes a balance between exploration (sampling
> regions you haven't tried before) and exploitation (sampling near
> previous points that scored well, where a high score is therefore
> likely). Some of the math behind it is beyond me, but the general idea
> is very intuitive. Brochu, Cora, and de Freitas (2010), "A Tutorial on
> Bayesian Optimization of Expensive Cost Functions", is a good
> introduction.
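[Editor's note: to make the loop concrete, here is a minimal sketch of the
procedure described above. It assumes a GaussianProcessRegressor-style GP
API with return_std predictions (not the 2014 GP module), uses expected
improvement as the acquisition function, and searches candidates by random
sampling; all names and defaults are illustrative, not an existing
scikit-learn interface.]

    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor

    def expected_improvement(candidates, gp, best_y):
        # GP posterior mean and uncertainty at each candidate point.
        mu, sigma = gp.predict(candidates, return_std=True)
        sigma = np.maximum(sigma, 1e-9)
        # Trade off exploitation (high mu) against exploration (high sigma).
        z = (mu - best_y) / sigma
        return (mu - best_y) * norm.cdf(z) + sigma * norm.pdf(z)

    def bayes_opt(objective, bounds, n_init=5, n_iter=20, seed=None):
        rng = np.random.default_rng(seed)
        dim = len(bounds)
        lo, hi = np.array(bounds, dtype=float).T
        # Start with a few random evaluations of the expensive objective.
        X = rng.uniform(lo, hi, size=(n_init, dim))
        y = np.array([objective(x) for x in X])
        gp = GaussianProcessRegressor(normalize_y=True)
        for _ in range(n_iter):
            gp.fit(X, y)
            # Score random candidates with the acquisition and evaluate the best.
            candidates = rng.uniform(lo, hi, size=(1000, dim))
            ei = expected_improvement(candidates, gp, y.max())
            x_next = candidates[np.argmax(ei)]
            X = np.vstack([X, x_next])
            y = np.append(y, objective(x_next))
        return X[np.argmax(y)], y.max()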
>
> One useful application of Bayesian optimization is hyperparameter
> tuning: it can be used to optimize the cross-validation score, as an
> alternative to, for example, grid search. Grid search is simple and
> parallelizable, there is no overhead in choosing which hyperparameters
> to try next, and the structure of some estimators lets them be used
> with it very efficiently. Bayesian optimization is inherently
> sequential and adds a small overhead for fitting and evaluating the
> surrogate, but it is generally much more efficient at finding good
> solutions. It particularly shines when the scoring function is costly
> or when there are more than one or two hyperparameters to tune; there,
> grid search is less attractive and sometimes completely impractical.
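[Editor's note: for hyperparameter tuning, the expensive objective in a
sketch like the one above is simply a cross-validation score. A hedged
usage example follows; bayes_opt is the sketch above, and X_train,
y_train, the SVC parameters, the log-scale bounds, and the current import
paths are all assumptions for illustration.]

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    def cv_objective(log_params):
        # Search in log10 space so C and gamma span several orders of magnitude.
        C, gamma = 10.0 ** np.asarray(log_params)
        return cross_val_score(SVC(C=C, gamma=gamma), X_train, y_train, cv=5).mean()

    # Maximize 5-fold CV accuracy over log10(C) in [-3, 3], log10(gamma) in [-4, 1].
    best_log_params, best_score = bayes_opt(cv_objective, bounds=[(-3, 3), (-4, 1)])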
>
> In one of my own applications, involving 4 regularization parameters,
> I've been using the BayesOpt library
> (http://rmcantin.bitbucket.org/html/index.html), which offers it as a
> general-purpose optimization technique that one can manually integrate
> with one's cross-validation code. In general, it works quite well, but
> there are some limitations to its design that can make its integration
> inconvenient. Having this functionality directly integrated into
> scikit-learn and specifically tailored to hyperparameter tuning would be
> useful. I have been impressed with the ease of use of such convenience
> classes as GridSearchCV, and dream of having a corresponding BayesOptCV,
> etc.
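[Editor's note: purely as illustration of what such a convenience class
could look like, the snippet below mirrors GridSearchCV. BayesOptCV does
not exist; its name, parameters, and attributes are hypothetical, and
X_train, y_train are assumed data.]

    from sklearn.svm import SVC
    from sklearn.model_selection import GridSearchCV

    # Existing interface: exhaustive search over a fixed parameter grid.
    grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}, cv=5)
    grid.fit(X_train, y_train)

    # Hypothetical counterpart: same fit / best_params_ / best_score_ interface,
    # but continuous bounds instead of a grid and a budget of expensive evaluations.
    search = BayesOptCV(SVC(),
                        param_bounds={"C": (1e-3, 1e3), "gamma": (1e-4, 1e1)},
                        n_iter=30, cv=5)
    search.fit(X_train, y_train)
    print(search.best_params_, search.best_score_)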
>
> As a general-use optimization method, Bayesian optimization would belong
> elsewhere than in scikit-learn, e.g. in scipy.optimize. But specifically
> as a method for hyperparameter tuning, it seems it would fit well in the
> scope of scikit-learn, especially since I expect it would not be much
> more than a layer or two of functionality on top of what scikit-learn's
> GP module offers (or will offer once revised). And it would be of more
> general utility than an additional estimator here or there.
>
> I'm curious to hear what others think about the idea. Would this be a
> good fit for scikit-learn? Do we have people with the interest,
> expertise, and time to take this on at some point?