Re: [Scikit-learn-general] Bayesian optimization for hyperparameter tuning

Hadayat Seddiqi Thu, 30 Jan 2014 12:14:10 -0800

Hi,

So I was the one who volunteered to contribute my GP code for a revamp
of scikit-learn's GP module. I'm far from an expert, and I can't say I
understand how this would fit in off the top of my head, but if someone is
knowledgeable and willing to work on this then I'd be more than happy to
lend a hand as well. I've been kind of quiet on my own GP code so far, just
trying to get everything as ready and nice as I can before bugging people
again.


James, you mentioned that you might be hesitant to suggest things if you
don't have time to implement them. If I read that correctly, you're saying
you might not have the time, but in case you do, feel free to contact me
(this goes for anyone, of course).

-Had



On Thu, Jan 30, 2014 at 3:03 PM, Dan Haiduc <danuthai...@gmail.com> wrote:

> Actually, I wanted to create exactly this myself.
> I was then discouraged by the fact that scikit-learn did not merge a pull
> request from someone who implemented a Multi-Armed Bandit
> <https://github.com/scikit-learn/scikit-learn/pull/906>, on the grounds
> that scikit-learn doesn't do reinforcement learning.
> I'm new here (everywhere, not just scikit-learn), and I'm not sure how
> closely related MAB is to Bayesian optimization, but I think something
> along those lines should definitely be implemented for hyperparameters,
> since evaluating them is expensive almost by definition.
>
> Great idea! I certainly hope it gets implemented as well.
>
>
> On Thu, Jan 30, 2014 at 9:23 PM, James Jensen <jdjen...@eng.ucsd.edu> wrote:
>
>> I usually hesitate to suggest a new feature in a library like this
>> unless I am in a position to work on it myself. However, given the
>> number of people who seem eager to find something to contribute, and
>> given the recent discussion about improving the Gaussian process module,
>> I thought I'd venture an idea.
>>
>> Bayesian optimization is an efficient method used especially for
>> functions that are expensive to evaluate. The basic idea is to model the
>> function with a Gaussian process and to use a surrogate function (usually
>> called an acquisition function in the literature) to determine where to
>> evaluate next in each iteration. The surrogate strikes a balance between
>> exploration (sampling regions you haven't tried before) and exploitation
>> (if previous samples in a vicinity scored well, then the likelihood of
>> getting a high score in that area is high).
>> Some of the math behind it is beyond me, but the general idea is very
>> intuitive. Brochu, Cora, and de Freitas (2010) "A Tutorial on Bayesian
>> Optimization of Expensive Cost Functions," is a good introduction.
>>
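A minimal sketch of the loop described above, assuming a 1-D search space
discretized into candidate points, a zero-mean GP with a squared-exponential
kernel, and expected improvement as the rule that picks the next point (the
literature usually calls the GP the surrogate and this rule the acquisition
function). Every function name, the kernel length scale, and the noise term
are illustrative assumptions; none of this is taken from BayesOpt or from
scikit-learn's GP module.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel for scalar inputs (length scale is assumed)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a small symmetric matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            # Clamp the pivot so rounding error cannot give sqrt of a negative.
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_fit(xs, ys, noise=1e-4):
    """Factor the kernel matrix once; returns (L, alpha), alpha = K^-1 ys."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    return L, solve_upper(L, solve_lower(L, ys))

def gp_predict(xs, L, alpha, x_new):
    """Posterior mean and standard deviation of the zero-mean GP at x_new."""
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(1.0 - sum(t * t for t in v), 0.0)  # rbf(x, x) == 1
    return mean, math.sqrt(var)

def expected_improvement(mean, std, best):
    """Expected improvement over `best` when maximizing."""
    if std < 1e-12:
        return 0.0
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (mean - best) * cdf + std * pdf

def bayes_opt(f, candidates, n_init=3, n_iter=10):
    """Maximize f over a finite list of candidates, one evaluation at a time."""
    step = max(len(candidates) // n_init, 1)
    tried = list(range(0, len(candidates), step))[:n_init]  # evenly spaced start
    ys = [f(candidates[i]) for i in tried]
    for _ in range(n_iter):
        offset = sum(ys) / len(ys)            # center targets for the zero-mean GP
        xs = [candidates[i] for i in tried]
        centered = [y - offset for y in ys]
        L, alpha = gp_fit(xs, centered)
        best = max(centered)
        scores = {}
        for i, x in enumerate(candidates):
            if i not in tried:                # never re-evaluate a point
                mean, std = gp_predict(xs, L, alpha, x)
                scores[i] = expected_improvement(mean, std, best)
        nxt = max(scores, key=scores.get)     # most promising untried candidate
        tried.append(nxt)
        ys.append(f(candidates[nxt]))
    k = max(range(len(ys)), key=ys.__getitem__)
    return candidates[tried[k]], ys[k], [candidates[i] for i in tried]

if __name__ == "__main__":
    # Toy objective standing in for an expensive cross-validation score.
    toy = lambda x: -(x - 0.3) ** 2
    grid = [i / 100 for i in range(101)]
    print(bayes_opt(toy, grid)[:2])
```

The sketch spends 13 evaluations of the objective in total (3 initial plus
10 chosen by expected improvement). A real implementation would also fit the
kernel hyperparameters and optimize the acquisition over a continuous space
rather than a fixed candidate list.
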
>> One useful application of Bayesian optimization is hyperparameter
>> tuning. It can be used to optimize the cross-validation score, as an
>> alternative to, for example, grid search. Grid search is simple and
>> parallelizable, it has no overhead in choosing the hyperparameters to
>> try, and the nature of some estimators allows them to be used with it
>> very efficiently. Bayesian optimization is serial and has a small amount
>> of overhead in evaluating the surrogate. But it generally needs far fewer
>> evaluations to find good solutions, and it particularly shines when the
>> scoring function is costly or when there are more than one or two
>> hyperparameters to tune; in those cases grid search is less attractive
>> and sometimes completely impractical.
>>
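To make the scaling argument concrete, here is a small sketch that counts
how many model fits an exhaustive grid requires. The parameter names, the
10 values per parameter, and the 5 folds are hypothetical numbers chosen
for illustration; only the count of 4 regularization parameters echoes the
application mentioned in this thread.

```python
from itertools import product

def grid_points(param_grid):
    """Yield every hyperparameter combination in the grid as a dict."""
    names = sorted(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))

def n_fits(param_grid, n_folds):
    """Number of model fits for exhaustive grid search with k-fold CV."""
    total = 1
    for values in param_grid.values():
        total *= len(values)          # grid size is the product of the axes
    return total * n_folds

if __name__ == "__main__":
    # Hypothetical grid: 4 regularization parameters, 10 values each.
    grid = {"reg_%d" % i: [10.0 ** e for e in range(-5, 5)] for i in range(4)}
    print(n_fits(grid, n_folds=5))
```

Each added parameter multiplies the number of fits by the length of its
axis, which is why a grid becomes impractical so quickly, while a
sequential method fixes its evaluation budget up front.
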
>> In one of my own applications, involving 4 regularization parameters,
>> I've been using the BayesOpt library
>> (http://rmcantin.bitbucket.org/html/index.html), which offers it as a
>> general-purpose optimization technique that one can manually integrate
>> with one's cross-validation code. In general, it works quite well, but
>> there are some limitations to its design that can make its integration
>> inconvenient. Having this functionality directly integrated into
>> scikit-learn and specifically tailored to hyperparameter tuning would be
>> useful. I have been impressed with the ease of use of such convenience
>> classes as GridSearchCV, and dream of having a corresponding BayesOptCV,
>> etc.
>>
>> As a general-use optimization method, Bayesian optimization would belong
>> elsewhere than in scikit-learn, e.g. in scipy.optimize. But specifically
>> as a method for hyperparameter tuning, it seems it would fit well in the
>> scope of scikit-learn, especially since I expect it would not be much
>> more than a layer or two of functionality on top of what scikit-learn's
>> GP module offers (or will offer once revised). And it would be of more
>> general utility than an additional estimator here or there.
>>
>> I'm curious to hear what others think about the idea. Would this be a
>> good fit for scikit-learn? Do we have people with the interest,
>> expertise, and time to take this on at some point?
>>
>>
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>

