Since GridSearchCV and RandomizedSearchCV are both already included in
scikit-learn, it seems it would make sense to include other common, more
efficient hyperparameter searchers as well.
zach
On Thu, Jan 30, 2014 at 3:11 PM, Hadayat Seddiqi <had...@gmail.com> wrote:
> Hi,
>
> So I was the one who volunteered to contribute my GP code for a revamp of
> the scikit's GP module. I'm far from an expert, and I can't say I
> understand off the top of my head how this would fit, but if someone
> knowledgeable is willing to work on this, then I'd be more than happy to
> lend a hand as well. I've been kind of quiet about my own GP code so far...
> just trying to get everything as ready and nice as I can before bugging
> people again.
>
> James, you mentioned that you might be hesitant to suggest things you
> don't have time to implement yourself. If I read that correctly, you're
> saying you might not have the time, but in case you do, feel free to
> contact me (this goes for anyone, of course).
>
> -Had
>
>
>
> On Thu, Jan 30, 2014 at 3:03 PM, Dan Haiduc <danuthai...@gmail.com> wrote:
>
>> Actually, I wanted to create exactly this myself.
>> I was then discouraged by the fact that scikit-learn did not merge the
>> pull request from someone who implemented Multi-Armed Bandit
>> (https://github.com/scikit-learn/scikit-learn/pull/906), on the grounds
>> that scikit-learn doesn't do reinforcement learning.
>> I'm new here (everywhere, not just scikit), and I'm not sure how closely
>> related MAB is to Bayesian optimization, but I think something along
>> those lines should definitely be implemented for hyperparameters, since
>> they're expensive functions almost by definition.
>>
>> Great idea! I certainly hope it gets implemented as well.
>>
>>
>> On Thu, Jan 30, 2014 at 9:23 PM, James Jensen <jdjen...@eng.ucsd.edu> wrote:
>>
>>> I usually hesitate to suggest a new feature in a library like this
>>> unless I am in a position to work on it myself. However, given the
>>> number of people who seem eager to find something to contribute, and
>>> given the recent discussion about improving the Gaussian process module,
>>> I thought I'd venture an idea.
>>>
>>> Bayesian optimization is an efficient optimization method, used
>>> especially for functions that are expensive to evaluate. The basic idea
>>> is to fit a Gaussian process to the function as a surrogate model, and at
>>> each iteration use an acquisition function over that surrogate to decide
>>> where to evaluate next. The acquisition function strikes a balance
>>> between exploration (sampling regions you haven't tried before) and
>>> exploitation (if previous samples in a vicinity scored well, then the
>>> likelihood of a high score nearby is also high). Some of the math behind
>>> it is beyond me, but the general idea is very intuitive. Brochu, Cora,
>>> and de Freitas (2010), "A Tutorial on Bayesian Optimization of Expensive
>>> Cost Functions," is a good introduction.
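>>>
>>> To make the idea concrete, here is a rough sketch of one such iteration
>>> (the function names are just illustrative, and I'm assuming the eval_MSE
>>> option of scikit-learn's current GaussianProcess to get the predictive
>>> variance; expected improvement is only one common acquisition choice):
>>>
>>> import numpy as np
>>> from scipy.stats import norm
>>> from sklearn.gaussian_process import GaussianProcess
>>>
>>> def expected_improvement(gp, X_candidates, y_best, xi=0.01):
>>>     # Predictive mean and variance of the GP surrogate at the candidates.
>>>     mu, mse = gp.predict(X_candidates, eval_MSE=True)
>>>     sigma = np.sqrt(mse)
>>>     # Trade off exploitation (mu - y_best) against exploration (sigma).
>>>     z = (mu - y_best - xi) / np.maximum(sigma, 1e-12)
>>>     return (mu - y_best - xi) * norm.cdf(z) + sigma * norm.pdf(z)
>>>
>>> def next_point(X_observed, y_observed, X_candidates):
>>>     # One iteration: refit the surrogate to everything evaluated so far,
>>>     # then propose the candidate that maximizes expected improvement.
>>>     gp = GaussianProcess(nugget=1e-8).fit(X_observed, y_observed)
>>>     ei = expected_improvement(gp, X_candidates, np.max(y_observed))
>>>     return X_candidates[np.argmax(ei)]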
>>>
>>> One useful application of Bayesian optimization is hyperparameter
>>> tuning: it can be used to optimize the cross-validation score, as an
>>> alternative to, for example, grid search. Grid search is simple and
>>> parallelizable; there is no overhead in choosing which hyperparameters to
>>> try, and the nature of some estimators allows them to be used with it
>>> very efficiently. Bayesian optimization is serial and has a small amount
>>> of overhead in evaluating the acquisition function. But it is generally
>>> much more efficient at finding good solutions, and it particularly shines
>>> when the scoring function is costly or when there are more than one or
>>> two hyperparameters to tune; there grid search is less attractive and
>>> sometimes completely impractical.
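>>>
>>> As a sketch of how that loop might look for tuning a single
>>> regularization parameter (reusing the expected_improvement/next_point
>>> helpers above; the choice of SVC and of log10(C) as the search space is
>>> purely for illustration):
>>>
>>> from sklearn.cross_validation import cross_val_score
>>> from sklearn.datasets import load_iris
>>> from sklearn.svm import SVC
>>>
>>> iris = load_iris()
>>>
>>> def cv_score(log_C):
>>>     # The expensive objective: mean CV accuracy as a function of log10(C).
>>>     return cross_val_score(SVC(C=10 ** log_C), iris.data, iris.target,
>>>                            cv=5).mean()
>>>
>>> # Seed the surrogate with a few random evaluations, then run the serial
>>> # BO loop: each new point is chosen by maximizing expected improvement.
>>> X_obs = np.random.uniform(-3, 3, size=(3, 1))
>>> y_obs = np.array([cv_score(x[0]) for x in X_obs])
>>> candidates = np.linspace(-3, 3, 200).reshape(-1, 1)
>>> for _ in range(10):
>>>     x_new = next_point(X_obs, y_obs, candidates)
>>>     X_obs = np.vstack([X_obs, x_new])
>>>     y_obs = np.append(y_obs, cv_score(x_new[0]))
>>>     # Drop the chosen candidate so the GP never sees duplicate inputs.
>>>     candidates = candidates[~np.all(candidates == x_new, axis=1)]
>>>
>>> print("best log10(C): %.2f, CV score: %.3f"
>>>       % (X_obs[np.argmax(y_obs)][0], y_obs.max()))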
>>>
>>> In one of my own applications, involving 4 regularization parameters,
>>> I've been using the BayesOpt library
>>> (http://rmcantin.bitbucket.org/html/index.html), which offers it as a
>>> general-purpose optimization technique that one can manually integrate
>>> with one's cross-validation code. In general, it works quite well, but
>>> there are some limitations to its design that can make its integration
>>> inconvenient. Having this functionality directly integrated into
>>> scikit-learn and specifically tailored to hyperparameter tuning would be
>>> useful. I have been impressed with the ease of use of such convenience
>>> classes as GridSearchCV, and dream of having a corresponding BayesOptCV,
>>> etc.
>>>
>>> As a general-purpose optimization method, Bayesian optimization would
>>> belong somewhere other than scikit-learn, e.g. in scipy.optimize. But
>>> specifically as a method for hyperparameter tuning, it seems it would fit
>>> well within the scope of scikit-learn, especially since I expect it would
>>> not be much more than a layer or two of functionality on top of what
>>> scikit-learn's GP module offers (or will offer once revised). And it
>>> would be of more general utility than an additional estimator here or
>>> there.
>>>
>>> I'm curious to hear what others think about the idea. Would this be a
>>> good fit for scikit-learn? Do we have people with the interest,
>>> expertise, and time to take this on at some point?