Hi Had,

It's true that I'd have limited time (I'm working on a PhD), and I imagine most potential contributors are also quite busy. Mainly, though, I lack the expertise necessary to do this properly: I understand Bayesian optimization at a high level but don't have much of a foundation in the underlying math, and I'm an amateur programmer not yet accustomed to writing code that would meet scikit-learn's standards. That said, if there are ways I can help make this happen, I'd be glad to do so.

-James

On 01/30/2014 12:11 PM, Hadayat Seddiqi wrote:
Hi,

So I was the one who volunteered to contribute my GP code for a revamp of the scikit-learn GP module. I'm far from an expert, and I can't say I understand off the top of my head how this would fit, but if someone knowledgeable is willing to work on this, then I'd be more than happy to lend a hand as well. I've been kind of quiet about my own GP code so far... just trying to get everything as ready and nice as I can before bugging people again.

James, you mentioned that you might be hesitant to suggest things if you don't have time to implement them. If I read that correctly, you're saying you might not have the time, but in case you do, feel free to get in touch (this goes for anyone, of course).

-Had



On Thu, Jan 30, 2014 at 3:03 PM, Dan Haiduc <danuthai...@gmail.com> wrote:

    Actually, I wanted to create exactly this myself.
    I was then discouraged by the fact that scikit-learn declined a
    pull request implementing Multi-Armed Bandit
    <https://github.com/scikit-learn/scikit-learn/pull/906>, on the
    grounds that scikit-learn doesn't do reinforcement learning.
    I'm new here (everywhere, not just scikit), and I'm not sure how
    closely related MAB is to Bayesian optimization, but I think
    something along those lines should definitely be implemented for
    hyperparameters, since their tuning objectives are expensive
    functions almost by definition.

    Great idea! I certainly hope it gets implemented as well.


    On Thu, Jan 30, 2014 at 9:23 PM, James Jensen
    <jdjen...@eng.ucsd.edu> wrote:

        I usually hesitate to suggest a new feature in a library like
        this unless I am in a position to work on it myself. However,
        given the number of people who seem eager to find something to
        contribute, and given the recent discussion about improving
        the Gaussian process module, I thought I'd venture an idea.

        Bayesian optimization is an efficient method for optimizing
        functions that are expensive to evaluate. The basic idea is to
        fit the function with a Gaussian process, which serves as a
        surrogate model, and then use an acquisition function over
        that surrogate to decide where to evaluate next in each
        iteration. The acquisition function strikes a balance between
        exploration (sampling regions you haven't tried before) and
        exploitation (sampling near previous points that scored well,
        where another high score is likely). Some of the math behind
        it is beyond me, but the general idea is very intuitive.
        Brochu, Cora, and de Freitas (2010), "A Tutorial on Bayesian
        Optimization of Expensive Cost Functions," is a good
        introduction.
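
        To make the loop concrete, here is a minimal sketch of
        expected-improvement search in one dimension. The toy
        objective is invented for illustration, and the GP fit assumes
        scikit-learn's newer GaussianProcessRegressor API; any GP that
        returns a predictive mean and standard deviation would do:

            import numpy as np
            from scipy.stats import norm
            from sklearn.gaussian_process import GaussianProcessRegressor

            def objective(x):  # toy stand-in for an expensive function
                return -np.sin(3 * x) - x ** 2 + 0.7 * x

            rng = np.random.RandomState(0)
            X = rng.uniform(-1.0, 2.0, size=(4, 1))   # initial design
            y = objective(X).ravel()
            grid = np.linspace(-1.0, 2.0, 500).reshape(-1, 1)

            for _ in range(10):
                gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
                mu, sigma = gp.predict(grid, return_std=True)
                # Expected improvement over the best score so far: large
                # where the predicted mean is high (exploitation) or the
                # predictive uncertainty is high (exploration).
                z = (mu - y.max()) / np.maximum(sigma, 1e-12)
                ei = (mu - y.max()) * norm.cdf(z) + sigma * norm.pdf(z)
                x_next = grid[[np.argmax(ei)]]  # most promising point
                X = np.vstack([X, x_next])
                y = np.append(y, objective(x_next).ravel())

            print("best x:", X[np.argmax(y)], "best score:", y.max())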

        One useful application of Bayesian optimization is
        hyperparameter tuning: it can be used to optimize the
        cross-validation score, as an alternative to, for example,
        grid search. Grid search is simple and parallelizable, there
        is no overhead in choosing the hyperparameter settings to try,
        and the nature of some estimators allows them to be used with
        it very efficiently. Bayesian optimization is serial and has a
        small amount of overhead in evaluating the surrogate, but it
        is generally much more efficient at finding good solutions,
        and it particularly shines when the scoring function is costly
        or when there are more than one or two hyperparameters to
        tune; there grid search is less attractive and sometimes
        completely impractical, since the size of the grid grows
        exponentially with the number of hyperparameters.
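
        For a sense of scale, here is grid search with scikit-learn's
        GridSearchCV (import paths follow the current model_selection
        layout; the grid values are arbitrary):

            from sklearn.datasets import load_iris
            from sklearn.model_selection import GridSearchCV
            from sklearn.svm import SVC

            X, y = load_iris(return_X_y=True)
            # 4 x 4 = 16 candidate settings, each refit for every CV
            # fold; four hyperparameters at ten values each would mean
            # 10,000 settings, the regime where Bayesian optimization
            # pays off.
            param_grid = {"C": [0.1, 1, 10, 100],
                          "gamma": [1e-3, 1e-2, 1e-1, 1]}
            search = GridSearchCV(SVC(), param_grid, cv=5).fit(X, y)
            print(search.best_params_, search.best_score_)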

        In one of my own applications, involving four regularization
        parameters, I've been using the BayesOpt library
        (http://rmcantin.bitbucket.org/html/index.html), which offers
        Bayesian optimization as a general-purpose technique that one
        can manually integrate with one's cross-validation code. In
        general it works quite well, but some limitations of its
        design can make the integration inconvenient. Having this
        functionality directly integrated into scikit-learn and
        specifically tailored to hyperparameter tuning would be
        useful. I have been impressed with the ease of use of
        convenience classes such as GridSearchCV, and dream of having
        a corresponding BayesOptCV, etc., along the lines of the
        sketch below.
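
        Purely to show the shape such a class might take (BayesOptCV
        does not exist anywhere yet; the name and every parameter
        below are invented for illustration):

            from sklearn.datasets import load_iris
            from sklearn.svm import SVC

            X, y = load_iris(return_X_y=True)

            # Hypothetical API, by analogy with GridSearchCV:
            # continuous bounds instead of a fixed grid, and a budget
            # of expensive CV evaluations instead of an exhaustive
            # sweep.
            search = BayesOptCV(SVC(),
                                param_bounds={"C": (1e-3, 1e3),
                                              "gamma": (1e-4, 1e1)},
                                n_iter=25, cv=5)
            search.fit(X, y)   # runs the Bayesian optimization loop
            print(search.best_params_, search.best_score_)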

        As a general-use optimization method, Bayesian optimization
        would belong elsewhere than in scikit-learn, e.g. in
        scipy.optimize. But specifically as a method for
        hyperparameter tuning, it seems it would fit well in the scope
        of scikit-learn, especially since I expect it would not be
        much more than a layer or two of functionality on top of what
        scikit-learn's GP module offers (or will offer once revised).
        And it would be of more general utility than an additional
        estimator here or there.

        I'm curious to hear what others think about the idea. Would
        this be a good fit for scikit-learn? Do we have people with
        the interest, expertise, and time to take this on at some
        point?





        



    



