On Thu, Jan 30, 2014 at 11:23:28AM -0800, James Jensen wrote:
> Bayesian optimization is an efficient method used especially for 
> functions that are expensive to evaluate. The basic idea is to fit the 
> function using Gaussian processes, using a surrogate function that 
> determines where to evaluate next in each iteration. The surrogate 
> strikes a balance between exploration (sampling intervals you haven't 
> tried before) and exploitation (if previous samples in a vicinity scored 
> well, then the likelihood of getting a high score in that area is high). 
> Some of the math behind it is beyond me, but the general idea is very 
> intuitive. Brochu, Cora, and de Freitas (2010) "A Tutorial on Bayesian 
> Optimization of Expensive Cost Functions," is a good introduction.

> One useful application of Bayesian optimization is hyperparameter 
> tuning.

Thanks a lot for your enthusiasm and suggestion.

Indeed, many of the core developers would love to see simple Bayesian
optimization used for hyperparameter optimization, for instance taking
the gist of hyperopt https://github.com/hyperopt/hyperopt and making an
extended version of RandomizedSearchCV.

However, there are a number of technical roadblocks to getting there. In
particular, the Gaussian process could be improved (to implement
partial_fit for online learning), and the parallel computing engine
(joblib) does not support a producer/consumer pattern well. None of
these problems are showstoppers, but they reduce the usefulness of a
hyper-parameter selection object using Bayesian optimization.
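
To make the producer/consumer point concrete, here is a minimal sketch of
the scheduling pattern a sequential optimizer would want: as soon as one
evaluation finishes, its result feeds the choice of the next candidate. It
uses the standard-library concurrent.futures module rather than joblib,
purely for illustration. The names evaluate, propose and run are
hypothetical, and propose is a trivial stand-in for a model-based proposer.

```python
import random
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def evaluate(x):
    """Hypothetical stand-in for an expensive cross-validated fit."""
    return x, -(x - 2.0) ** 2


def propose(history, rng):
    """Hypothetical proposer; a real one would query a fitted model."""
    if not history:
        return rng.uniform(0.0, 5.0)
    # Perturb the best point seen so far and clip to the search interval.
    best_x, _ = max(history, key=lambda item: item[1])
    return min(5.0, max(0.0, best_x + rng.gauss(0.0, 0.5)))


def run(n_evals=20, n_workers=4, seed=0):
    rng = random.Random(seed)
    history = []  # list of (x, score) pairs, filled as workers finish
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Producer: start with one task per worker.
        submitted = min(n_workers, n_evals)
        pending = {pool.submit(evaluate, propose(history, rng))
                   for _ in range(submitted)}
        while pending:
            # Consumer: block until at least one evaluation is done.
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                history.append(future.result())
            # Producer again: refill free slots using the updated history.
            while submitted < n_evals and len(pending) < n_workers:
                pending.add(pool.submit(evaluate, propose(history, rng)))
                submitted += 1
    return history


if __name__ == "__main__":
    results = run()
    print(max(results, key=lambda item: item[1]))
```

The key difference from dispatching a fixed batch of parameter settings up
front is that each new candidate can depend on every result received so far.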

I would hope that we find time to implement these difficult core aspects
and eventually get to implementing a more advanced hyper-parameter
optimizer. But all the core developers are very busy and spending a lot
of time simply maintaining the library (have a look at the number of
open issues and of pull requests waiting to be reviewed to get an
idea).

If you want to help, beyond reviewing/finishing pull requests and
closing issues, I suggest that you first prototype the code by
submitting an example that uses the Gaussian processes to optimize a
noisy function. As a second step, once that example is merged, we could
think about how to build a BayesianSearchCV object.
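
As a rough idea of what such an example might look like, here is a minimal
sketch, not a definitive implementation. It assumes the
GaussianProcessRegressor class and kernels module found in recent
scikit-learn releases, uses expected improvement as the acquisition
function, and maximizes over a fixed grid of candidates. The objective
noisy_objective, the helper names and all parameter values are hypothetical
choices for illustration.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel


def noisy_objective(x, rng):
    """Hypothetical toy function to maximize: a parabola plus Gaussian noise."""
    return -(x - 2.0) ** 2 + 0.1 * rng.normal()


def expected_improvement(candidates, gp, best_y, xi=0.01):
    """Expected improvement (for maximization) at each candidate point."""
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)  # guard against division by zero
    gain = mu - best_y - xi
    z = gain / sigma
    return gain * norm.cdf(z) + sigma * norm.pdf(z)


def bayesian_optimize(n_init=4, n_iter=15, bounds=(0.0, 5.0), seed=0):
    rng = np.random.default_rng(seed)
    # Start from a few uniformly random evaluations.
    X = rng.uniform(bounds[0], bounds[1], size=(n_init, 1))
    y = np.array([noisy_objective(x[0], rng) for x in X])
    # Fixed grid on which the acquisition function is maximized.
    candidates = np.linspace(bounds[0], bounds[1], 500).reshape(-1, 1)
    # The WhiteKernel term lets the model account for observation noise.
    kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
    for _ in range(n_iter):
        gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True,
                                      random_state=seed)
        gp.fit(X, y)
        # Simplification: the best noisy observation serves as the incumbent.
        ei = expected_improvement(candidates, gp, best_y=y.max())
        x_next = candidates[np.argmax(ei)]
        y_next = noisy_objective(x_next[0], rng)
        X = np.vstack([X, x_next])
        y = np.append(y, y_next)
    best = np.argmax(y)
    return X[best, 0], y[best], X, y


if __name__ == "__main__":
    best_x, best_y, _, _ = bayesian_optimize()
    print("best x:", best_x, "best noisy score:", best_y)
```

Note that the model is refit from scratch at every iteration, which is
exactly where a partial_fit on the Gaussian process would help.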

Cheers,

Gaël

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
