On Sat, Dec 3, 2011 at 10:25, Gael Varoquaux
<[email protected]> wrote:
> On Sat, Dec 03, 2011 at 12:32:59PM +0100, Olivier Grisel wrote:
>> Alexandre has a new blog post about this with a simple Python snippet
>> using the sklearn GaussianProcess:
>
>>   http://atpassos.posterous.com/bayesian-optimization
>
> That's pretty cool. If Alexandre agrees, this code could definitely serve
> as the basis for a scikit-learn implementation: it is simple and
> readable, looks very testable, and brings in the necessary
> functionality.

That was the point of writing that code, actually.

Currently it's in a very bad state for the scikit, as it's far slower
and more limited than it should be, but I plan to clean it up
eventually (I'd love to do this at the post-NIPS sprint, but personal
life makes that complicated).

The main problems with it right now are:

 0. The initialization is left out of it, and it's actually pretty
important for good performance. A few widely spaced random samples
from the space of possibilities would be ideal (see the first sketch
after this list).

 1. Simulated annealing is a pretty naive way of maximizing over the
Gaussian process. It starts from a single point and has no knowledge
of where the objective function is good or bad. Something that is
aware of the previously evaluated points would be a better idea. Is
there any implementation of a GA-like optimizer for scipy we could
use? We could also run more than one simulated annealing pass,
starting from many different good points, to better explore the state
space (second sketch below).

 2. The simulated annealing code currently has no way of specifying
the boundaries of the state space. This is very bad: the variance of a
Gaussian process grows the further you get from the known points, so
the annealing will naively keep exploring toward infinity and find
ridiculously large upper confidence bounds on the optimal value (the
third sketch below shows one way to keep it inside the box).

 3. It has no clear way of dealing with discrete variables, nor of
choosing a better kernel for the GP. Tuning the kernel is easy, but
dealing with discrete hyperparameters is not (both the simulated
annealing code and the kernel would have to be adapted); the fourth
sketch below shows one possible workaround for the discrete case.
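
Roughly what I have in mind for point 0, as an untested sketch (the
function name and the oversample-then-maximin heuristic are just for
illustration):

import numpy as np

def random_init(bounds, n_points=5, seed=None):
    # Draw a few widely spaced starting points inside the box `bounds`,
    # given as a list of (low, high) pairs, one per dimension.
    rng = np.random.RandomState(seed)
    bounds = np.asarray(bounds, dtype=float)
    n_dims = len(bounds)
    # Oversample uniformly, then greedily keep the candidates that are
    # farthest (in the maximin sense) from the ones already chosen.
    candidates = rng.uniform(bounds[:, 0], bounds[:, 1],
                             size=(20 * n_points, n_dims))
    chosen = [candidates[0]]
    for _ in range(n_points - 1):
        dists = np.min([np.sum((candidates - c) ** 2, axis=1)
                        for c in chosen], axis=0)
        chosen.append(candidates[np.argmax(dists)])
    return np.array(chosen)

A proper Latin hypercube or maximin design would spread the points even
more evenly, but even this beats starting from a single arbitrary point.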
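
For point 1, a sketch of the multi-start idea, here using bounded
L-BFGS-B restarts in place of the annealing passes; it assumes a fitted
sklearn GaussianProcess with the predict(X, eval_MSE=True) signature,
and kappa is just an illustrative exploration weight:

import numpy as np
from scipy import optimize

def maximize_ucb(gp, bounds, starts, kappa=2.0):
    # Upper confidence bound of the GP posterior, negated so we can
    # minimize it with scipy.
    def neg_ucb(x):
        mu, mse = gp.predict(np.atleast_2d(x), eval_MSE=True)
        return -(mu[0] + kappa * np.sqrt(mse[0]))

    best_x, best_val = None, np.inf
    # Restart a bounded local search from every point in `starts`
    # (e.g. the best previously evaluated inputs plus a few random
    # ones) and keep the overall winner.
    for x0 in starts:
        x_opt, val, _ = optimize.fmin_l_bfgs_b(neg_ucb, x0,
                                               approx_grad=True,
                                               bounds=bounds)
        if val < best_val:
            best_x, best_val = x_opt, val
    return best_x, -best_val

The same loop works with annealing passes instead of L-BFGS-B; the
important part is seeding the restarts with the good points we already
know about.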
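
For point 2, the fix is basically to clip every proposal back into the
box; an untested sketch of what the annealer's step function could look
like:

import numpy as np

def bounded_step(x, bounds, scale=0.1, rng=np.random):
    # Propose a random-walk step and clip it back into the box, so the
    # search cannot wander off toward the regions where the GP variance
    # (and hence the upper confidence bound) blows up.
    bounds = np.asarray(bounds, dtype=float)
    widths = bounds[:, 1] - bounds[:, 0]
    proposal = x + rng.normal(scale=scale * widths)
    return np.clip(proposal, bounds[:, 0], bounds[:, 1])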
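
And for point 3, one possible (hacky, untested) workaround for discrete
hyperparameters is to keep the search continuous and snap the discrete
dimensions to their nearest admissible value before evaluating
anything; the kernel question is harder and this sketch doesn't address
it:

import numpy as np

def snap_discrete(x, grids):
    # `grids` maps a dimension index to the sorted array of values that
    # dimension may take, e.g. {1: np.array([16, 32, 64, 128])} for a
    # discrete hyperparameter. The annealer keeps proposing continuous
    # points and we round them before handing them to the GP or the
    # objective function.
    x = np.array(x, dtype=float)
    for dim, values in grids.items():
        x[dim] = values[np.argmin(np.abs(values - x[dim]))]
    return x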


-- 
 - Alexandre
