Hi Alexandre, I haven't been checking my email and I heard about your
message last night from a slightly drunken Gramfort, Grisel, Pinto and
Poilvert in French in a loud bar here in Cambridge. Thanks for the PR
:)

I think there are some findings on this topic that would be good and
appropriate for scikits, and easy to do.

1. Random sampling should generally be used instead of grid search.
They may feel similar, but theoretically and empirically, sampling
uniformly from a hypercube parameter space will typically work better
than iterating over the points of a grid lattice for hyper-parameter
optimization.  For some response functions the lattice can be slightly
more efficient, but it risks being terribly inefficient: when only a
few of the parameters really matter, a grid spends most of its trials
repeating the same few values of the important ones. So if you have
to pick one, pick uniform sampling.

2. Gaussian process with Expected Improvement global optimization.
This is an established technique for global optimization that has
about the right scaling properties to be good for hyper-parameter
optimization.  I think you probably can't do much better than a
Gaussian Process (GP) with Expected Improvement (EI) for optimizing
the parameters of say, an SVM, but we can only try and see (and
compare with the variety of other techniques for global optimization).
The scikit already has GP fitting in it and scipy has good
optimization routines, so why not put them together to make a
hyper-parameter optimizer? I think this would be a good addition to
the scikit, and not too hard (the hard parts are already done).
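
To make point 2 a bit more concrete, here is a minimal sketch of the
Expected Improvement criterion for a minimization problem (e.g.
validation error). It assumes the GP posterior at a candidate setting
is Gaussian with mean mu and standard deviation sigma; in a real
optimizer those would come from the scikit's GP fitting, and the
search for the best candidate would use a scipy optimizer. The
function names and the candidate numbers below are made up for
illustration and are not an existing API.

```python
import math


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best_so_far):
    """EI for minimization at a point whose posterior is N(mu, sigma^2).

    best_so_far is the lowest loss observed in the trials run so far.
    """
    if sigma <= 0.0:
        # No uncertainty left: the improvement is known exactly.
        return max(best_so_far - mu, 0.0)
    z = (best_so_far - mu) / sigma
    return (best_so_far - mu) * normal_cdf(z) + sigma * normal_pdf(z)


# Hypothetical posterior (mean, std) of validation error for two
# SVM settings; the numbers are invented for illustration only.
candidates = {
    "C=1": (0.20, 0.01),    # slightly better mean, very certain
    "C=10": (0.22, 0.10),   # slightly worse mean, very uncertain
}
best_so_far = 0.21

scores = {name: expected_improvement(mu, sigma, best_so_far)
          for name, (mu, sigma) in candidates.items()}
next_trial = max(scores, key=scores.get)
print(scores, next_trial)
```

With these made-up numbers the uncertain candidate gets the higher
EI, which is the behaviour that makes the criterion useful: it trades
off a good predicted mean against the chance of a large improvement
where the model is still unsure.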

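As a rough illustration of point 1, the sketch below compares a grid
and uniform sampling with the same budget of 16 trials on the unit
square. It only uses the standard library; the helper names and the
toy response function (in which only the first parameter matters) are
assumptions made up for this example, not scikit code.

```python
import itertools
import random


def grid_points(n_per_axis, dim):
    """All points of a regular lattice on the unit hypercube [0, 1]^dim."""
    axis = [(i + 0.5) / n_per_axis for i in range(n_per_axis)]
    return list(itertools.product(axis, repeat=dim))


def random_points(n_trials, dim, seed=0):
    """n_trials points drawn uniformly from the unit hypercube [0, 1]^dim."""
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(dim)) for _ in range(n_trials)]


def distinct_values_per_axis(points):
    """Number of different values each parameter took across all trials."""
    dim = len(points[0])
    return [len({p[d] for p in points}) for d in range(dim)]


def toy_response(point):
    # Hypothetical loss: only the first parameter matters (optimum at 0.3).
    return (point[0] - 0.3) ** 2


grid = grid_points(n_per_axis=4, dim=2)    # 4 x 4 = 16 trials
rand = random_points(n_trials=16, dim=2)   # same budget of 16 trials

print("distinct values per parameter, grid:  ", distinct_values_per_axis(grid))
print("distinct values per parameter, random:", distinct_values_per_axis(rand))
print("best toy loss, grid:  ", min(map(toy_response, grid)))
print("best toy loss, random:", min(map(toy_response, rand)))
```

The grid only ever tries 4 different values of each parameter, while
the random draws try a different value of each parameter on every
trial. When just one parameter drives the response, the grid has
effectively run 4 experiments and the sampler 16, for the same cost.
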
- James

On Mon, Nov 14, 2011 at 10:06 PM, Alexandre Passos
<[email protected]> wrote:
> Hello, scikiters,
>
> Recent work by James Bergstra demonstrated that careful hyperparameter
> optimization, as well as careless random sampling, is often better
> than manual searching for many problems. You can see results in the
> following nips paper:
> http://people.fas.harvard.edu/~bergstra/files/pub/11_nips_hyperopt.pdf
>
> I wonder if there's interest in adding some simple versions of these
> techniques to the scikit's very useful GridSearchCV? There is code
> available https://github.com/jaberg/hyperopt but it seems to be
> research code and it uses theano, so it's not applicable to the
> scikit.
>
> This could be a nice sprint project for someone.
> --
>  - Alexandre
>
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
