Hi Alexandre,

I haven't been checking my email, and I heard about your message last night from a slightly drunken Gramfort, Grisel, Pinto and Poilvert, in French, in a loud bar here in Cambridge. Thanks for the PR :)
I think there are some findings on this topic that would be a good fit for the scikit, and easy to add.

1. Random sampling should generally be used instead of grid search. The two may feel similar, but both theoretically and empirically, sampling points uniformly from the hyper-parameter hypercube typically works better for hyper-parameter optimization than iterating over the points of a grid lattice. For some response functions the lattice can be slightly more efficient, but it risks being terribly inefficient. So if you have to pick one, pick uniform sampling. (A rough sketch of what I mean is appended below the quoted message.)

2. Gaussian Process with Expected Improvement for global optimization. This is an established technique for global optimization, and it has about the right scaling properties to be good for hyper-parameter optimization. I think you probably can't do much better than a Gaussian Process (GP) with Expected Improvement (EI) for optimizing the parameters of, say, an SVM, but we can only try and see (and compare with the variety of other global-optimization techniques). The scikit already has GP fitting, and scipy has good optimization routines, so why not put them together to make a hyper-parameter optimizer? I think this would be a good addition to the scikit, and not too hard (the hard parts are already done). (A second sketch is appended at the very end.)

- James

On Mon, Nov 14, 2011 at 10:06 PM, Alexandre Passos <[email protected]> wrote:
> Hello, scikiters,
>
> Recent work by James Bergstra demonstrated that careful hyperparameter
> optimization, as well as careless random sampling, is often better
> than manual searching for many problems. You can see results in the
> following NIPS paper:
> http://people.fas.harvard.edu/~bergstra/files/pub/11_nips_hyperopt.pdf
>
> I wonder if there's interest in adding some simple versions of these
> techniques to the scikit's very useful GridSearchCV? There is code
> available at https://github.com/jaberg/hyperopt but it seems to be
> research code and it uses Theano, so it's not applicable to the
> scikit.
>
> This could be a nice sprint project for someone.
> --
> - Alexandre
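P.S. Here is the kind of thing I have in mind for (1): draw the SVM's C and gamma log-uniformly from a box instead of walking a grid, and keep whichever draw cross-validates best. This is only a sketch: the module paths below are those of a current scikit-learn release rather than the 2011 layout, and the budget of 25 draws, the iris data and the parameter ranges are arbitrary choices for illustration.

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    rng = np.random.RandomState(0)
    X, y = load_iris(return_X_y=True)

    best_score, best_params = -np.inf, None
    for _ in range(25):  # same evaluation budget as a 5x5 grid
        # sample log-uniformly from the box [1e-3, 1e3] x [1e-4, 1e1]
        params = {"C": 10 ** rng.uniform(-3, 3),
                  "gamma": 10 ** rng.uniform(-4, 1)}
        score = cross_val_score(SVC(**params), X, y, cv=5).mean()
        if score > best_score:
            best_score, best_params = score, params

    print(best_params, best_score)

The intuition is that N random draws give you N distinct values along every individual parameter axis, while an N-point grid gives only about N**(1/k) values per axis; when only a few of the k parameters really matter, sampling wins.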

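P.P.S. And a sketch of (2): sequential optimization of a single hyper-parameter (log10 of the SVM's C) with a GP and Expected Improvement. Again, only a sketch under assumptions: GaussianProcessRegressor and scipy.optimize.minimize are the names in current releases (the scikit's GP class had a different name in 2011), the Matern kernel, the bounds and the budgets are arbitrary, and a real implementation would handle several parameters, noise and restarts more carefully.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm
    from sklearn.datasets import load_iris
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    def objective(log_C):
        # cross-validation error, to be minimized
        return 1.0 - cross_val_score(SVC(C=10 ** log_C), X, y, cv=5).mean()

    rng = np.random.RandomState(0)
    samples = list(rng.uniform(-3, 3, size=3))  # a few random points to seed the GP
    values = [objective(s) for s in samples]

    for _ in range(10):
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5),
                                      alpha=1e-6, normalize_y=True)
        gp.fit(np.array(samples)[:, None], values)

        def neg_ei(z):
            # negative Expected Improvement at z under the current GP posterior
            mu, sigma = gp.predict(np.atleast_2d(z), return_std=True)
            mu, sigma = mu[0], sigma[0] + 1e-12
            best = min(values)
            gamma = (best - mu) / sigma
            return -((best - mu) * norm.cdf(gamma) + sigma * norm.pdf(gamma))

        # maximize EI with a few random restarts, then evaluate the winner for real
        starts = rng.uniform(-3, 3, size=5)
        result = min((minimize(neg_ei, s, bounds=[(-3.0, 3.0)]) for s in starts),
                     key=lambda r: r.fun)
        samples.append(float(result.x[0]))
        values.append(objective(samples[-1]))

    print("best C = 10 **", samples[int(np.argmin(values))])

The inner EI maximization never touches the data (no cross-validation), so it can afford many restarts; only its arg-max costs an actual CV evaluation, which is exactly why this scales well when each evaluation is an expensive model fit.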