Hi there,

I presume some of you may have already seen this, but if not, caret in R is
a nice example of how to do model selection with a unified interface to a
variety of classification and regression methods:

http://caret.r-forge.r-project.org/

James


On Tue, Feb 19, 2013 at 7:12 PM, James Bergstra <james.bergs...@gmail.com> wrote:

> I should add: if anyone has thoughts about the design, I'm interested
> to get your input. Easier to redesign things now, before more code is
> written.
>
> - James
>
> On Tue, Feb 19, 2013 at 5:36 PM, James Bergstra
> <james.bergs...@gmail.com> wrote:
> > Further to this: I started a project on github to look at how to
> > combine hyperopt with sklearn.
> > https://github.com/jaberg/hyperopt-sklearn
> >
> > I've only wrapped one algorithm so far: the Perceptron
> >
> > https://github.com/jaberg/hyperopt-sklearn/blob/master/hpsklearn/perceptron.py
> >
> > My idea is that little files like perceptron.py would encode
> > (a) domain expertise about what values make sense for a particular
> > hyper-parameter (see the `search_space()` function), and
> > (b) an sklearn-style fit/predict interface that encapsulates search
> > over those hyper-parameters (see `AutoPerceptron`).
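> > Roughly, the shape I have in mind is something like the sketch below.
> > This is a hypothetical illustration; the names, parameter ranges, and
> > train/validation split are made up and are not the actual contents of
> > perceptron.py.
> >
> >     import numpy as np
> >     from hyperopt import fmin, hp, tpe, space_eval
> >     from sklearn.linear_model import Perceptron
> >
> >     def search_space():
> >         # domain knowledge about plausible values for each hyper-parameter
> >         return {
> >             'penalty': hp.choice('penalty', [None, 'l2', 'l1', 'elasticnet']),
> >             'alpha': hp.loguniform('alpha', np.log(1e-6), np.log(1e-1)),
> >             'eta0': hp.loguniform('eta0', np.log(1e-3), np.log(10.0)),
> >             'fit_intercept': hp.choice('fit_intercept', [True, False]),
> >         }
> >
> >     class AutoPerceptron(object):
> >         """Perceptron whose hyper-parameters are tuned by hyperopt in fit()."""
> >         def __init__(self, max_evals=100):
> >             self.max_evals = max_evals
> >
> >         def fit(self, X, y):
> >             # shuffle, then hold out a validation set to score candidates
> >             rng = np.random.RandomState(0)
> >             perm = rng.permutation(len(X))
> >             X, y = X[perm], y[perm]
> >             n_train = int(0.8 * len(X))
> >
> >             def objective(params):
> >                 model = Perceptron(**params)
> >                 model.fit(X[:n_train], y[:n_train])
> >                 # fmin minimizes, so return negative validation accuracy
> >                 return -model.score(X[n_train:], y[n_train:])
> >
> >             space = search_space()
> >             best = fmin(objective, space, algo=tpe.suggest,
> >                         max_evals=self.max_evals)
> >             self.best_params_ = space_eval(space, best)
> >             self.model_ = Perceptron(**self.best_params_).fit(X, y)
> >             return self
> >
> >         def predict(self, X):
> >             return self.model_.predict(X)
> >
> > Using it then looks like plain sklearn, e.g.
> > AutoPerceptron().fit(X_train, y_train).predict(X_test).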
> >
> > I just wrote it up today and I've only tried it on one data set, but
> > at least on Iris it improves the default Perceptron's accuracy from
> > 70% to 85%. Better than nothing! Of course it takes 100 times as long
> > when hyperopt is run serially, but 0.05 seconds and 5 seconds are both
> > pretty quick. (And who would have thought that the Perceptron would
> > have 8 hyper-parameters??)
> >
> > I'm not planning to do any more work on this in the very short term,
> > so if anyone is curious to adapt the Perceptron example to other
> > algorithms, send PRs :)
> >
> > - James
> >
> > On Mon, Feb 11, 2013 at 4:10 PM, James Bergstra
> > <james.bergs...@gmail.com> wrote:
> >> Interesting to see this thread revived! FYI I've made hyperopt a lot
> >> friendlier since that original posting.
> >>
> >> http://jaberg.github.com/hyperopt/
> >>
> >> pip install hyperopt
> >>
> >> 1. It has docs.
> >> 2. The minimization interface is based on an fmin() function, which
> >> should be pretty accessible.
> >> 3. It can be installed straight from PyPI.
> >> 4. It just depends on numpy, scipy, and networkx (pymongo and nose are
> >> optional).
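> >>
> >> Minimizing a toy objective looks roughly like this (a minimal sketch
> >> along the lines of the examples in the docs):
> >>
> >>     from hyperopt import fmin, hp, tpe
> >>
> >>     # search for the x in [-10, 10] that minimizes (x - 3)**2
> >>     best = fmin(fn=lambda x: (x - 3) ** 2,
> >>                 space=hp.uniform('x', -10, 10),
> >>                 algo=tpe.suggest,
> >>                 max_evals=100)
> >>
> >>     # best is a dict of the chosen values, e.g. something close to {'x': 3}
> >>     print(best)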
> >>
> >> Adding new algorithms to it (SMBO, i.e. sequential model-based
> >> optimization based on Gaussian processes and regression trees) is work
> >> in progress. The current non-trivial algorithm that's in there (TPE) is
> >> probably relatively good for high-dimensional spaces, but for
> >> lower-dimensional search spaces I think those other algorithms might be
> >> more efficient. I'll keep the list posted on how that comes along (or
> >> feel free to get in touch if you'd like to help out).
> >>
> >> - James
> >>
> >> On Tue, Dec 6, 2011 at 10:36 AM, James Bergstra
> >> <james.bergs...@gmail.com> wrote:
> >>> On Mon, Dec 5, 2011 at 4:38 PM, Alexandre Passos <alexandre...@gmail.com> wrote:
> >>>> On Mon, Dec 5, 2011 at 16:26, James Bergstra <james.bergs...@gmail.com> wrote:
> >>>>>
> >>>>> This is definitely a good idea. I think random sampling is still
> >>>>> useful though. It is not hard to get into settings where the grid is
> >>>>> in theory very large and the user has a budget that is a tiny
> >>>>> fraction of the full grid.
> >>>>
> >>>> I'd like to implement this, but I'm stuck on a nice way of specifying
> >>>> distributions over each axis that is obvious, readable, and flexible:
> >>>> sometimes you want to sample across orders of magnitude (say, 0.001,
> >>>> 0.01, 0.1, 1, etc.), and sometimes you want to sample uniformly (0.1,
> >>>> 0.2, 0.3, 0.4, ...).
> >>>
> >>> This is essentially why the algorithms in my "hyperopt" project [1]
> >>> are implemented as they are. They work for a variety of
> >>> distributions (uniform, log-uniform, normal, log-normal, randint),
> >>> including what I call "conditional" ones. For example, suppose you're
> >>> trying to optimize all the elements of a learning pipeline, and even
> >>> the choice of elements.  You only want to pick the PCA pre-processing
> >>> parameters *if* you're actually doing PCA, because otherwise your
> >>> parameter optimization algorithm might attribute the score (result /
> >>> performance) to the PCA parameter choices that you know very well were
> >>> irrelevant.
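> >>>
> >>> As a rough illustration of what I mean by a conditional space (the
> >>> names here are hypothetical, just to show the structure):
> >>>
> >>>     from hyperopt import hp
> >>>
> >>>     # PCA parameters are only sampled when the 'pca' branch is chosen,
> >>>     # so the optimizer never credits or blames them on runs without PCA.
> >>>     preprocessing = hp.choice('preprocessing', [
> >>>         {'kind': 'none'},
> >>>         {'kind': 'pca',
> >>>          'n_components': hp.quniform('pca_n_components', 2, 50, 1),
> >>>          'whiten': hp.choice('pca_whiten', [True, False])},
> >>>     ])
> >>>
> >>>     space = {
> >>>         'preprocessing': preprocessing,
> >>>         # log-uniform over roughly [e**-5, e**5]
> >>>         'classifier_C': hp.loguniform('classifier_C', -5, 5),
> >>>     }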
> >>>
> >>> hyperopt's implementations are relatively tricky, and at this point I
> >>> don't think they could be done in a simple, straightforward way that
> >>> would make them scikit-learn compatible. I think scikit-learn users
> >>> would be better served by hand-written hyper-parameter optimizers for
> >>> certain specific, particularly useful pipelines. Other customized
> >>> pipelines can use grid search, random search, or manual search, or the
> >>> docs could refer people to hyperopt as it matures.
> >>>
> >>> - James
> >>>
> >>> [1]  https://github.com/jaberg/hyperopt
>
>