Further to this: I started a project on github to look at how to
combine hyperopt with sklearn.
https://github.com/jaberg/hyperopt-sklearn

I've only wrapped one algorithm so far: Perceptron
https://github.com/jaberg/hyperopt-sklearn/blob/master/hpsklearn/perceptron.py

My idea is that little files like perceptron.py would encode
(a) domain expertise about what values make sense for a particular
hyper-parameter (see the `search_space()` function), and
(b) a sklearn-style fit/predict interface that encapsulates search
over those hyper-parameters (see `AutoPerceptron`). A rough sketch of
the pattern follows.
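
The class and function names below match perceptron.py, but the
parameter ranges, the max_evals default, and the cross-validation
plumbing are illustrative guesses rather than the file's actual
contents:

    import numpy as np
    from hyperopt import fmin, hp, space_eval, tpe
    from sklearn.linear_model import Perceptron
    from sklearn.model_selection import cross_val_score

    def search_space():
        # (a) domain expertise: which hyper-parameters to search,
        # and on what scale each one should be sampled
        return {
            'penalty': hp.choice('penalty', [None, 'l2', 'l1', 'elasticnet']),
            'alpha': hp.loguniform('alpha', np.log(1e-6), np.log(1e-1)),
            'eta0': hp.loguniform('eta0', np.log(1e-3), np.log(10.0)),
        }

    class AutoPerceptron(object):
        # (b) sklearn-style wrapper whose fit() runs the hyper-parameter search
        def __init__(self, max_evals=50):
            self.max_evals = max_evals

        def fit(self, X, y):
            space = search_space()

            def objective(params):
                # minimize negative cross-validated accuracy
                return -cross_val_score(Perceptron(**params), X, y, cv=3).mean()

            best = fmin(objective, space, algo=tpe.suggest,
                        max_evals=self.max_evals)
            self.best_params_ = space_eval(space, best)
            self.estimator_ = Perceptron(**self.best_params_).fit(X, y)
            return self

        def predict(self, X):
            return self.estimator_.predict(X)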

I just wrote it up today and have only tried it on one data set, but
at least on Iris it improves the default Perceptron's accuracy from
70% to 85%. Better than nothing! Of course it takes 100 times as long
when hyperopt is run serially, but 0.05 seconds and 5 seconds are both
pretty quick. (And who would have thought that the Perceptron has 8
hyper-parameters?)
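
If you want to try it, hypothetical usage on Iris looks something like
this (it assumes AutoPerceptron needs no constructor arguments; exact
accuracies will vary from run to run):

    from hpsklearn.perceptron import AutoPerceptron
    from sklearn.datasets import load_iris
    from sklearn.linear_model import Perceptron
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # baseline: default Perceptron
    default = Perceptron().fit(X_train, y_train)
    print("default Perceptron:", accuracy_score(y_test, default.predict(X_test)))

    # hyper-parameter search wrapped behind fit/predict
    auto = AutoPerceptron().fit(X_train, y_train)
    print("AutoPerceptron:    ", accuracy_score(y_test, auto.predict(X_test)))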

I'm not planning to do any more work on this in the very short term,
so if anyone is curious to adapt the Perceptron example to other
algorithms, send PRs :)

- James

On Mon, Feb 11, 2013 at 4:10 PM, James Bergstra
<james.bergs...@gmail.com> wrote:
> Interesting to see this thread revived! FYI I've made hyperopt a lot
> friendlier since that original posting.
>
> http://jaberg.github.com/hyperopt/
>
> pip install hyperopt
>
> 1. It has docs.
> 2. The minimization interface is based on an fmin() function that
> should be pretty accessible (see the sketch after this list).
> 3. It can be installed straight from PyPI.
> 4. It just depends on numpy, scipy, and networkx (pymongo and nose are optional).
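
As a toy illustration of the fmin() interface (a made-up objective,
not taken from the docs):

    from hyperopt import fmin, hp, tpe

    best = fmin(fn=lambda x: (x - 3) ** 2,       # objective to minimize
                space=hp.uniform('x', -10, 10),  # search space for x
                algo=tpe.suggest,                # the TPE algorithm
                max_evals=100)
    print(best)  # e.g. {'x': 2.97}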
>
> Adding new algorithms to it (SMBO based on GPs and regression trees)
> is work in progress. The current non-trivial algorithm that's in there
> (TPE) is probably relatively good for high-dimensional spaces, but for
> lower-dimensional search spaces I think these other algos might be
> more efficient. I'll keep the list posted on how that comes along (or
> feel free to get in touch if you'd like to help out).
>
> - James
>
> On Tue, Dec 6, 2011 at 10:36 AM, James Bergstra
> <james.bergs...@gmail.com> wrote:
>> On Mon, Dec 5, 2011 at 4:38 PM, Alexandre Passos <alexandre...@gmail.com> 
>> wrote:
>>> On Mon, Dec 5, 2011 at 16:26, James Bergstra <james.bergs...@gmail.com> 
>>> wrote:
>>>>
>>>> This is definitely a good idea. I think randomly sampling is still
>>>> useful though. It is not hard to get into settings where the grid is
>>>> in theory very large and the user has a budget that is a tiny fraction
>>>> of the full grid.
>>>
>>> I'd like to implement this, but I'm stuck on a nice way of specifying
>>> distributions over each axis (e.g., sometimes you want to sample
>>> across orders of magnitude: 0.001, 0.01, 0.1, 1, etc., and sometimes
>>> you want to sample uniformly: 0.1, 0.2, 0.3, 0.4, ...) in a way that
>>> is obvious, readable, and flexible.
>>
>> This is essentially why the algorithms in my "hyperopt" project [1]
>> are implemented as they are. They work for a variety of kinds of
>> distributions (uniform, log-uniform, normal, log-normal, randint),
>> including what I call "conditional" ones. For example, suppose you're
>> trying to optimize all the elements of a learning pipeline, and even
>> the choice of elements.  You only want to pick the PCA pre-processing
>> parameters *if* you're actually doing PCA, because otherwise your
>> parameter optimization algorithm might attribute the score (result /
>> performance) to the PCA parameter choices that you know very well were
>> irrelevant.
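
To make "conditional" concrete, a space like the one below only exposes
the PCA parameters on the PCA branch, so the optimizer never gets credit
or blame for them when PCA isn't used. (The names and ranges here are
made up for illustration, not taken from hyperopt-sklearn.)

    from hyperopt import hp

    preprocessing = hp.choice('preprocessing', [
        {'kind': 'none'},
        {'kind': 'pca',
         'n_components': hp.quniform('pca_n_components', 2, 50, 1),
         'whiten': hp.choice('pca_whiten', [False, True])},
    ])

    space = {
        'preprocessing': preprocessing,
        # loguniform bounds are given in log space
        'classifier_C': hp.loguniform('classifier_C', -10, 10),
    }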
>>
>> hyperopt's implementations are relatively tricky, and at this point I
>> don't think they could be rewritten in a straightforward, simple way
>> that would make them scikit-learn compatible.  I think scikit-learn
>> users would be better served by hand-written hyper-parameter
>> optimizers for certain specific, particularly useful pipelines.  Other
>> customized pipelines can use grid search, random search, or manual
>> search, or the docs could refer users to hyperopt as it matures.
>>
>> - James
>>
>> [1]  https://github.com/jaberg/hyperopt
