I should add: if anyone has thoughts about the design, I'd be interested
to hear them. It's easier to redesign things now, before more code is
written.

- James

On Tue, Feb 19, 2013 at 5:36 PM, James Bergstra
<james.bergs...@gmail.com> wrote:
> Further to this: I started a project on github to look at how to
> combine hyperopt with sklearn.
> https://github.com/jaberg/hyperopt-sklearn
>
> I've only wrapped one algorithm so far: Perceptron
> https://github.com/jaberg/hyperopt-sklearn/blob/master/hpsklearn/perceptron.py
>
> My idea is that little files like perceptron.py would encode
> (a) domain expertise about what values make sense for a particular
> hyper-parameter (see the `search_space()` function), and
> (b) a sklearn-style fit/predict interface that encapsulates the search
> over those hyper-parameters (see `AutoPerceptron`); a rough sketch
> follows below.
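>
> A rough sketch of the shape I have in mind (the names search_space and
> AutoPerceptron match the repo; the ranges and the train/validation
> split below are just illustrative, not the actual code):
>
>     import numpy as np
>     from hyperopt import fmin, tpe, hp, space_eval
>     from sklearn.linear_model import Perceptron
>
>     def search_space():
>         # (a) which values are worth trying for each hyper-parameter
>         # (illustrative ranges; the real ones in the repo differ)
>         return {
>             'penalty': hp.choice('penalty', [None, 'l2', 'l1', 'elasticnet']),
>             'alpha': hp.loguniform('alpha', np.log(1e-6), np.log(1e-1)),
>             'fit_intercept': hp.choice('fit_intercept', [True, False]),
>         }
>
>     class AutoPerceptron(object):
>         # (b) sklearn-style estimator that hides the hyper-parameter search
>         def __init__(self, max_evals=100):
>             self.max_evals = max_evals
>
>         def fit(self, X, y):
>             # X, y assumed to be numpy arrays; shuffle, then hold out 20%
>             idx = np.random.RandomState(0).permutation(len(X))
>             X, y = X[idx], y[idx]
>             n_train = int(0.8 * len(X))
>             space = search_space()
>
>             def loss(params):
>                 model = Perceptron(**params).fit(X[:n_train], y[:n_train])
>                 return -model.score(X[n_train:], y[n_train:])
>
>             best = fmin(loss, space, algo=tpe.suggest,
>                         max_evals=self.max_evals)
>             # hp.choice entries come back as indices; space_eval maps
>             # them back to the chosen values
>             self.best_params_ = space_eval(space, best)
>             self.model_ = Perceptron(**self.best_params_).fit(X, y)
>             return self
>
>         def predict(self, X):
>             return self.model_.predict(X)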
>
> I just wrote it up today and I've only tried it on one data set, but
> at least on Iris it improves the default Perceptron's accuracy from
> 70% to 85%. Better than nothing! Of course it takes 100 times as long
> when hyperopt is run serially, but 0.05 seconds and 5 seconds are both
> pretty quick. (And who would have thought that the Perceptron has 8
> hyper-parameters?)
>
> I'm not planning to do any more work on this in the very short term,
> so if anyone is curious to adapt the Perceptron example to other
> algorithms, send PRs :)
>
> - James
>
> On Mon, Feb 11, 2013 at 4:10 PM, James Bergstra
> <james.bergs...@gmail.com> wrote:
>> Interesting to see this thread revived! FYI I've made hyperopt a lot
>> friendlier since that original posting.
>>
>> http://jaberg.github.com/hyperopt/
>>
>> pip install hyperopt
>>
>> 1. It has docs.
>> 2. The minimization interface is based on an fmin() function, which
>> should be pretty accessible (toy example below).
>> 3. It can be installed straight from PyPI.
>> 4. It only depends on numpy, scipy, and networkx (pymongo and nose are
>> optional).
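>>
>> A minimal example looks roughly like this (toy objective, not a real
>> model):
>>
>>     from hyperopt import fmin, tpe, hp
>>
>>     # minimize a toy objective over a uniform interval
>>     best = fmin(fn=lambda x: (x - 3) ** 2,
>>                 space=hp.uniform('x', -10, 10),
>>                 algo=tpe.suggest,
>>                 max_evals=100)
>>     print(best)  # dict of the best values found; x should land near 3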
>>
>> Adding new algorithms to it (SMBO based on GPs and on regression
>> trees) is a work in progress. The current non-trivial algorithm in
>> there (TPE) is probably relatively good for high-dimensional spaces,
>> but for lower-dimensional search spaces I think those other algorithms
>> may be more efficient. I'll keep the list posted on how that comes
>> along (or feel free to get in touch if you'd like to help out).
>>
>> - James
>>
>> On Tue, Dec 6, 2011 at 10:36 AM, James Bergstra
>> <james.bergs...@gmail.com> wrote:
>>> On Mon, Dec 5, 2011 at 4:38 PM, Alexandre Passos <alexandre...@gmail.com> 
>>> wrote:
>>>> On Mon, Dec 5, 2011 at 16:26, James Bergstra <james.bergs...@gmail.com> 
>>>> wrote:
>>>>>
>>>>> This is definitely a good idea. I think random sampling is still
>>>>> useful, though. It is not hard to get into settings where the grid
>>>>> is in theory very large and the user's budget is a tiny fraction of
>>>>> the full grid.
>>>>
>>>> I'd like to implement this, but I'm stuck on a way of specifying
>>>> distributions over each axis that is obvious, readable, and flexible
>>>> (e.g., sometimes you want to sample across orders of magnitude, say
>>>> 0.001, 0.01, 0.1, 1, ..., and sometimes you want to sample uniformly:
>>>> 0.1, 0.2, 0.3, 0.4, ...).
>>>
>>> This is essentially why the algorithms in my "hyperopt" project [1]
>>> are implemented as they are. They work for a variety of kinds of
>>> distributions (uniform, log-uniform, normal, log-normal, randint),
>>> including what I call "conditional" ones. For example, suppose you're
>>> trying to optimize all the elements of a learning pipeline, and even
>>> the choice of elements.  You only want to pick the PCA pre-processing
>>> parameters *if* you're actually doing PCA, because otherwise your
>>> parameter optimization algorithm might attribute the score (result /
>>> performance) to the PCA parameter choices that you know very well were
>>> irrelevant.
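>>>
>>> As a rough illustration (the parameter names here are made up for the
>>> example), a conditional space looks something like:
>>>
>>>     import numpy as np
>>>     from hyperopt import hp
>>>
>>>     # Each branch of hp.choice carries its own parameters, so PCA's
>>>     # n_components is only sampled when the PCA branch is chosen.
>>>     space = {
>>>         'preprocessing': hp.choice('preprocessing', [
>>>             {'kind': 'none'},
>>>             {'kind': 'pca',
>>>              'n_components': hp.quniform('pca_n_components', 2, 30, 1)},
>>>         ]),
>>>         'C': hp.loguniform('C', np.log(1e-3), np.log(1e3)),
>>>     }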
>>>
>>> The hyperopt implementations are relatively tricky, and at this point
>>> I don't think they could be rewritten in a simple, straightforward
>>> way that would make them scikit-learn compatible. I think
>>> scikit-learn users would be better served by hand-written
>>> hyper-parameter optimizers for certain specific, particularly useful
>>> pipelines. Other customized pipelines can use grid search, random
>>> search, or manual search, or the docs could refer users to hyperopt
>>> as it matures.
>>>
>>> - James
>>>
>>> [1]  https://github.com/jaberg/hyperopt
