Interesting to see this thread revived! FYI I've made hyperopt a lot
friendlier since that original posting.

http://jaberg.github.com/hyperopt/

pip install hyperopt

1. It has docs.
2. The minimization interface is based on an fmin() function that should
be pretty accessible (there's a short sketch after this list).
3. It can be installed straight from PyPI.
4. It only depends on numpy, scipy, and networkx (pymongo and nose are optional).
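
Roughly, using fmin() looks like this (the objective function and search
space below are made up purely for illustration):

    from hyperopt import fmin, tpe, hp

    def objective(args):
        # args arrives as a dict with the same shape as the space below
        x, y = args['x'], args['y']
        return (x - 1.0) ** 2 + y   # value to minimize

    space = {
        'x': hp.uniform('x', -5, 5),     # sampled uniformly on [-5, 5]
        'y': hp.loguniform('y', -7, 0),  # log scale; bounds are in log space
    }

    best = fmin(objective, space, algo=tpe.suggest, max_evals=100)
    print(best)

The same space specification works with either random search
(rand.suggest) or TPE (tpe.suggest).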

Adding new algorithms to it (SMBO based on GPs and regression trees)
is work in progress. The non-trivial algorithm that's in there now
(TPE) is probably relatively good for high-dimensional spaces, but for
lower-dimensional search spaces I think those other algorithms might be
more efficient. I'll keep the list posted on how that comes along (or
feel free to get in touch if you'd like to help out).

- James

On Tue, Dec 6, 2011 at 10:36 AM, James Bergstra
<james.bergs...@gmail.com> wrote:
> On Mon, Dec 5, 2011 at 4:38 PM, Alexandre Passos <alexandre...@gmail.com> 
> wrote:
>> On Mon, Dec 5, 2011 at 16:26, James Bergstra <james.bergs...@gmail.com> 
>> wrote:
>>>
>>> This is definitely a good idea. I think random sampling is still
>>> useful, though. It is not hard to get into settings where the grid is
>>> in theory very large and the user has a budget that is a tiny fraction
>>> of the full grid.
>>
>> I'd like to implement this, but I'm stuck on a nice way of specifying
>> distributions over each axis (e.g., sometimes you want to sample
>> across orders of magnitude (say, 0.001, 0.01, 0.1, 1, etc.), and
>> sometimes you want to sample uniformly (0.1, 0.2, 0.3, 0.4, ...)) that
>> is obvious and readable and flexible.
>
> This is essentially why the algorithms in my "hyperopt" project [1]
> are implemented as they are. They work for a variety of distributions
> (uniform, log-uniform, normal, log-normal, randint),
> including what I call "conditional" ones. For example, suppose you're
> trying to optimize all the elements of a learning pipeline, and even
> the choice of elements.  You only want to pick the PCA pre-processing
> parameters *if* you're actually doing PCA, because otherwise your
> parameter optimization algorithm might attribute the score (result /
> performance) to the PCA parameter choices that you know very well were
> irrelevant.
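
Concretely, a conditional space like that can be written in hyperopt
roughly as follows (the branch and parameter names here are just for
illustration):

    from hyperopt import hp

    # Each branch of hp.choice carries its own parameters; the PCA
    # parameters are only sampled -- and only get credit or blame --
    # when the 'pca' branch is actually chosen.
    space = hp.choice('preprocessing', [
        {'type': 'none'},
        {'type': 'pca',
         'n_components': hp.quniform('n_components', 2, 30, 1),
         'whiten': hp.choice('whiten', [False, True])},
    ])
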
>
> The hyperopt implementations are relatively tricky, and at this point I
> don't think they could be done in a simple enough way to make them
> scikit-learn compatible.  I think scikit-learn users would be better
> served by hand-written hyper-parameter optimizers for certain specific,
> particularly useful pipelines.  Other customized pipelines can use grid
> search, random search, or manual search, or the docs could refer users
> to hyperopt as it matures.
>
> - James
>
> [1]  https://github.com/jaberg/hyperopt
