On Mon, Dec 5, 2011 at 4:38 PM, Alexandre Passos <[email protected]> wrote:
> On Mon, Dec 5, 2011 at 16:26, James Bergstra <[email protected]> wrote:
>>
>> This is definitely a good idea. I think randomly sampling is still
>> useful though. It is not hard to get into settings where the grid is
>> in theory very large and the user has a budget that is a tiny fraction
>> of the full grid.
>
> I'd like to implement this, but I'm stuck on a nice way of specifying
> distributions over each axis (i.e., sometimes you want to sample
> across orders of magnitude (say, 0.001, 0.01, 0.1, 1, etc), sometimes
> you want to sample uniformly (0.1, 0.2, 0.3, 0.4 ...)) that is obvious
> and readable and flexible.

This is essentially why the algorithms in my "hyperopt" project [1]
are implemented as they are. They work for a variety of kinds of
distributions (uniform, log-uniform, normal, log-normal, randint),
including what I call "conditional" ones. For example, suppose you're
trying to optimize all the elements of a learning pipeline, and even
the choice of elements.  You only want to pick the PCA pre-processing
parameters *if* you're actually doing PCA, because otherwise your
parameter optimization algorithm might attribute the score (result /
performance) to the PCA parameter choices that you know very well were
irrelevant.
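
To make the per-axis and "conditional" ideas concrete, here is a minimal
sketch in plain Python. It is not hyperopt's API: the helpers `uniform`,
`log_uniform`, `choice` and `sample`, the parameter names, and the ranges
are all hypothetical and chosen only for illustration. The point it shows
is that the PCA parameter is drawn only on trials where PCA is selected.

```python
import math
import random

# Hypothetical helpers (not hyperopt's API). Each one returns a function
# that draws a single value from a random.Random instance.

def uniform(low, high):
    # Evenly spread values, e.g. 0.1, 0.2, 0.3, ...
    return lambda rng: rng.uniform(low, high)

def log_uniform(low, high):
    # Evenly spread across orders of magnitude, e.g. 0.001, 0.01, 0.1, ...
    return lambda rng: math.exp(rng.uniform(math.log(low), math.log(high)))

def choice(options):
    # options maps a label to a sub-space. Only the chosen sub-space is
    # sampled, so parameters of the other branches never appear in a trial.
    def draw(rng):
        label = rng.choice(sorted(options))
        return {"kind": label, **sample(options[label], rng)}
    return draw

def sample(space, rng):
    # Draw one value for every axis of the space.
    return {name: draw(rng) for name, draw in space.items()}

# Illustrative search space; names and ranges are assumptions.
space = {
    "C": log_uniform(1e-3, 1e3),
    "tol": uniform(0.1, 0.5),
    "preprocessing": choice({
        "none": {},
        "pca": {"n_components": lambda rng: rng.randint(2, 50)},
    }),
}

rng = random.Random(0)
trials = [sample(space, rng) for _ in range(20)]
```

Because a trial without PCA carries no `n_components` entry at all, an
optimizer reading these trials has nothing irrelevant to attribute the
score to.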

The hyperopt implementations are relatively tricky, and at this point I
don't think they could be redone in a straightforward way that would
make them scikit-learn compatible.  I think scikit-learn users would be
better served by hand-written hyper-parameter optimizers for specific,
particularly useful pipelines.  Other customized pipelines can use grid
search, random search, or manual search, or the docs could refer them
to hyperopt as it matures.
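
For the random-search option with a budget that is a tiny fraction of the
full grid, one possible sketch is below. The function name
`random_grid_points`, the grid contents, and the budget are assumptions
made for illustration; nothing here is an existing scikit-learn or
hyperopt function. It draws distinct grid points without ever
enumerating the whole grid.

```python
import random

def random_grid_points(grid, budget, seed=0):
    """Draw up to `budget` distinct points from a grid of value lists."""
    rng = random.Random(seed)
    names = sorted(grid)

    # Size of the full grid, used only to cap the budget.
    total = 1
    for name in names:
        total *= len(grid[name])
    budget = min(budget, total)

    # Rejection sampling on index tuples: cheap when budget << total,
    # which is the setting described in the thread.
    seen = set()
    while len(seen) < budget:
        seen.add(tuple(rng.randrange(len(grid[n])) for n in names))

    return [{n: grid[n][i] for n, i in zip(names, idx)}
            for idx in sorted(seen)]

# Illustrative grid with 10 * 10 * 10 * 10 = 10000 points.
grid = {
    "C": [10.0 ** k for k in range(-5, 5)],
    "gamma": [10.0 ** k for k in range(-8, 2)],
    "tol": [0.05 * k for k in range(1, 11)],
    "n_components": list(range(5, 55, 5)),
}
points = random_grid_points(grid, budget=25)
```

Each returned dict is one setting to evaluate, so the cost is fixed by
the budget rather than by the size of the grid.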

- James

[1]  https://github.com/jaberg/hyperopt

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
