On Mon, Dec 5, 2011 at 4:38 PM, Alexandre Passos <[email protected]> wrote:
> On Mon, Dec 5, 2011 at 16:26, James Bergstra <[email protected]> wrote:
>>
>> This is definitely a good idea. I think randomly sampling is still
>> useful though. It is not hard to get into settings where the grid is
>> in theory very large and the user has a budget that is a tiny fraction
>> of the full grid.
>
> I'd like to implement this, but I'm stuck on a nice way of specifying
> distributions over each axis (i.e., sometimes you want to sample
> across orders of magnitude (say, 0.001, 0.01, 0.1, 1, etc), sometimes
> you want to sample uniformly (0.1, 0.2, 0.3, 0.4 ...)) that is obvious
> and readable and flexible.

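To make the two cases in the question concrete, here is a minimal sketch (plain Python, standard library only) of one possible way to declare a distribution per axis and then draw a small budget of random points instead of walking the full grid. The spec format, the axis names, and the helpers `sample_axis` and `sample_points` are all hypothetical, made up for this example; this is not an existing scikit-learn or hyperopt interface.

```python
import math
import random

# Hypothetical spec format: axis name -> (kind, low, high).
# The axis names and ranges below are made up for illustration.
SPACE = {
    "C": ("loguniform", 1e-3, 1e3),     # spread across orders of magnitude
    "gamma": ("loguniform", 1e-4, 1e0),
    "dropout": ("uniform", 0.1, 0.5),   # spread evenly over the interval
}


def sample_axis(kind, low, high, rng):
    """Draw one value for a single axis."""
    if kind == "uniform":
        return rng.uniform(low, high)
    if kind == "loguniform":
        # Sample uniformly in log space, then map back.
        return math.exp(rng.uniform(math.log(low), math.log(high)))
    raise ValueError("unknown distribution kind: %r" % (kind,))


def sample_points(space, budget, seed=0):
    """Draw `budget` independent points, one value per axis each."""
    rng = random.Random(seed)
    return [
        {name: sample_axis(kind, low, high, rng)
         for name, (kind, low, high) in space.items()}
        for _ in range(budget)
    ]


# 20 evaluations, however large the equivalent grid would have been.
points = sample_points(SPACE, budget=20)
```

The cost is fixed by `budget` rather than by the product of the per-axis grid sizes, which is the point of random sampling when the budget is a tiny fraction of the grid.
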
The difficulty of specifying such distributions is essentially why the
algorithms in my "hyperopt" project [1] are implemented as they are. They work
for several kinds of distributions (uniform, log-uniform, normal, log-normal,
randint), including what I call "conditional" ones. For example, suppose
you're trying to optimize all the elements of a learning pipeline, and even
the choice of elements. You only want to pick the PCA pre-processing
parameters *if* you're actually doing PCA, because otherwise your parameter
optimization algorithm might attribute the score (result / performance) to
PCA parameter choices that you know very well were irrelevant.

The hyperopt implementations are relatively tricky, and at this point I don't
think they could be done in a straightforward way that would make them
scikit-learn compatible. I think scikit-learn users would be better served by
hand-written hyper-parameter optimizers for specific, particularly useful
pipelines. Other customized pipelines can use grid search, random search, or
manual search, or the docs could maybe refer them to hyperopt as it matures.

- James

[1] https://github.com/jaberg/hyperopt

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
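
As an illustration of the "conditional" case described above, here is a minimal sketch in plain Python of sampling a pipeline configuration in which the PCA parameters are drawn only when PCA is actually selected. Every name, range, and probability here is an assumption made up for the example, and this is not how hyperopt itself expresses such spaces.

```python
import random


def sample_pipeline(rng):
    """Draw one pipeline configuration (all names and ranges are made up)."""
    config = {"use_pca": rng.random() < 0.5}
    if config["use_pca"]:
        # PCA parameters exist only on the branch where PCA is used,
        # so no score is ever attributed to a value PCA never saw.
        config["pca_n_components"] = rng.randint(2, 50)
        config["pca_whiten"] = rng.choice([True, False])

    config["classifier"] = rng.choice(["svm", "logreg"])
    if config["classifier"] == "svm":
        config["svm_C"] = 10 ** rng.uniform(-3, 3)
    else:
        config["logreg_C"] = 10 ** rng.uniform(-3, 3)
    return config


rng = random.Random(0)
configs = [sample_pipeline(rng) for _ in range(10)]
```

Because an irrelevant parameter is simply absent from the configuration, an optimizer reading these dictionaries has nothing to credit or blame on the untaken branch.
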
