On Mon, Dec 05, 2011 at 01:41:53PM -0500, Alexandre Passos wrote:
> On Mon, Dec 5, 2011 at 13:31, James Bergstra <[email protected]> wrote:
> > I should probably not have scared ppl off speaking of a 250-job
> > budget.  My intuition would be that with 2-8 hyper-parameters, and 1-3
> > "significant" hyper-parameters, randomly sampling around 10-30 points
> > should be pretty reliable.

> So perhaps the best implementation of this is to first generate a grid
> (from the usual arguments to sklearn's GridSearch), randomly sort it,
> and iterate over these points until the budget is exhausted?

Does sound reasonable.
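Something along these lines, perhaps (a rough sketch only, written with
current scikit-learn names and a toy SVC grid, not the eventual API; the
real implementation would of course plug into the GridSearchCV machinery):

    import random
    from sklearn.datasets import load_iris
    from sklearn.model_selection import ParameterGrid, cross_val_score
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    # Toy search space, for illustration only.
    param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [1e-4, 1e-3, 1e-2, 1e-1]}

    # Expand the grid exactly as an exhaustive search would, shuffle it,
    # and stop once the evaluation budget is exhausted.
    candidates = list(ParameterGrid(param_grid))
    random.Random(0).shuffle(candidates)

    budget = 10
    scores = [(cross_val_score(SVC(**params), X, y, cv=3).mean(), params)
              for params in candidates[:budget]]
    best_score, best_params = max(scores, key=lambda s: s[0])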

When doing grid searches, I find that an important aspect is that some
grid points take only a fraction of the time of others. This is actually a
big motivation for doing things in parallel: with enough CPUs (say 8), the
total time of a grid search can end up limited only by the time to fit the
different folds of the single slowest grid point.

Thus the notion of a budget is relevant, but the right budget is not
exactly the number of grid points fitted.

That said, taking this into account will probably make the code much more
complex, so I suggest that we put it on hold.
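For concreteness, a time-based budget would mostly change the stopping
rule, e.g. (again just a sketch, reusing the shuffled `candidates` from the
snippet above):

    import time

    # Stop drawing new candidates once a wall-clock budget (in seconds)
    # is spent, rather than after a fixed number of fits.
    time_budget = 60.0
    start = time.time()
    scores = []
    for params in candidates:
        if time.time() - start > time_budget:
            break
        scores.append((cross_val_score(SVC(**params), X, y, cv=3).mean(),
                       params))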

G
