On 12/05/2011 07:44 PM, Olivier Grisel wrote:
> 2011/12/5 Alexandre Passos <[email protected]>:
>> On Mon, Dec 5, 2011 at 13:31, James Bergstra <[email protected]> wrote:
>>> I should probably not have scared ppl off speaking of a 250-job
>>> budget. My intuition would be that with 2-8 hyper-parameters, and 1-3
>>> "significant" hyper-parameters, randomly sampling around 10-30 points
>>> should be pretty reliable.
>>
>> So perhaps the best implementation of this is to first generate a grid
>> (from the usual arguments to sklearn's GridSearch), randomly sort it,
>> and iterate over these points until the budget is exhausted?
>>
>> Presented like this I can easily see why this is better than (a) going
>> over the grid in order until the budget is exhausted or (b) using a
>> coarser grid to match the budget. This would also be very easy to
>> implement in sklearn.
>>
>> Do I make sense?
>
> Yes. +1 for a pull request: one could just add a "budget" integer
> argument (None by default) to the existing GridSearchCV class.

+1
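The "generate the grid, randomly sort it, stop at the budget" idea above could be sketched roughly like this. This is only an illustration of the sampling step, not the eventual sklearn API; the function name `sample_grid` and the example grid are made up for this sketch:

```python
import itertools
import random

def sample_grid(param_grid, budget, seed=0):
    """Enumerate the full parameter grid, shuffle it, and keep at most
    `budget` points (all of them when budget is None)."""
    keys = sorted(param_grid)
    # Cartesian product of all parameter values, as a list of dicts.
    points = [dict(zip(keys, values))
              for values in itertools.product(*(param_grid[k] for k in keys))]
    rng = random.Random(seed)
    rng.shuffle(points)
    return points if budget is None else points[:budget]

# A 4 x 3 = 12-point grid; with budget=5 we evaluate only 5 random points.
grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1]}
candidates = sample_grid(grid, budget=5)
print(len(candidates))  # 5
```

A fit loop would then just iterate over `candidates` and cross-validate each one, which is why a `budget` argument slots naturally into the existing GridSearchCV machinery.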
On a related note: what about coarse-to-fine grid searches? For categorical variables that doesn't make much sense, but I think it does for many of the numerical variables.

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
