On 12/05/2011 07:44 PM, Olivier Grisel wrote:
> 2011/12/5 Alexandre Passos <[email protected]>:
>> On Mon, Dec 5, 2011 at 13:31, James Bergstra <[email protected]> wrote:
>>> I should probably not have scared people off by speaking of a 250-job
>>> budget. My intuition is that with 2-8 hyper-parameters, of which 1-3
>>> are "significant", randomly sampling around 10-30 points should be
>>> pretty reliable.
>> So perhaps the best implementation of this is to first generate the grid
>> (from the usual arguments to sklearn's GridSearchCV), shuffle it, and
>> iterate over the points until the budget is exhausted?
>>
>> Presented like this, I can easily see why this is better than (a) going
>> over the grid in order until the budget is exhausted or (b) using a
>> coarser grid to match the budget: shuffling covers the full range of
>> each significant hyper-parameter even when the budget is small. This
>> would also be very easy to implement in sklearn.
>>
>> Do I make sense?
> Yes. +1 for a pull request: one could just add a "budget" integer
> argument (None by default) to the existing GridSearchCV class.
>
+1
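
To make the budget idea concrete, here is a rough, untested sketch of the
shuffled-grid iteration. The `budget` argument and the `budgeted_grid_iter`
name are made up for illustration, not an existing sklearn API:

import itertools
import random

def budgeted_grid_iter(param_grid, budget=None, seed=0):
    """Yield at most `budget` parameter settings, in random order."""
    keys = sorted(param_grid)
    # Enumerate the full cartesian grid as a list of dicts.
    grid = [dict(zip(keys, values))
            for values in itertools.product(*(param_grid[k] for k in keys))]
    # Shuffle once, then truncate to the budget.
    random.Random(seed).shuffle(grid)
    if budget is not None:
        grid = grid[:budget]
    for params in grid:
        yield params

# Example: a 20-point grid sampled with a budget of 10.
param_grid = {"C": [0.01, 0.1, 1, 10, 100],
              "gamma": [1e-4, 1e-3, 1e-2, 1e-1]}
for params in budgeted_grid_iter(param_grid, budget=10):
    print(params)  # fit and score an estimator with **params here
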

On a related note: what about coarse-to-fine grid searches?
For categorical variables that doesn't make much sense, but
I think it does for many of the numerical variables.
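
For the numerical case, one coarse-to-fine step could look roughly like the
sketch below. `score` is a toy stand-in for whatever cross-validation score
the search would compute, and `refine_around` is a made-up helper:

import numpy as np

def score(C):
    # Toy stand-in for a CV score, peaked near C = 2 purely for
    # illustration.
    return -abs(np.log10(C) - np.log10(2.0))

def refine_around(best, grid, n_points=5):
    """Return a finer log-spaced grid around the best coarse value."""
    i = int(np.argmin(np.abs(grid - best)))
    lo = grid[max(i - 1, 0)]
    hi = grid[min(i + 1, len(grid) - 1)]
    return np.logspace(np.log10(lo), np.log10(hi), n_points)

coarse = np.logspace(-3, 3, 7)        # coarse pass over 1e-3 .. 1e3
best = coarse[int(np.argmax([score(C) for C in coarse]))]
fine = refine_around(best, coarse)    # finer pass around the best point
best = fine[int(np.argmax([score(C) for C in fine]))]
print(best)
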
