Re: [Scikit-learn-general] Hyperparameter optimization

James Bergstra Mon, 05 Dec 2011 10:31:42 -0800

I should probably not have scared people off by mentioning a 250-job
budget.  My intuition is that with 2-8 hyper-parameters, of which 1-3
are "significant", randomly sampling around 10-30 points should be
pretty reliable.
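
A back-of-the-envelope way to look at the 10-30 point figure is to ask how
likely it is that at least one uniform random draw lands in the "good" part
of the search space. The sketch below is only an illustration: the function
name prob_hit and the choice of a 10% "good" region are assumptions, not
numbers from this thread. If only a few hyper-parameters matter, the good
region may well cover a larger share of the space than it would otherwise,
but that too is an assumption.

```python
def prob_hit(n_samples, top_fraction):
    """Probability that at least one of n independent uniform draws
    lands in the best `top_fraction` of the search space."""
    return 1.0 - (1.0 - top_fraction) ** n_samples


# Assumed: the "good" region is the best 10% of the space by volume.
for n in (10, 20, 30):
    print(n, round(prob_hit(n, 0.10), 3))
# 10 0.651
# 20 0.878
# 30 0.958
```

Under that assumption, 30 random points hit the good region about 96% of
the time, while 10 points do so only about 65% of the time.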


- James

On Mon, Dec 5, 2011 at 1:28 PM, James Bergstra <[email protected]> wrote:
> On Sat, Dec 3, 2011 at 6:32 AM, Olivier Grisel <[email protected]> wrote:
>>> With regards to the random sampling, I am a bit worried that the results
>>> hold for a fair amount of points, and with a small amount of points
>>> (which is typically the situation in which many of us hide) it becomes
>>> very sensitive to the seed.
>>
>> I guess you should monitor the improvement before deciding to stop the 
>> search.
>
> My experience has been
>
> 1. that you start from an idea of a grid you'd like to try (ranges for
> hyper-parameters, intervals for each hyper-parameter that might make a
> difference),
>
> 2. you realize there's a huge number of points in the ideal grid, and
> you have a budget for only about 250 jobs,
>
> 3a. you pick a good grid that still gets "the most important part", vs.
>
> 3b. you sample randomly in the original (huge) space.
>
> If you sample randomly in a space that is close to the grid you were
> going to try, but includes some of the finer resolution that you had
> to throw out to get down to 250 grid points, you should do better with
> 250 random points (3b) than your grid (3a).
>
> You're right that with just a few (i.e. < 10) random samples, mileage
> will vary greatly... but that's not really the regime in which you can
> do a grid search anyway.
>
> I can hopefully offer more convincing evidence soon... I have a
> journal paper on this that has been accepted, but I still need to
> polish it up for publication.
>
> - James
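
The trade-off between 3a and 3b in the quoted message can be sketched with a
toy comparison at an equal budget. Everything specific here is an assumption
made for illustration: four hyper-parameters on the unit interval, a
4-values-per-axis grid (256 points, close to the 250-job budget), and a toy
objective in which only the first hyper-parameter matters. It is not the
experiment from the paper mentioned above.

```python
import itertools
import random

N_DIMS = 4     # hypothetical number of hyper-parameters
PER_AXIS = 4   # grid values per hyper-parameter
BUDGET = PER_AXIS ** N_DIMS  # 256 trials, close to the 250-job budget

# (3a) Grid: every combination of PER_AXIS evenly spaced values in [0, 1].
axis = [i / (PER_AXIS - 1) for i in range(PER_AXIS)]
grid = list(itertools.product(axis, repeat=N_DIMS))

# (3b) Random: the same number of points, drawn uniformly from [0, 1]^N_DIMS.
rng = random.Random(0)
rand = [tuple(rng.random() for _ in range(N_DIMS)) for _ in range(BUDGET)]


def distinct_per_dim(points):
    """Count the distinct values tried for each hyper-parameter."""
    return [len({p[d] for p in points}) for d in range(N_DIMS)]


def score(point):
    """Toy objective (an assumption): only the first hyper-parameter
    matters, with its optimum at 0.43."""
    return -(point[0] - 0.43) ** 2


best_grid = max(score(p) for p in grid)
best_rand = max(score(p) for p in rand)

print("distinct values per dimension, grid:  ", distinct_per_dim(grid))
print("distinct values per dimension, random:", distinct_per_dim(rand))
print("best score, grid:  ", best_grid)
print("best score, random:", best_rand)
```

The grid tries only 4 distinct values of the one hyper-parameter that
matters, whereas the random sample tries a different value on almost every
trial, so in this toy setting it gets much closer to the optimum for the
same number of jobs.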
