On Sat, Dec 3, 2011 at 6:32 AM, Olivier Grisel <[email protected]> wrote:
>> With regards to the random sampling, I am a bit worried that the results
>> hold for a fair amount of points, and with a small amount of points
>> (which is typically the situation in which many of us hide) it becomes
>> very sensitive to the seed.
>
> I guess you should monitor the improvement before deciding to stop the search.

My experience has been

1. You start from an idea of the grid you'd like to try (ranges for the
hyper-parameters, and a resolution for each hyper-parameter that might
make a difference),

2. you realize the ideal grid contains a huge number of points, while
your budget is only for something like 250 evaluations,

3a. you pick a smaller grid that still covers "the most important part", vs.

3b. you sample randomly in the original (huge) space.

If you sample randomly from a space that is close to the grid you were
going to try, but includes some of the finer resolution you had to
throw out to get down to 250 grid points, you should do better with
250 random points (3b) than with your grid (3a).
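To make the comparison concrete, here is a minimal sketch (mine, not from
this thread) of 3a vs. 3b on a hypothetical 2-D objective: with a budget of
250 points, a grid can afford only 16 distinct values per axis, while random
sampling gives you a fresh value on every axis for every trial. The objective
function and the parameter ranges below are made up for illustration.

```python
import itertools
import random

random.seed(0)

def objective(log_lr, log_reg):
    # Hypothetical validation score: sharply peaked in log_lr (the
    # "important" hyper-parameter), nearly flat in log_reg. The peak at
    # -2.137 deliberately falls between grid points.
    return -(log_lr + 2.137) ** 2 - 0.01 * (log_reg + 1.0) ** 2

BUDGET = 250

# 3a. A grid within the budget: 16 values per axis (16 * 16 = 256),
# truncated to 250 evaluations.
axis = [-6.0 + 6.0 * i / 15 for i in range(16)]
grid = list(itertools.product(axis, axis))[:BUDGET]
best_grid = max(objective(a, b) for a, b in grid)

# 3b. The same budget of random points from the full continuous space:
# every trial uses a new value of log_lr, so the important axis is
# sampled at much finer resolution than the grid allows.
rand = [(random.uniform(-6.0, 0.0), random.uniform(-6.0, 0.0))
        for _ in range(BUDGET)]
best_rand = max(objective(a, b) for a, b in rand)

print("grid best:   %.4f" % best_grid)
print("random best: %.4f" % best_rand)
```

The point is not any single run (with few samples the seed matters, as noted
below), but that the grid is stuck with 16 distinct values along the axis
that matters, whereas the 250 random trials each probe a different one.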

You're right that with just a few (e.g. < 10) random samples, mileage
will vary greatly... but that's not really a regime in which you can
do a grid search anyway.

I can hopefully offer more convincing evidence soon... I have a
journal paper on this that has been accepted, but I still need to
polish it up for publication.

- James

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
