On Sat, Dec 3, 2011 at 6:32 AM, Olivier Grisel <[email protected]> wrote:
>> With regards to the random sampling, I am a bit worried that the results
>> hold for a fair amount of points, and with a small amount of points
>> (which is typically the situation in which many of us hide) it becomes
>> very sensitive to the seed.
>
> I guess you should monitor the improvement before deciding to stop the search.
My experience has been:

1. You start from an idea of a grid you'd like to try (ranges for the
   hyper-parameters, and intervals within each range that might make a
   difference).
2. You realize there is a huge number of points in the ideal grid, while
   your budget allows for something like 250 evaluations.
3a. You pick a coarser grid that still covers "the most important part", vs.
3b. You sample randomly in the original (huge) space.

If you sample randomly in a space that is close to the grid you were going
to try, but that includes some of the finer resolution you had to throw out
to get down to 250 grid points, you should do better with 250 random points
(3b) than with your grid (3a).

You're right that with just a few (i.e. < 10) random samples, mileage will
vary greatly... but that's not really a regime in which you can do a grid
search anyway.

I can hopefully offer more convincing evidence soon: I have a journal paper
on this that has been accepted, but I still need to polish it up for
publication.

- James

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
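[The comparison between 3a and 3b can be sketched with a toy example. Everything below is made up for illustration: the "validation error" function, the hyper-parameter ranges, and the grid sizes are assumptions, not any real model or the method from the paper.]

```python
import random

random.seed(0)

# Hypothetical validation-error surface: sharply sensitive to log10(C),
# only weakly sensitive to log10(gamma). Purely illustrative.
def val_error(log_C, log_gamma):
    return (log_C - 1.7) ** 2 + 0.1 * (log_gamma + 2.3) ** 2

BUDGET = 250

# 3a. A coarse grid that fits the budget: 16 x 15 = 240 points,
#     log10(C) in [-3, 3], log10(gamma) in [-5, 1].
grid = [(i * 6 / 15 - 3, j * 6 / 14 - 5)
        for i in range(16) for j in range(15)]
best_grid = min(val_error(c, g) for c, g in grid)

# 3b. The same budget spent on uniform random samples over the full
#     ranges, which keeps the fine resolution the grid had to give up.
samples = [(random.uniform(-3, 3), random.uniform(-5, 1))
           for _ in range(BUDGET)]
best_rand = min(val_error(c, g) for c, g in samples)

print(f"best on coarse grid:    {best_grid:.5f}")
print(f"best on random samples: {best_rand:.5f}")
```

With a fixed seed both numbers are deterministic; across seeds the random search typically edges out the coarse grid here because the grid's step along log10(C) (0.4) is larger than the random samples' effective resolution at this budget, which is the point of 3b.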
