Some fresh news from the hyper-parameter tuning front lines:
http://jmlr.csail.mit.edu/papers/volume13/bergstra12a/bergstra12a.pdf
Some interesting snippets from the conclusion (I have not yet read the rest of the paper):

"""
We have shown that random experiments are more efficient than grid experiments for hyper-parameter optimization in the case of several learning algorithms on several data sets. Our analysis of the hyper-parameter response surface (Ψ) suggests that random experiments are more efficient because not all hyper-parameters are equally important to tune. Grid search experiments allocate too many trials to the exploration of dimensions that do not matter and suffer from poor coverage in dimensions that are important.
"""

"""
Random experiments are also easier to carry out than grid experiments for practical reasons related to the statistical independence of every trial.
• The experiment can be stopped any time and the trials form a complete experiment.
• If extra computers become available, new trials can be added to an experiment without having to adjust the grid and commit to a much larger experiment.
• Every trial can be carried out asynchronously.
• If the computer carrying out a trial fails for any reason, its trial can be either abandoned or restarted without jeopardizing the experiment.
"""

I wonder how this would transpose to scikit-learn models, which often have far fewer hyper-parameters than the average Deep Belief Network. Still, it's very interesting food for thought if someone wants to dive into improving the model selection tooling in the scikit.

Maybe a new GSoC topic? Would anybody be interested as a mentor or candidate?

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
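A minimal sketch of the coverage argument quoted above, not taken from the paper and not using any scikit-learn API. It assumes a made-up two-parameter response surface (`score`, with hypothetical parameters `lr` and `reg`) in which only `lr` matters much, and compares how many distinct `lr` values a 4x4 grid and 16 random draws actually try with the same budget:

```python
import itertools
import random


def score(lr, reg):
    """Toy response surface (an assumption for illustration):
    `lr` dominates, `reg` barely matters."""
    return -(lr - 0.5) ** 2 - 0.001 * (reg - 0.5) ** 2


def grid_trials(n_per_axis):
    """Evenly spaced grid on the unit square: n_per_axis ** 2 trials."""
    axis = [(i + 0.5) / n_per_axis for i in range(n_per_axis)]
    return list(itertools.product(axis, axis))


def random_trials(n_trials, seed=0):
    """Independent uniform draws on the unit square."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n_trials)]


def best(trials):
    """Return the (lr, reg) pair with the highest toy score."""
    return max(trials, key=lambda t: score(*t))


grid = grid_trials(4)     # 16 trials, but only 4 distinct values of lr
rand = random_trials(16)  # 16 trials, each with its own value of lr

print("distinct lr values, grid  :", len({lr for lr, _ in grid}))
print("distinct lr values, random:", len({lr for lr, _ in rand}))
print("best grid trial  :", best(grid), score(*best(grid)))
print("best random trial:", best(rand), score(*best(rand)))
```

Which strategy finds the better trial depends on the seed and on where the optimum sits relative to the grid, so this only illustrates the coverage difference, not the paper's empirical results. Because the random trials are independent, any prefix such as `rand[:8]` is itself a valid smaller experiment, and more draws can be appended later without redesigning anything.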
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
