2011/10/28 Andreas Mueller <[email protected]>:
> Hi everybody.
> This is about the grid_search and cross_validation modules.
> Often, in particular when the dataset is large or the algorithm slow,
> it is not feasible to do n-fold cross validation and people use
> a single training/validation split to find hyperparameters.
>
> As far as I can see, this is not supported in sklearn.
> Do you think it should be included as an option to
> do grid searches? It is not really "cross" validation
> but I think the cross_validation module would be
> the right place for that.
>
> What do you think?
>
> Cheers,
> Andy
Have a look at the bootstrap CV generator: it gives fine control over the
size of both the train and test (validation) sub-samples and over the number
of "folds" you want, so you can run your grid_search faster if you can
tolerate more variance in your estimates.

http://scikit-learn.org/dev/modules/cross_validation.html#bootstrapping-cross-validation

We could also extend the shuffle split to give similar flexibility:

http://scikit-learn.org/dev/modules/cross_validation.html#random-permutations-cross-validation-a-k-a-shuffle-split

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
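To make the "single training/validation split" idea concrete, here is a minimal, library-free sketch of a CV iterator that yields exactly one (train, test) pair of index lists with a chosen validation fraction. The function name `single_split` and its parameters are hypothetical and are not part of scikit-learn. Current scikit-learn releases document that the `cv` argument of a grid search may be an iterable of (train, test) index splits; whether the version discussed in this thread accepts it in the same way is an assumption.

```python
import random


def single_split(n_samples, test_fraction=0.25, seed=0):
    """Return a one-element list holding a (train_indices, test_indices) pair.

    Hypothetical helper for illustration only; not a scikit-learn function.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")

    # Shuffle the sample indices reproducibly.
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Size of the validation part, at least one sample.
    n_test = max(1, int(round(n_samples * test_fraction)))
    if n_test >= n_samples:
        raise ValueError("not enough samples left for training")

    test = sorted(indices[:n_test])
    train = sorted(indices[n_test:])
    return [(train, test)]


# One "fold": 80 training samples and 20 validation samples.
splits = single_split(100, test_fraction=0.2, seed=42)
train_idx, test_idx = splits[0]
```

A list like `splits` could then be handed to a grid search in place of an n-fold generator, so each parameter setting is fitted once instead of n times, at the price of a noisier score estimate.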
