On Fri, Oct 28, 2011 at 11:27 PM, Olivier Grisel
<[email protected]> wrote:

> This is a lot of complex boilerplate for the newcomer.

Plus, that would be a waste of memory and CPU time, as the grid search
would re-split the data right afterwards.

Lately I've been working on large-scale algorithms where it would be
very useful if I had a validation set directly in fit:

fit(X, y, X_val=None, y_val=None)

or

fit(X, y, percent_val=0)

For example, SGDClassifier could use it for early stopping (keep the
weight vector that scores best on the validation set instead of the
last one) or for efficient tuning of the regularization hyperparameter.
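
A minimal sketch of the early-stopping idea, in plain Python so it runs
without any dependency. It is not SGDClassifier and not an existing
scikit-learn API: sgd_fit, predict, accuracy and the parameters n_epochs,
lr, alpha and seed are hypothetical names chosen for illustration, and
only X_val / y_val mirror the signature proposed above. Labels are
assumed to be -1 or +1.

```python
import math
import random


def predict(w, b, x):
    """Return +1 or -1 for a single sample x under the linear model (w, b)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def accuracy(w, b, X, y):
    """Fraction of samples in X whose predicted label matches y."""
    return sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(X)


def sgd_fit(X, y, X_val=None, y_val=None, n_epochs=20, lr=0.1,
            alpha=1e-4, seed=0):
    """Hypothetical logistic-loss SGD with an optional validation set.

    Without a validation set, the last weight vector is returned.
    With one, the weights that scored best on it are returned.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    best, best_acc = (list(w), b), -1.0
    order = list(range(len(X)))

    for _ in range(n_epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # Derivative of log(1 + exp(-margin)) w.r.t. the raw score;
            # the margin is clipped to keep exp() from overflowing.
            g = -y[i] / (1.0 + math.exp(min(margin, 50.0)))
            # L2-regularized gradient step on the weights and the bias.
            w = [wj - lr * (g * xj + alpha * wj) for wj, xj in zip(w, X[i])]
            b -= lr * g
        if X_val is not None:
            # Score once per epoch and remember the best snapshot so far.
            acc = accuracy(w, b, X_val, y_val)
            if acc > best_acc:
                best_acc, best = acc, (list(w), b)

    return (w, b) if X_val is None else best
```

Calling sgd_fit(X, y) gives the last weights, while
sgd_fit(X, y, X_val, y_val) gives the snapshot that did best on the
held-out data. The validation set is only read for scoring, so the
training trajectory is the same in both cases.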

Mathieu

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general