In my case my interest / priority in this pattern is more in being
able to do one-liners in an interactive IPython session rather than
avoiding copies of large-scale data (although this is interesting too).
>>> X_train, y_train, X_test, y_test = load_svmlight_files(
...     ['train.dat', 'test.dat'])
>>> clf = GridSearchValidation(
...     SVC(), params={'C': [1, 10, 100], 'gamma': [0.01, 0.0001]}
... ).fit(X_train, y_train, X_val=X_test, y_val=y_test)
>>> clf.scores_
# see the scores
>>> clf.best_params_
# see the best params
This is the kind of pattern I run into over and over again. Currently I
concatenate the files:
$ cat train.dat test.dat > all.dat
And then I use GridSearchCV on the full dataset with the Bootstrap CV,
controlling the number of bootstraps (set to 1 for instance) to make it
run quickly. But then I am not strictly measuring the scores on the
provided test set.
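
For reference, something like the sketch below would express the same
workaround in memory rather than via the shell concatenation, and would
actually score on the provided test set. It is not the proposed
GridSearchValidation, just an approximation using the existing
GridSearchCV with a single explicit (train, test) split passed as cv;
the import paths are for a recent scikit-learn and may differ between
versions.

    # Sketch: emulate "validate on a held-out test file" with GridSearchCV
    # by stacking both files and forcing a single explicit split.
    import numpy as np
    from scipy.sparse import vstack
    from sklearn.datasets import load_svmlight_files
    from sklearn.model_selection import GridSearchCV  # sklearn.grid_search in old versions
    from sklearn.svm import SVC

    X_train, y_train, X_test, y_test = load_svmlight_files(
        ['train.dat', 'test.dat'])

    # Stack both parts so GridSearchCV sees a single dataset...
    X_all = vstack([X_train, X_test])
    y_all = np.concatenate([y_train, y_test])

    # ...but restrict it to one split: fit on the train rows,
    # score on the provided test rows.
    n_train = X_train.shape[0]
    single_split = [(np.arange(n_train), np.arange(n_train, X_all.shape[0]))]

    param_grid = {'C': [1, 10, 100], 'gamma': [0.01, 0.0001]}
    grid = GridSearchCV(SVC(), param_grid, cv=single_split).fit(X_all, y_all)
    print(grid.best_params_)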
--
Olivier