2011/10/28 Alexandre Gramfort <[email protected]>:
> class SingleSplit(object):
>
>     def __init__(self, train_index, test_index):
>         self.train_index = train_index
>         self.test_index = test_index
>
>     def __iter__(self):
>         yield self.train_index, self.test_index
>
> is this more complicated than this?
It's just that in Andreas' case the datasets are already split. To
use GridSearchCV you have to concatenate them to be able to call the
fit API:
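import numpy as np
from scipy.sparse import vstack
from sklearn.datasets import load_svmlight_file

X_train, y_train = load_svmlight_file("/path/to/training/set.dat")
# n_features keeps both sparse matrices the same width
X_validation, y_validation = load_svmlight_file(
    "/path/to/validation/set.dat", n_features=X_train.shape[1])
params_grid = { ... }
cv = SingleSplit(np.arange(X_train.shape[0]),
                 np.arange(X_validation.shape[0]) + X_train.shape[0])
# load_svmlight_file returns sparse X, so np.vstack would not work here
X_total = vstack((X_train, X_validation)).tocsr()
y_total = np.concatenate((y_train, y_validation))
# estimator stands for whatever model is being tuned
GridSearchCV(estimator, params_grid, cv=cv).fit(X_total, y_total)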
This is a lot of complex boilerplate for the newcomer.
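
To make the point concrete, here is a rough sketch of the kind of helper
that could hide that boilerplate. The name stack_with_single_split is made
up for illustration and is not an existing scikit-learn function; the tiny
dense arrays just stand in for the two svmlight files.

```python
import numpy as np


class SingleSplit(object):
    """Cross-validation iterable that yields exactly one (train, test) split."""

    def __init__(self, train_index, test_index):
        self.train_index = train_index
        self.test_index = test_index

    def __iter__(self):
        yield self.train_index, self.test_index


def stack_with_single_split(X_train, y_train, X_validation, y_validation):
    """Hypothetical helper: concatenate pre-split data and build the cv."""
    n_train = X_train.shape[0]
    n_validation = X_validation.shape[0]
    if hasattr(X_train, "tocsr"):
        # scipy.sparse input, e.g. what load_svmlight_file returns
        import scipy.sparse as sp
        X_total = sp.vstack((X_train, X_validation)).tocsr()
    else:
        X_total = np.vstack((X_train, X_validation))
    # Targets are 1-D, so they are concatenated rather than stacked as rows
    y_total = np.concatenate((y_train, y_validation))
    # Training rows come first, validation rows are appended after them
    cv = SingleSplit(np.arange(n_train),
                     np.arange(n_train, n_train + n_validation))
    return X_total, y_total, cv


# Small dense arrays standing in for the training and validation files
X_train = np.arange(8.0).reshape(4, 2)
y_train = np.array([0, 1, 0, 1])
X_validation = np.arange(4.0).reshape(2, 2)
y_validation = np.array([1, 0])

X_total, y_total, cv = stack_with_single_split(
    X_train, y_train, X_validation, y_validation)
train_index, test_index = next(iter(cv))
```

The returned cv, X_total and y_total would then be passed to GridSearchCV
together with the estimator being tuned, assuming GridSearchCV accepts any
iterable of (train, test) index pairs for its cv argument.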
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general