Hi! It looks like you are using the cross-validators from the old `sklearn.cross_validation` module, which has been deprecated since v0.18. Use the equivalents from `sklearn.model_selection` instead; for a leave-one-subject-out (LOSO) split that is `LeaveOneGroupOut` (the new name for `LeaveOneLabelOut`). That should fix your issue, I think (though I have not looked into your code in detail).
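
In case it is useful, here is a minimal sketch of how a nested LOSO could look with the new API. The X, y and groups arrays below are made-up placeholders (not your data); I have kept your pipeline and parameter grid:

    import numpy as np
    from sklearn.model_selection import GridSearchCV, LeaveOneGroupOut
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Toy stand-ins for your data: 26 subjects, 13 per class, a handful
    # of features. "groups" holds one subject id per row, which is what
    # LeaveOneGroupOut splits on.
    rng = np.random.RandomState(0)
    X = rng.randn(26, 50)
    y = np.repeat([0, 1], 13)
    groups = np.arange(26)

    pipe = Pipeline([('scl', StandardScaler()),
                     ('clf', SVC(kernel='rbf'))])
    param_grid = {'clf__C': np.logspace(-2, 10, 13),
                  'clf__gamma': np.logspace(-9, 3, 13)}

    outer_cv = LeaveOneGroupOut()          # leave one subject out
    scores = []
    for train, test in outer_cv.split(X, y, groups):
        # Inner LOSO over the remaining 25 subjects for the grid search,
        # so each hyperparameter setting is fitted on 24 subjects.
        grid = GridSearchCV(pipe, param_grid, scoring='accuracy',
                            cv=LeaveOneGroupOut())
        grid.fit(X[train], y[train], groups=groups[train])
        scores.append(grid.score(X[test], y[test]))

    print(np.mean(scores))

Your per-fold feature reduction (Boruta) would then go inside the outer loop, fitted on X[train] only, so it never sees the left-out subject.
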
HTH!

On Sun, Dec 4, 2016 at 9:12 PM, Ludovico Coletta <[email protected]> wrote:
> Dear scikit experts,
>
> I'm struggling with the implementation of a nested cross-validation.
>
> My data: I have 26 subjects (13 per class) x 6670 features. I used a
> feature reduction algorithm (you may have heard about Boruta) to
> reduce the dimensionality of my data. Problems start now: I defined
> LOSO as the outer partitioning scheme. Therefore, for each of the 26
> cv folds I used 24 subjects for feature reduction. This led to a
> different number of features in each cv fold. Now, for each cv fold I
> would like to use the same 24 subjects for hyperparameter
> optimization (SVM with rbf kernel).
>
> This is what I did:
>
>     cv = list(LeaveOneOut(len(y)))  # in y I stored the labels
>
>     inner_train = [None] * len(y)
>     inner_test = [None] * len(y)
>
>     ii = 0
>     while ii < len(y):
>         cv = list(LeaveOneOut(len(y)))
>         a = cv[ii][0]
>         a = a[:-1]
>         inner_train[ii] = a
>
>         b = cv[ii][0]
>         b = np.array(b[((len(cv[0][0])) - 1)])
>         inner_test[ii] = b
>
>         ii = ii + 1
>
>     custom_cv = zip(inner_train, inner_test)  # inner cv
>
>     pipe_logistic = Pipeline([('scl', StandardScaler()),
>                               ('clf', SVC(kernel="rbf"))])
>
>     parameters = [{'clf__C': np.logspace(-2, 10, 13),
>                    'clf__gamma': np.logspace(-9, 3, 13)}]
>
>     scores = [None] * (len(y))
>
>     ii = 0
>     while ii < len(scores):
>
>         a = data[ii][0]  # data for train
>         b = data[ii][1]  # data for test
>         c = np.concatenate((a, b))  # shape: number of subjects * number of features
>         d = cv[ii][0]  # labels for train
>         e = cv[ii][1]  # label for test
>         f = np.concatenate((d, e))
>
>         grid_search = GridSearchCV(estimator=pipe_logistic,
>                                    param_grid=parameters, verbose=1,
>                                    scoring='accuracy',
>                                    cv=zip(([custom_cv[ii][0]]), ([custom_cv[ii][1]])))
>
>         scores[ii] = cross_validation.cross_val_score(
>             grid_search, c, y[f], scoring='accuracy',
>             cv=zip(([cv[ii][0]]), ([cv[ii][1]])))
>
>         ii = ii + 1
>
> However, I got the following error message: index 25 is out of bounds
> for size 25
>
> Would it be so bad if I did not perform a nested LOSO but used the
> default settings for hyperparameter optimization?
>
> Any help would be really appreciated

--
Raghav RV
https://github.com/raghavrv
