Dear scikit experts,

I'm struggling to implement a nested cross-validation.

My data: I have 26 subjects (13 per class) x 6670 features. I used a feature
reduction algorithm (you may have heard of Boruta) to reduce the
dimensionality of my data. This is where the problems start: I defined
leave-one-subject-out (LOSO) as the outer partitioning scheme. Therefore, for
each of the 26 CV folds I used 24 subjects for feature reduction. This led to
a different number of features in each CV fold. Now, for each CV fold I would
like to use the same 24 subjects for hyperparameter optimization (SVM with
RBF kernel).
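
For reference, the nested scheme described above could be sketched roughly as follows with the newer sklearn.model_selection API. This is only a sketch under several assumptions: the data are synthetic, the grid is reduced to keep it fast, and SelectKBest is a stand-in for Boruta (which is not part of scikit-learn). Putting the selection step inside the pipeline is one possible design, not necessarily the one used in this post; it lets each fold re-run the selection on its own training subjects, so a differing number of features per fold is handled automatically.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, LeaveOneOut, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real data: 26 subjects, 13 per class.
rng = np.random.RandomState(0)
X = rng.randn(26, 100)
y = np.repeat([0, 1], 13)

pipe = Pipeline([
    ("scl", StandardScaler()),
    ("sel", SelectKBest(f_classif, k=10)),  # assumption: stand-in for Boruta
    ("clf", SVC(kernel="rbf")),
])

# Assumption: a much smaller grid than in the post, for speed only.
param_grid = {
    "clf__C": np.logspace(-2, 2, 2),
    "clf__gamma": np.logspace(-4, 0, 2),
}

# Inner loop: leave-one-out over the training subjects of each outer fold.
inner = GridSearchCV(pipe, param_grid, scoring="accuracy", cv=LeaveOneOut())

# Outer loop: leave-one-subject-out; the inner splits are generated from
# whatever training rows the outer loop passes down.
outer_scores = cross_val_score(inner, X, y, scoring="accuracy", cv=LeaveOneOut())
print(outer_scores.mean())
```

Because both loops receive a splitter object rather than precomputed index arrays, no manual index bookkeeping is needed.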

This is what I did:

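# imports assumed for the older sklearn.cross_validation API used here
import numpy as np
from sklearn.cross_validation import LeaveOneOut

cv = list(LeaveOneOut(len(y)))  # outer LOSO splits; in y I stored the labels

inner_train = [None] * len(y)
inner_test = [None] * len(y)

ii = 0
while ii < len(y):
    # inner training set: the outer training indices without the last one
    a = cv[ii][0]
    a = a[:-1]
    inner_train[ii] = a

    # inner test set: the last of the outer training indices
    b = cv[ii][0]
    b = np.array(b[len(cv[0][0]) - 1])
    inner_test[ii] = b

    ii = ii + 1

custom_cv = zip(inner_train, inner_test)  # inner cv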


# imports assumed for the older scikit-learn module layout used here
from sklearn import cross_validation
from sklearn.grid_search import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

pipe_logistic = Pipeline([('scl', StandardScaler()),
                          ('clf', SVC(kernel="rbf"))])

parameters = [{'clf__C': np.logspace(-2, 10, 13),
               'clf__gamma': np.logspace(-9, 3, 13)}]

scores = [None] * len(y)

ii = 0
while ii < len(scores):
    a = data[ii][0]  # data for train
    b = data[ii][1]  # data for test
    c = np.concatenate((a, b))  # shape: number of subjects * number of features
    d = cv[ii][0]  # label indices for train
    e = cv[ii][1]  # label index for test
    f = np.concatenate((d, e))

    grid_search = GridSearchCV(estimator=pipe_logistic, param_grid=parameters,
                               verbose=1, scoring='accuracy',
                               cv=zip([custom_cv[ii][0]], [custom_cv[ii][1]]))

    scores[ii] = cross_validation.cross_val_score(
        grid_search, c, y[f], scoring='accuracy',
        cv=zip([cv[ii][0]], [cv[ii][1]]))

    ii = ii + 1



However, I got the following error message:

index 25 is out of bounds for size 25
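
One likely reading of this message, offered as an assumption rather than a confirmed diagnosis: the inner splits are expressed in the original 0..25 subject numbering, but the grid search only receives the 25 training rows of the outer fold, whose valid positions are 0..24. The toy sketch below (placeholder data; all variable names are made up for illustration) shows the difference between the two numberings.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut

n_subjects = 26
placeholder = np.zeros((n_subjects, 1))  # only the number of rows matters

# Outer leave-one-out splits, in the original subject numbering 0..25.
outer_splits = list(LeaveOneOut().split(placeholder))
train_idx, test_idx = outer_splits[0]  # first fold: subject 0 is held out

# What the inner loop actually sees: 25 rows, valid positions 0..24.
X_train = placeholder[train_idx]

# Inner test index taken from the original numbering (subject 25).
inner_test_original = train_idx[-1]

# Inner split expressed as positions within the 25 training rows.
inner_positions = np.arange(len(train_idx))
inner_train_pos = inner_positions[:-1]
inner_test_pos = inner_positions[-1:]

try:
    X_train[inner_test_original]
    raised = False
except IndexError as err:
    raised = True
    print("original numbering fails:", err)

# Position-based indices stay within bounds.
print(X_train[inner_train_pos].shape, X_train[inner_test_pos].shape)
```

If this reading is right, the inner indices would need to be positions within each outer training set rather than original subject numbers.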

Would it be so bad if, instead of performing a nested LOSO, I used the
default settings for hyperparameter optimization?

Any help would be really appreciated.

_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
