Re: [Scikit-learn-general] Cross validation with a pre-computed kernel

Andy Tue, 06 Jan 2015 09:46:23 -0800

The kernel matrix at test time needs to be the kernel between the test data and the training data, i.e. of shape (n_test_samples, n_train_samples).

Which I guess is not what get_gram_matrix does.
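To make the shapes concrete, here is a minimal sketch of an explicit k-fold loop that slices one full Gram matrix per fold. It assumes a recent scikit-learn (KFold from sklearn.model_selection rather than the older sklearn.cross_validation), and a plain linear kernel on toy data stands in for get_gram_matrix, whose implementation is not shown in this thread.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import SVC

# Toy data (an assumption for illustration): two well-separated clusters.
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(-3, 0.5, (4, 2)), rng.normal(3, 0.5, (4, 2))])
y = np.array([0] * 4 + [1] * 4)

# Full Gram matrix over all 8 samples; a linear kernel stands in for
# whatever get_gram_matrix computes.
K = X @ X.T

shapes = []
scores = []
for train, test in KFold(n_splits=4, shuffle=True, random_state=0).split(X):
    # Training kernel: train rows vs. train columns, shape (6, 6).
    K_train = K[np.ix_(train, train)]
    # Test kernel: test rows vs. TRAIN columns, shape (2, 6), not (2, 2).
    K_test = K[np.ix_(test, train)]

    clf = SVC(kernel="precomputed").fit(K_train, y[train])
    scores.append(clf.score(K_test, y[test]))
    shapes.append((K_train.shape, K_test.shape))

print(shapes)
print(scores)
```

A Gram matrix computed from the test samples alone would be (2, 2) here, which is the kind of shape mismatch the ValueError in the traceback reports.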

Why are you applying the MinMaxScaler to the gram matrix? I'm not sure that makes sense...

Without the scaler you could just do


print(cross_val_score(SVC(kernel='precomputed'), get_gram_matrix(X), Y))
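
A self-contained version of that call might look roughly like the sketch below. It assumes a recent scikit-learn, where cross_val_score lives in sklearn.model_selection and slices a square precomputed kernel on both axes for each fold. The toy data and the linear kernel are placeholders for the real data and get_gram_matrix.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Toy data (an assumption for illustration): two well-separated clusters.
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(-3, 0.5, (15, 2)), rng.normal(3, 0.5, (15, 2))])
y = np.array([0] * 15 + [1] * 15)

# Square Gram matrix over ALL samples; a linear kernel stands in for
# get_gram_matrix.
gram = X @ X.T

# Because the estimator uses kernel="precomputed", each fold is fitted on
# gram[train][:, train] and scored on gram[test][:, train].
scores = cross_val_score(SVC(kernel="precomputed"), gram, y, cv=3)
print(scores)
```

The point is that the Gram matrix is computed once over the whole dataset, and the per-fold train/test slicing is left to cross_val_score.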

with the MinMaxScaler you can do

pipe = make_pipeline(MinMaxScaler(), SVC(kernel='precomputed'))
print(cross_val_score(pipe, get_gram_matrix(X), Y))

which is a bit shorter than your code and removes the need to worry about slicing the gram matrix ;)




On 01/06/2015 12:27 PM, Morgan Hoffman wrote:

Hi,

I am trying to do a k-fold cross validation with a precomputed kernel. However, I end up with an error message that looks like this:


Traceback (most recent call last):
  File "kfold_simple_data.py", line 64, in <module>
    score = clf.score(test_gram_matrix, test_labels)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/base.py", line 291, in score
    return accuracy_score(y, self.predict(X), sample_weight=sample_weight)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 467, in predict
    y = super(BaseSVC, self).predict(X)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 283, in predict
    X = self._validate_for_predict(X)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 401, in _validate_for_predict
    (X.shape[1], self.shape_fit_[0]))
ValueError: X.shape[1] = 2 should be equal to 6, the number of samples at training time


This is what my code looks like:

def cross_validate(data, folds, is_scaled):
    X = data["values"]
    Y = data["labels"]

    kf = KFold(len(Y), folds, indices=False)

    scores = []

    for train, test in kf:
        scaler = preprocessing.MinMaxScaler()
        X_train, X_test, y_train, y_test = X[train], X[test], Y[train], Y[test]

        training_data = OrderedDict()
        for i in range(len(X_train)):
            training_data[X_train[i]] = y_train[i]

        train_gram_matrix = get_gram_matrix(training_data)
        train_gram_matrix = scaler.fit_transform(train_gram_matrix)
        train_labels = get_label_array(training_data)

        test_data = OrderedDict()
        for i in range(len(X_test)):
            test_data[X_test[i]] = y_test[i]

        test_gram_matrix = get_gram_matrix(test_data)
        test_gram_matrix = scaler.transform(test_gram_matrix)
        test_labels = get_label_array(test_data)

        clf = svm.SVC(kernel='precomputed')
        clf.fit(train_gram_matrix, train_labels)

        print "Score:"
        score = clf.score(test_gram_matrix, test_labels)
        scores.append(score)
        print score

Does anyone have an idea of what I may be doing wrong? Any help is appreciated.


Thanks!


------------------------------------------------------------------------------
Dive into the World of Parallel Programming! The Go Parallel Website,
sponsored by Intel and developed in partnership with Slashdot Media, is your
hub for all things parallel software development, from weekly thought
leadership blogs to news, videos, case studies, tutorials and more. Take a
look and join the conversation now. http://goparallel.sourceforge.net


_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
