I am a bit confused as to why your code doesn't crash on the call to the
scaler.
What are the shapes of train_gram_matrix and test_gram_matrix?
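
For reference, here is a minimal sketch of the shapes I would expect with kernel='precomputed'. It assumes toy random data and a plain linear kernel computed with np.dot as a stand-in for your get_gram_matrix (which I haven't seen); the names X_train, X_test, K_train and K_test are illustrative only. The point is that the test matrix holds kernel values between test and training samples, so it should be (n_test, n_train) rather than (n_test, n_test).

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (assumption): 6 training samples and 2 test samples, 3 features each.
rng = np.random.RandomState(0)
X_train = rng.rand(6, 3)
X_test = rng.rand(2, 3)
y_train = np.array([0, 1, 0, 1, 0, 1])

# Linear kernel used purely as a stand-in for a custom Gram matrix function.
K_train = np.dot(X_train, X_train.T)  # train vs. train: (n_train, n_train)
K_test = np.dot(X_test, X_train.T)    # test vs. train:  (n_test, n_train)

clf = SVC(kernel="precomputed")
clf.fit(K_train, y_train)
pred = clf.predict(K_test)

print("K_train shape: %s" % (K_train.shape,))
print("K_test shape: %s" % (K_test.shape,))
print("predictions shape: %s" % (pred.shape,))
```

If your test_gram_matrix is computed only among the test samples, it would come out as (n_test, n_test), which would match the "X.shape[1] = 2 should be equal to 6" message.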
On 01/06/2015 12:27 PM, Morgan Hoffman wrote:
Hi,
I am trying to do a k-fold cross validation with a precomputed kernel.
However, I end up with an error message that looks like this:
Traceback (most recent call last):
  File "kfold_simple_data.py", line 64, in <module>
    score = clf.score(test_gram_matrix, test_labels)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/base.py", line 291, in score
    return accuracy_score(y, self.predict(X), sample_weight=sample_weight)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 467, in predict
    y = super(BaseSVC, self).predict(X)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 283, in predict
    X = self._validate_for_predict(X)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 401, in _validate_for_predict
    (X.shape[1], self.shape_fit_[0]))
ValueError: X.shape[1] = 2 should be equal to 6, the number of samples at training time
This is what my code looks like:
def cross_validate(data, folds, is_scaled):
    X = data["values"]
    Y = data["labels"]
    kf = KFold(len(Y), folds, indices=False)
    scores = []
    for train, test in kf:
        scaler = preprocessing.MinMaxScaler()
        X_train, X_test, y_train, y_test = X[train], X[test], Y[train], Y[test]
        training_data = OrderedDict()
        for i in range(len(X_train)):
            training_data[X_train[i]] = y_train[i]
        train_gram_matrix = get_gram_matrix(training_data)
        train_gram_matrix = scaler.fit_transform(train_gram_matrix)
        train_labels = get_label_array(training_data)
        test_data = OrderedDict()
        for i in range(len(X_test)):
            test_data[X_test[i]] = y_test[i]
        test_gram_matrix = get_gram_matrix(test_data)
        test_gram_matrix = scaler.transform(test_gram_matrix)
        test_labels = get_label_array(test_data)
        clf = svm.SVC(kernel='precomputed')
        clf.fit(train_gram_matrix, train_labels)
        print "Score:"
        score = clf.score(test_gram_matrix, test_labels)
        scores.append(score)
        print score
Does anyone have an idea of what I may be doing wrong? Any help is
appreciated.
Thanks!
------------------------------------------------------------------------------
Dive into the World of Parallel Programming! The Go Parallel Website,
sponsored by Intel and developed in partnership with Slashdot Media, is your
hub for all things parallel software development, from weekly thought
leadership blogs to news, videos, case studies, tutorials and more. Take a
look and join the conversation now. http://goparallel.sourceforge.net
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general