Hi,
I have a precomputed kernel of size NxN. I am using GridSearchCV to tune the C 
parameter of an SVM with kernel='precomputed' as follows:

    C_range = 10. ** np.arange(-2, 9)
    param_grid = dict(C=C_range)
    grid = GridSearchCV(SVC(kernel='precomputed'), param_grid=param_grid,
                        cv=StratifiedKFold(y=data_label, n_folds=10))
    grid.fit(kernel, data_label)
    print grid.best_score_

This works fine, but because I predict on the full data (with 
grid.predict(kernel)), it overfits: I get precision/recall = 1.0 most of the 
time.
So I would like to first split my data into 10 chunks (9 for training, 1 for 
testing) with cross-validation and, in each fold, run GridSearch to tune the 
C value on the training set and then test on the testing set.
In order to do this, I sliced the kernel matrix into 100x100 and 50x50 
submatrices, and I run grid.fit() on one of them and grid.predict() on the 
other.
But I get the following error:
ValueError: X.shape[1] = 50 should be equal to 100, the number of features at 
training time
I guess the training kernel should have the same dimensions as the testing 
kernel, but I don't understand why, because I simply compute np.dot(X, X.T) 
for 100x100 and for 50x50, so the final kernels have different dimensions.
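To make the shapes concrete, here is a minimal sketch of the nested setup described above. It rests on several assumptions: the data are synthetic stand-ins, the imports use the sklearn.model_selection layout of recent scikit-learn releases (not the older API in the snippet above), and the C grid is shortened to keep the run quick. According to the SVC documentation, with kernel='precomputed' predict expects the kernel between the test samples and the training samples, of shape (n_test, n_train). With 100 training and 50 test samples, that would be a 50x100 block rather than a separately computed 50x50 matrix.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# Synthetic stand-ins (assumptions): 150 samples, 5 features, two classes.
rng = np.random.RandomState(0)
X = rng.randn(150, 5)
data_label = (X[:, 0] + 0.5 * rng.randn(150) > 0).astype(int)
kernel = np.dot(X, X.T)  # full N x N linear kernel

# Shortened C grid (assumption) so the sketch runs quickly.
param_grid = dict(C=10.0 ** np.arange(-2, 3))
outer_cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

scores = []
for train, test in outer_cv.split(X, data_label):
    # Training kernel: train rows vs. train columns, shape (n_train, n_train).
    K_train = kernel[np.ix_(train, train)]
    # Test kernel: test rows vs. TRAIN columns, shape (n_test, n_train).
    K_test = kernel[np.ix_(test, train)]

    # Inner grid search tunes C using only the training part of this fold.
    grid = GridSearchCV(SVC(kernel='precomputed'), param_grid=param_grid,
                        cv=StratifiedKFold(n_splits=5))
    grid.fit(K_train, data_label[train])

    # Accuracy of the refitted best model on the held-out chunk.
    scores.append(grid.score(K_test, data_label[test]))

print(np.mean(scores))
```

The number of columns in K_test matches the number of training samples, which is the quantity the error message compares against.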
Thanks,
Ev
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
