I have been using the sonar data set (I believe it is a sample data set
used in many machine learning demonstrations). It is a two-class data
set with 60 features and 208 training examples.

I have a question about using sample weights when fitting the SVM model.

When I fit the model on the scaled data without sample weights, I get a
test error of 10.3%. When I fit the same model with a sample weight vector
in which every entry is 1/N, I get a test error of 37%.

Here is the code:

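import numpy as np
from sklearn import svm

# Baseline: RBF SVM fit on the scaled data without sample weights
clf = svm.SVC(kernel='rbf', C=10, gamma=.01)
clf.fit(x_tr_scaled, y_train)

score_scaled_tr = clf.score(x_tr_scaled, y_train)
score_scaled_test = clf.score(x_te_scaled, y_test)

# Uniform sample weights, normalized so that each weight is 1/N
w = np.ones(len(y_train))
w = w / w.sum()

# Same hyperparameters, now fit with the 1/N sample weights
clf1 = svm.SVC(kernel='rbf', C=10, gamma=.01, probability=True)
clf1.fit(x_tr_scaled, y_train, sample_weight=w)

# Score on the scaled training data, matching what the model was fit on
print("Training score with sample weights is", clf1.score(x_tr_scaled, y_train))
print("Score with sample weights is", clf1.score(x_te_scaled, y_test))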

What am I doing wrong here?
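
One possible explanation, sketched here under assumptions: the scikit-learn documentation describes sample_weight in SVC.fit as rescaling C per sample, so weights of 1/N would shrink the effective C from 10 to 10/N. The snippet below uses synthetic data from make_classification as a stand-in for the sonar set (an assumption, not the original data), and the names weighted and small_c are illustrative only. It checks whether a fit with 1/N weights behaves like an unweighted fit with C divided by N.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic stand-in for the sonar data (assumption: not the real data set)
X, y = make_classification(n_samples=128, n_features=60, random_state=0)
n = len(y)

# Uniform weights normalized to sum to 1, so each weight is 1/n
w = np.ones(n) / n

# Fit with C=10 and the 1/n sample weights
weighted = SVC(kernel="rbf", C=10, gamma=0.01).fit(X, y, sample_weight=w)

# Fit without weights but with C divided by n
small_c = SVC(kernel="rbf", C=10 / n, gamma=0.01).fit(X, y)

# Compare the decision functions of the two fitted models
same_decision = np.allclose(
    weighted.decision_function(X), small_c.decision_function(X), atol=1e-6
)
print("Decision functions match:", same_decision)
```

If that reading is right, leaving the weights unnormalized (all ones) or multiplying C by N should bring the weighted fit back in line with the unweighted one.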

Also, when I tried this command:

Pr=predict_proba(x_tr_scaled)

I get the error that predict_proba is an undefined name. However, I got it
from this link:
http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC
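
For reference, the linked page lists predict_proba as a method of the SVC class, so it is called on a fitted estimator rather than as a standalone function. A minimal sketch, assuming synthetic data in place of the sonar set and an estimator created with probability=True:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic two-class data standing in for the scaled training set (assumption)
X, y = make_classification(n_samples=100, n_features=60, random_state=0)

# probability=True is required for predict_proba to be available on SVC
clf1 = SVC(kernel="rbf", C=10, gamma=0.01, probability=True, random_state=0)
clf1.fit(X, y)

# predict_proba is a method of the fitted estimator
Pr = clf1.predict_proba(X)
print(Pr.shape)
```

Each row of Pr holds one probability per class, in the order given by clf1.classes_.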

Any help would be appreciated.

Anne Dwyer
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general