Dear Andy,

Yes, the classes have the same size: 8 and 8.

This is one example of the code I used to cross-validate the classification. (I used StratifiedShuffleSplit here, but I also tried other methods, such as leave-one-out and simple 4-fold cross-validation, and the results did not change much.)

    import numpy as np
    from sklearn import svm
    from sklearn.cross_validation import StratifiedShuffleSplit

    # 100 stratified random splits, 25% of the samples held out each time
    sss = StratifiedShuffleSplit(y, 100, test_size=0.25, random_state=0)
    clf = svm.LinearSVC(penalty="l1", dual=False, C=1, random_state=1)
    cv_scores = []
    for train_index, test_index in sss:
        X_train, X_test = X_scaled[train_index], X_scaled[test_index]
        y_train, y_test = y[train_index], y[test_index]
        clf.fit(X_train, y_train)
        y_pred = clf.predict(X_test)
        # fraction of correctly predicted test labels in this split
        cv_scores.append(np.sum(y_pred == y_test) / float(np.size(y_test)))
    # mean accuracy and two standard deviations, in percent, rounded up
    print "Accuracy ", np.ceil(100*np.mean(cv_scores)), "+/-", np.ceil(200*np.std(cv_scores))

On Apr 26, 2015, at 7:50 PM, Andy wrote:

> Your expectation is right: if you randomly assign labels, you shouldn't
> get more than 50% correct with a large enough dataset.
> I imagine there is some issue in how you shuffled the labels. Without
> the code, it is hard to tell.
> Are you sure the classes have the same size?
>
> On 04/26/2015 11:22 AM, Fabrizio Fasano wrote:
>> Dear Andreas,
>>
>> Thanks a lot for your help,
>>
>> about the random assignment of values to my labels y. What I mean is that,
>> being suspicious about the too-good performance, I changed the labels
>> manually, retaining the 50% 1,0 but in different orders, and the labels were
>> always predicted very well, with accuracy no lower than 60%. I mean, by
>> chance I expected values lower than 50% as well as values higher than 50%. I
>> didn't perform an exhaustive test (I only did it manually for a few
>> combinations)...
>>
>> Fabrizio

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
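
For reference, the manual label-shuffling check discussed in this thread could be automated as a permutation test. The following is only a minimal sketch under stated assumptions, not the code used above: it relies on the newer sklearn.model_selection API (permutation_test_score) rather than sklearn.cross_validation, it uses synthetic noise features as a stand-in for the real X (16 samples, 8 per class), the number of features (50) is arbitrary, and it uses fewer splits than the original 100 to keep the run short. Putting the scaler inside a pipeline is a design choice of this sketch, so that scaling is refit on each training split.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit, permutation_test_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for the real data (assumption): 16 samples, 8 per class,
# with pure-noise features that carry no information about the labels.
rng = np.random.RandomState(0)
X = rng.randn(16, 50)
y = np.array([0] * 8 + [1] * 8)

# Scaling is part of the pipeline, so it is refit on every training split.
clf = make_pipeline(
    StandardScaler(),
    LinearSVC(penalty="l1", dual=False, C=1, random_state=1),
)

# Stratified random splits with 25% held out (20 splits here, for speed).
cv = StratifiedShuffleSplit(n_splits=20, test_size=0.25, random_state=0)

# Score the given labels, then repeat the whole cross-validation on
# 100 random permutations of y to estimate the chance-level distribution.
score, perm_scores, p_value = permutation_test_score(
    clf, X, y, cv=cv, n_permutations=100, scoring="accuracy", random_state=0
)

print("Accuracy on the given labels: %.2f" % score)
print("Shuffled labels: mean %.2f, min %.2f, max %.2f"
      % (perm_scores.mean(), perm_scores.min(), perm_scores.max()))
print("Permutation p-value: %.3f" % p_value)
```

With only 16 samples, the spread of perm_scores shows how far individual shufflings can land from 50%, which is the quantity a few hand-made permutations cannot reveal.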