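You changed the labels only once, and have a test-set size of 4? I would imagine that is where the above-chance accuracy comes from: with only 4 test samples, any single relabeling gives a very noisy estimate. If you repeat over many different random assignments, the accuracy will average out to about 50/50.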
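
To make "repeat over different assignments" concrete, here is a minimal sketch, not your actual pipeline. It assumes a recent scikit-learn (sklearn.model_selection rather than the older sklearn.cross_validation), and the data is a stand-in I made up: 16 samples of pure noise, 8 per class. The function name and the number of features are hypothetical too.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit, cross_val_score
from sklearn.svm import LinearSVC


def shuffled_label_accuracies(n_permutations=50, n_features=50, seed=0):
    """Mean CV accuracy for each of several random relabelings of noise data."""
    rng = np.random.RandomState(seed)
    # Stand-in data (an assumption): 16 samples of pure noise, 8 per class.
    X = rng.randn(16, n_features)
    y = np.array([0] * 8 + [1] * 8)

    # 25% held out, so every test set has 4 samples (2 per class).
    cv = StratifiedShuffleSplit(n_splits=20, test_size=0.25, random_state=0)
    clf = LinearSVC(penalty="l1", dual=False, C=1, random_state=1)

    means = []
    for _ in range(n_permutations):
        y_shuffled = rng.permutation(y)  # reorders labels, keeps the 8/8 balance
        scores = cross_val_score(clf, X, y_shuffled, cv=cv)  # accuracy per split
        means.append(scores.mean())
    return np.array(means)


accs = shuffled_label_accuracies()
print("single relabelings range from %.2f to %.2f" % (accs.min(), accs.max()))
print("mean over all relabelings: %.2f" % accs.mean())
```

Individual relabelings can land well above or below 50%, and only the average over many of them should sit near chance. scikit-learn also ships permutation_test_score, which runs this kind of label-shuffling loop against your real labels and reports a p-value.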

On 04/27/2015 11:33 AM, Fabrizio Fasano wrote:
> Dear Andy,
>
> Yes, the classes have the same size, 8 and 8
>
> this is one example of code I used to cross validate classification (I used
> here StratifiedShuffleSplit, but I also used other methods as leave one out
> or simple 4-fold cross validation, and the result didn't change so much)
>
> from sklearn.cross_validation import StratifiedShuffleSplit
> sss = StratifiedShuffleSplit(y, 100, test_size=0.25, random_state=0)
> clf = svm.LinearSVC(penalty="l1", dual=False, C=1, random_state=1)
>
> cv_scores=[]
> for train_index, test_index in sss:
>     X_train, X_test = X_scaled[train_index], X_scaled[test_index]
>     y_train, y_test = y[train_index], y[test_index]
>     clf.fit(X_train, y_train)
>     y_pred = clf.predict(X_test)
>     cv_scores.append(np.sum(y_pred == y_test) / float(np.size(y_test)))
>
> print "Accuracy ", np.ceil(100*np.mean(cv_scores)), "+/-", np.ceil(200*np.std(cv_scores))
>
> On Apr 26, 2015, at 7:50 PM, Andy wrote:
>
>> Your expectation is right, if you randomly assign labels, you shouldn't
>> get more than 50% correct with a large enough dataset.
>> I imagine there is some issue in how you shuffled the labels. Without
>> the code, it is hard to tell.
>> Are you sure the classes have the same size?
>>
>> On 04/26/2015 11:22 AM, Fabrizio Fasano wrote:
>>> Dear Andreas,
>>>
>>> Thanks a lot for your help,
>>>
>>> about the random assignment of values to my labels y. What I mean is that
>>> being suspicious about the too good performances, I changed the labels
>>> manually, retaining the 50% 1,0 but in different orders, and the labels
>>> were always predicted very well, with accuracy no lower than 60%. I mean,
>>> by chance I expected values lower than 50% as well as values higher than
>>> 50%. I didn't perform an exhaustive test (I only did it manually for few
>>> combinations)...
>>>
>>> Fabrizio

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general