2012/10/31 Afik Cohen <[email protected]>:
>
> Hah, thanks for the explanation :) But yes, the accuracy was terrible. In fact,
> we just ran another cross-validated k=3 run with our current data, and got these
> results:
>
> Training LogisticRegression(C=1.0, class_weight=None, dual=False,
> fit_intercept=True, intercept_scaling=1, penalty=l2, tol=0.0001)
> Running Cross-Validated accuracy testing with 3 folds.
> done [4276.551s]
> Results: Accuracy: 0.639312 (+/- 0.003300)
> Training time: 4276.55051398
> Input Data: (10480, 405562)
> Labels: 1144
>
> As you can see, 63% accuracy with 10480 document vectors with 405562 features.
> Pretty awful compared to LinearSVC which gives us upwards of 95%.

To compare the two models fairly, you need to find the optimal value of 'C'
using grid search for both LinearSVC and LogisticRegression. Only then can you
tell whether one of them yields significantly better predictions than the
other. See the examples linked from the documentation for more details:

http://scikit-learn.org/stable/modules/grid_search.html

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
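
For illustration, here is a minimal sketch of the grid search suggested above. It rests on several assumptions: it uses the current sklearn.model_selection.GridSearchCV import path (releases from the time of this thread exposed it under sklearn.grid_search), a small synthetic dataset from make_classification stands in for the real (10480, 405562) document matrix, and the grid of C values and the tune() helper are hypothetical choices, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Assumption: small synthetic stand-in for the real document vectors.
X, y = make_classification(
    n_samples=300,
    n_features=20,
    n_informative=10,
    n_classes=3,
    random_state=0,
)

# Assumption: an illustrative logarithmic grid of C values.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0, 100.0]}


def tune(estimator):
    """Hypothetical helper: 3-fold grid search over C for one estimator."""
    search = GridSearchCV(estimator, param_grid, cv=3, scoring="accuracy")
    search.fit(X, y)
    return search.best_params_["C"], search.best_score_


# Tune both models on the same data with the same folds setting,
# so that their best cross-validated scores are comparable.
results = {
    "LinearSVC": tune(LinearSVC(max_iter=10000)),
    "LogisticRegression": tune(LogisticRegression(max_iter=1000)),
}

for name, (best_c, best_score) in results.items():
    print(f"{name}: best C={best_c}, mean CV accuracy={best_score:.3f}")
```

Each model is compared at its own best C rather than at the default C=1.0, which is the comparison the reply asks for. On the real data the grid and the number of folds would need adjusting to the available compute time.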
