2012/10/31 Afik Cohen <[email protected]>:
>
> Hah, thanks for the explanation :) But yes, the accuracy was terrible. In
> fact, we just ran another cross-validated k=3 run with our current data,
> and got these results:
>
> Training LogisticRegression(C=1.0, class_weight=None, dual=False,
> fit_intercept=True, intercept_scaling=1, penalty=l2, tol=0.0001)
> Running Cross-Validated accuracy testing with 3 folds.
> done [4276.551s]
> Results: Accuracy: 0.639312 (+/- 0.003300)
> Training time:  4276.55051398
> Input Data: (10480, 405562)
> Labels:  1144
>
> As you can see, 63% accuracy with 10480 document vectors with 405562 features.
> Pretty awful compared to LinearSVC which gives us upwards of 95%.

To compare the two models fairly, you need to tune 'C' with a grid
search for both LinearSVC and LogisticRegression. Only once each model
is using its own best 'C' can you tell whether one of them really yields
significantly better predictions than the other.

See the examples linked from the documentation for more details:

http://scikit-learn.org/stable/modules/grid_search.html
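
As a rough illustration of the idea, here is a minimal sketch. It makes
several assumptions: a small synthetic dataset from make_classification
stands in for the real document vectors, the grid of 'C' values is
arbitrary, and GridSearchCV is imported from sklearn.model_selection as
in recent scikit-learn releases (older releases exposed it from
sklearn.grid_search).

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Synthetic stand-in for the real data (assumption: the actual input is a
# much larger sparse matrix of document vectors with many more labels).
X, y = make_classification(n_samples=300, n_features=50, n_informative=10,
                           n_classes=3, random_state=0)

# Arbitrary log-spaced candidates for C; widen or refine for real data.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0, 100.0]}

models = [
    ("LinearSVC", LinearSVC(max_iter=5000, random_state=0)),
    ("LogisticRegression", LogisticRegression(max_iter=5000)),
]

results = {}
for name, estimator in models:
    # Same grid and the same 3-fold cross-validation for both models,
    # so that the best scores are comparable.
    search = GridSearchCV(estimator, param_grid, cv=3, scoring="accuracy")
    search.fit(X, y)
    results[name] = (search.best_params_["C"], search.best_score_)
    print("%s: best C=%g, mean CV accuracy=%.3f"
          % (name, search.best_params_["C"], search.best_score_))
```

The comparison that matters is between the two best cross-validated
scores, not between the scores obtained with the default C=1.0.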

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
