Here are some results on the 20 newsgroups dataset:
Classifier       train-time  test-time  error-rate
--------------------------------------------------
5-nn                0.0047s   13.6651s      0.5916
random forest     263.3146s    3.9985s      0.2459
sgd                 0.2265s    0.0657s      0.2604
Tuning everything properly should gain a few percent. Here is the
gist, for those who want to play with it:
https://gist.github.com/arjoly/8732555
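
For anyone who wants to reproduce the three columns without opening the
gist, here is a minimal sketch of how train time, test time and error
rate can be measured for any object with fit/predict methods. It is not
the code from the gist: the names benchmark, format_row and
MajorityClassifier are hypothetical, and the stand-in classifier is only
there so the snippet runs without downloading the dataset.

```python
import time
from collections import Counter


def benchmark(clf, X_train, y_train, X_test, y_test):
    """Return (train_time, test_time, error_rate) for one classifier.

    `clf` is assumed to expose fit(X, y) and predict(X), as scikit-learn
    estimators do.
    """
    start = time.perf_counter()
    clf.fit(X_train, y_train)
    train_time = time.perf_counter() - start

    start = time.perf_counter()
    y_pred = clf.predict(X_test)
    test_time = time.perf_counter() - start

    # Error rate = fraction of misclassified test samples.
    n_errors = sum(1 for p, t in zip(y_pred, y_test) if p != t)
    return train_time, test_time, n_errors / len(y_test)


class MajorityClassifier:
    """Hypothetical stand-in: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_] * len(X)


def format_row(name, train_time, test_time, error_rate):
    """Format one result line in the same spirit as the table above."""
    return "%-15s %10.4fs %10.4fs %10.4f" % (name, train_time, test_time, error_rate)
```

The same benchmark function should work unchanged with scikit-learn
estimators (for example a 5-nearest-neighbours classifier, a random
forest or an SGD classifier) fitted on the vectorized 20 newsgroups
data, since it only relies on fit and predict.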
Best,
Arnaud