It's important to bear in mind that some parameters have a huge impact on
performance and that just using the default ones may result in unfair
comparisons. For example, SGDClassifier uses the quite small n_iter=5 by
default whereas liblinear-based algorithms check that the solution is close
enough to the optimum within some tolerance (tol=1e-3 by default).
Increasing the tolerance may significantly accelerate training without
much loss of accuracy. Also, max_iter is hardcoded to 1000 in liblinear (in
case the algorithm doesn't converge).
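To make the point concrete, here is a rough sketch of a comparison where the stopping criteria are set explicitly instead of left at their defaults. It assumes a recent scikit-learn release, where SGDClassifier takes max_iter and tol (the n_iter name mentioned above is the older spelling), and it uses synthetic data from make_classification rather than the dataset from the blog post. The specific values (50 iterations, tolerances of 1e-4 and 1e-1) are arbitrary choices for illustration, not recommendations.

```python
import time

from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.svm import LinearSVC

# Synthetic data, purely for illustration (an assumption, not the blog's data).
X, y = make_classification(n_samples=2000, n_features=50, random_state=0)


def fit_time(estimator):
    """Return the wall-clock seconds needed to fit the estimator on (X, y)."""
    start = time.perf_counter()
    estimator.fit(X, y)
    return time.perf_counter() - start


# Spell out the stopping criteria so that each model's budget is visible.
# The values below are arbitrary examples.
models = {
    "sgd": SGDClassifier(max_iter=50, tol=1e-3, random_state=0),
    "liblinear_tight": LinearSVC(tol=1e-4, max_iter=1000, random_state=0),
    "liblinear_loose": LinearSVC(tol=1e-1, max_iter=1000, random_state=0),
}

timings = {name: fit_time(model) for name, model in models.items()}
scores = {name: model.score(X, y) for name, model in models.items()}

for name in models:
    print(f"{name}: {timings[name]:.4f}s, train accuracy {scores[name]:.3f}")
```

Comparing the two LinearSVC rows gives a feel for how much the tolerance alone moves the timing. On a real benchmark the accuracy should of course be measured on held-out data.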
Another point is that not all algorithms are equal with respect to the
number of hyperparameters. In principle, we should compare the time that it
takes for training, including hyperparameter tuning.
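One way to account for tuning, sketched under assumptions: time the whole grid search rather than a single fit. The grid over C and the 3-fold cross-validation below are hypothetical choices, and the data is again synthetic. GridSearchCV and its refit_time_ attribute come from sklearn.model_selection in recent scikit-learn releases.

```python
import time

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Synthetic data, purely for illustration.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Hypothetical grid: a model with more hyperparameters would need a larger
# one, and the search cost grows with the number of candidates.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(LinearSVC(max_iter=1000, random_state=0), param_grid, cv=3)

start = time.perf_counter()
search.fit(X, y)  # cross-validates every candidate, then refits the best one
total_time = time.perf_counter() - start

print(f"tuning + final fit: {total_time:.3f}s")
print(f"final fit alone:    {search.refit_time_:.3f}s")
print(f"best params:        {search.best_params_}")
```

The gap between the two timings is the cost that a comparison based on a single fit with hand-picked parameters leaves out.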
Mathieu
On Sun, Jun 24, 2012 at 9:26 PM, Olivier Grisel <[email protected]> wrote:
> Here is the link:
>
>
> http://blog.explainmydata.com/2012/06/ntrain-24853-ntest-25147-ncorrupt.html
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>