2012/6/24 Mathieu Blondel <[email protected]>:
> It's important to bear in mind that some parameters have a huge impact on
> performance and that just using the default ones may result in unfair
> comparisons. For example, SGDClassifier uses the quite small n_iter=5 by
> default, whereas liblinear-based algorithms check that the solution is close
> enough to the optimum within some tolerance (tol=1e-3 by default).
> Increasing the tolerance may significantly accelerate training without
> much accuracy loss. Also, max_iter is hardcoded to 1000 in liblinear (in
> case the algorithm doesn't converge).
>
> Another point is that not all algorithms are equal with respect to the
> number of hyperparameters. In principle, we should compare the time it
> takes for training, including hyperparameter tuning.
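A minimal sketch (not from the thread) of the trade-off described above: the SGD epochs cap versus liblinear's tolerance-based stopping. Note this is written against the current scikit-learn API, where SGDClassifier's epoch cap is `max_iter` (the thread's `n_iter` was the 2012-era name), and `LinearSVC` is assumed as the liblinear-based estimator; the dataset is synthetic.

```python
# Sketch: how the stopping criteria mentioned above affect training.
# Assumptions: modern scikit-learn API (max_iter instead of n_iter),
# LinearSVC as the liblinear-style estimator, synthetic data.
import time

from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=2000, n_features=50, random_state=0)

# SGDClassifier stops after a fixed, small number of epochs (5 here,
# mirroring the old n_iter=5 default), regardless of closeness to the optimum.
sgd = SGDClassifier(max_iter=5, tol=None, random_state=0).fit(X, y)
print(f"SGD (5 epochs): train acc={sgd.score(X, y):.3f}")

# LinearSVC instead iterates until it is within `tol` of the optimum;
# loosening tol can speed up training with little accuracy loss.
for tol in (1e-3, 1e-1):
    t0 = time.time()
    svc = LinearSVC(tol=tol, dual=False).fit(X, y)
    print(f"LinearSVC tol={tol}: {time.time() - t0:.3f}s, "
          f"train acc={svc.score(X, y):.3f}")
```

Timings on a toy problem like this are only illustrative; the point is that the two families of solvers answer "when to stop" differently, so default settings alone do not make for a fair benchmark.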
I agree; still, having such orders of magnitude in mind is quite useful for practitioners new to the project.

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
