2012/6/24 Mathieu Blondel <[email protected]>:
> It's important to bear in mind that some parameters have a huge impact on
> performance and that just using the default ones may result in unfair
> comparisons. For example, SGDClassifier uses the quite small n_iter=5 by
> default, whereas liblinear-based algorithms check that the solution is close
> enough to the optimum within some tolerance (tol=1e-3 by default).
> Increasing the tolerance may significantly accelerate training without
> much accuracy loss. Also, max_iter is hardcoded to 1000 in liblinear (in
> case the algorithm doesn't converge).
>
> Another point is that not all algorithms are equal with respect to the
> number of hyperparameters. In principle, we should compare the time it
> takes for training, including hyperparameter tuning.
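A minimal sketch (not from the thread) of the trade-off described above: the SGD epochs cap versus liblinear's tolerance-based stopping. Note this is written against the current scikit-learn API, where SGDClassifier's epoch cap is `max_iter` (the thread's `n_iter` was the 2012-era name), and `LinearSVC` is assumed as the liblinear-based estimator; the dataset is synthetic.

```python
# Sketch: how the stopping criteria mentioned above affect training.
# Assumptions: modern scikit-learn API (max_iter instead of n_iter),
# LinearSVC as the liblinear-style estimator, synthetic data.
import time

from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=2000, n_features=50, random_state=0)

# SGDClassifier stops after a fixed, small number of epochs (5 here,
# mirroring the old n_iter=5 default), regardless of closeness to the optimum.
sgd = SGDClassifier(max_iter=5, tol=None, random_state=0).fit(X, y)
print(f"SGD (5 epochs): train acc={sgd.score(X, y):.3f}")

# LinearSVC instead iterates until it is within `tol` of the optimum;
# loosening tol can speed up training with little accuracy loss.
for tol in (1e-3, 1e-1):
    t0 = time.time()
    svc = LinearSVC(tol=tol, dual=False).fit(X, y)
    print(f"LinearSVC tol={tol}: {time.time() - t0:.3f}s, "
          f"train acc={svc.score(X, y):.3f}")
```

Timings on a toy problem like this are only illustrative; the point is that the two families of solvers answer "when to stop" differently, so default settings alone do not make for a fair benchmark.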
I agree; still, having such orders of magnitude in mind is quite useful for practitioners new to the project.

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
