On 17.04.2012 14:14, Olivier Grisel wrote:
> On 17 April 2012 02:45, Gael Varoquaux <[email protected]> wrote:
>> @scikit-learn developers:
>>
>> Hum...
>> http://www.flickr.com/photos/scriptingnews/3503448168/sizes/o/in/photostream/
> hahaha
>
>> The situation is that the authors of libSVM have chosen a solution that
>> leads to an inconsistent estimator with bad statistical properties, but
>> works well on many datasets. I think it is wrong, but then, I am worried
>> that this might be a battle that we might not win.
>>
>> On the one hand, we really cannot have C the way the libSVM guys have
>> defined it, because parameter setting by cross-validation will not work.
>> On the other hand, it is clear that people keep tripping over this
>> difference. Should we introduce a different name, so that it forces
>> people to read the docs?
> Or we could revert back to `scale_C=False` by default and let
> statistically consistent people turn it on explicitly when they need
> it (i.e. to do model selection in the low `n_samples` case).
>
> This way people who don't read the doc (the majority of the users)
> will not fall in the libsvm-gives-different-results trap and will have
> the tools to not fall in the statistical inconsistency trap if they
> make the effort to read the doc.
>
+ .5
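
To make the two conventions discussed above concrete, here is a minimal sketch in plain Python. It is not scikit-learn or libSVM code. It assumes that `scale_C=True` means C is divided by `n_samples` before being applied to each sample's loss, and that `scale_C=False` uses C as is (the libSVM convention). The helper names `per_sample_penalty` and `total_data_weight` are made up for illustration.

```python
def per_sample_penalty(C, n_samples, scale_C):
    """Weight on each sample's loss in an objective of the form
    0.5 * ||w||^2 + penalty * sum_i loss_i.

    Assumption (not taken from any library source): scale_C=True
    divides C by n_samples; scale_C=False leaves C untouched.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return C / n_samples if scale_C else C


def total_data_weight(C, n_samples, scale_C):
    """Total weight of the data-fit term relative to the regularizer."""
    return per_sample_penalty(C, n_samples, scale_C) * n_samples


if __name__ == "__main__":
    C = 1.0
    # Hypothetical sizes: a cross-validation training fold and the full set.
    for n_samples in (80, 100):
        for scale_C in (False, True):
            weight = total_data_weight(C, n_samples, scale_C)
            print(n_samples, scale_C, weight)
```

Under these assumptions, the unscaled convention gives the data-fit term a total weight that grows with `n_samples`, so a C chosen on a smaller cross-validation fold corresponds to a different balance once the model is refit on all the data. With the scaled convention that total weight stays at C whatever the sample count, but the same numeric C no longer matches what libSVM itself would use.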
