On 17.04.2012 14:14, Olivier Grisel wrote:
> On 17 April 2012 02:45, Gael Varoquaux<[email protected]> wrote:
>> @scikit-learn developers:
>>
>> Hum...
>> http://www.flickr.com/photos/scriptingnews/3503448168/sizes/o/in/photostream/
> hahaha
>
>> The situation is that the authors of libSVM have chosen a solution that
>> leads to an inconsistent estimator with bad statistical properties, but
>> works well on many datasets. I think it is wrong, but then, I am worried
>> that this might be a battle that we might not win.
>>
>> On the one hand, we really cannot have C the way the libSVM guys have
>> defined it, because parameter setting by cross-validation will not work.
>> On the other hand, it is clear that people keep tripping over this
>> difference. Should we introduce a different name, so that it forces
>> people to read the docs?
> Or we could revert back to `scale_C=False` by default and let
> statistically consistent people turn it on explicitly when they need
> it (i.e. to do model selection in the low `n_samples` case).
>
> This way people who don't read the doc (the majority of the users)
> will not fall into the libsvm-gives-different-results trap and will have
> the tools to avoid the statistical inconsistency trap if they
> make the effort to read the doc.
>
+ .5
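
For anyone skimming the thread, here is a minimal sketch of the two conventions being compared. It assumes that `scale_C=True` simply divides C by `n_samples` before handing it to the libSVM-style objective (0.5 * ||w||^2 + C * sum of per-sample losses); that is my reading of the discussion, not a quote of the implementation. The helper names are hypothetical and not part of scikit-learn.

```python
def effective_libsvm_C(C, n_samples, scale_C):
    """C as seen by a libSVM-style objective (hypothetical helper).

    Assumption: scale_C=True means C is divided by n_samples.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return C / float(n_samples) if scale_C else float(C)


def total_loss_weight(C, n_samples, scale_C):
    """Upper bound on the loss term's weight relative to 0.5 * ||w||^2."""
    return effective_libsvm_C(C, n_samples, scale_C) * n_samples


if __name__ == "__main__":
    # Compare a cross-validation training fold with the full data set.
    for n in (80, 100):
        print(n,
              total_loss_weight(1.0, n, scale_C=False),
              total_loss_weight(1.0, n, scale_C=True))
```

Without scaling, the weight of the loss term grows with the number of samples, so a C tuned on a training fold does not mean the same thing on the full data set. With scaling it stays put, at the cost of no longer matching libSVM's numbers for the same C.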

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
