What's killing me is that Andy's plot shows that scale_C is the way to
go, so it's not just me. Also, the libsvm/liblinear bindings are the only
models that have a regularization parameter that depends on the
number of samples. Either we stick to libsvm and we have an
inconsistent grid search + inconsistent behavior across estimators,
or we go the clean way and take the risk of having people report
"SERIOUS BUGS".
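
To make the dependence on the number of samples concrete, here is a rough
sketch in plain Python. It assumes that scale_C=True amounts to dividing C
by n_samples, and that the unscaled objective is the usual
0.5 * ||w||^2 + C * sum(hinge losses). The helper names are made up for
illustration and are not the library's API.

```python
# Hypothetical helpers: a sketch of the two objective forms, not library code.

def hinge_loss_sum(w, b, X, y):
    """Sum over samples of the hinge loss max(0, 1 - y_i * (w . x_i + b))."""
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        total += max(0.0, 1.0 - margin)
    return total


def l2_penalty(w):
    """0.5 * ||w||^2, the regularization term."""
    return 0.5 * sum(wj * wj for wj in w)


def objective_unscaled(w, b, X, y, C):
    # libsvm-style form: the loss term grows with the number of samples.
    return l2_penalty(w) + C * hinge_loss_sum(w, b, X, y)


def objective_scaled(w, b, X, y, C):
    # Assumed scale_C=True form: C is divided by n_samples, so the loss
    # term is an average rather than a sum.
    return l2_penalty(w) + C * hinge_loss_sum(w, b, X, y) / len(X)


X = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
y = [1, -1, -1]
w, b, C = [0.5, -0.5], 0.0, 1.0

# Repeat the same data: the unscaled objective changes, the scaled one does not.
for copies in (1, 2, 4):
    print(copies,
          objective_unscaled(w, b, X * copies, y * copies, C),
          objective_scaled(w, b, X * copies, y * copies, C))
```

With the data repeated, the unscaled loss term grows while the penalty
stays fixed, so the same C means weaker regularization; the scaled form
gives the same value for every copy count.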

pick your side…

Alex

On Tue, Apr 17, 2012 at 3:00 PM, Gael Varoquaux
<[email protected]> wrote:
> On Tue, Apr 17, 2012 at 02:56:13PM +0200, Lars Buitinck wrote:
>> >> > This way people who don't read the doc (the majority of the users)
>> >> > will not fall in the libsvm-gives-different-results trap and will have
>> >> > the tools to not fall in the statistical inconsistency trap if they
>> >> > make the effort to read the doc.
>
>> >> + .5
>
>> > +1
>
>> +1
>
> It seems to me that we are hearing here from the people with large numbers
> of samples, who do not have the problems that scale_C=False creates, saying
> that they prefer this default choice.
>
> :(. Basically the impression that I have is that whichever choice we make,
> we are breaking the library for a set of users.
>
>> > And we could add a warning in grid_search.py:
>
>> > import warnings
>> > if not getattr(clf, "scale_C", True):
>> >     warnings.warn("scale_C=False is not recommended when using grid "
>> >                   "search: see http:// for a discussion")
>
>> I'm not very fond of adding estimator-specific heuristics to
>> general-purpose modules...
>
> I agree. This is clearly a code smell, telling us that something is
> wrong with our objects: they are unable to abstract away enough of the
> details of the model.
>
> G
>
> ------------------------------------------------------------------------------
> Better than sec? Nothing is better than sec when it comes to
> monitoring Big Data applications. Try Boundary one-second
> resolution app monitoring today. Free.
> http://p.sf.net/sfu/Boundary-dev2dev
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
