On 17 April 2012 09:20, Gael Varoquaux wrote:
> On Tue, Apr 17, 2012 at 04:16:47PM +0200, Gael Varoquaux wrote:
>> What do people think about my solution 'scale_params'? I thought that it
>> was a way to make everybody happy, but I don't seem to be getting
>> traction.
>
> I have opened a ticke
On Tue, Apr 17, 2012 at 07:27:39AM -0700, Olivier Grisel wrote:
> > Btw I feel it is somewhat of a problem to undo what was done in the current
> > master, as I would guess some people are already working with that.
> I assume that people working with the master can expect this kind of
> semanti
On Tue, Apr 17, 2012 at 04:22:50PM +0200, Alexandre Gramfort wrote:
> what would be the semantic of scale_params?
scale_params=False by default
scale_params=True would scale the parameters by a data-dependent term
such as C_min.
> shall we touch every estimator
No, because for some estimators w
On Tue, Apr 17, 2012 at 04:16:47PM +0200, Gael Varoquaux wrote:
> What do people think about my solution 'scale_params'? I thought that it
> was a way to make everybody happy, but I don't seem to be getting
> traction.
I have opened a ticket with this idea.
As it's Jaques' birthday today, he was
On Wed, Apr 18, 2012 at 01:10:12AM +0900, Mathieu Blondel wrote:
>On Tue, Apr 17, 2012 at 11:16 PM, Gael Varoquaux wrote:
> What do people think about my solution 'scale_params'? I thought that it
> was a way to make everybody happy, but I don'
On Tue, Apr 17, 2012 at 11:16 PM, Gael Varoquaux wrote:
>
> What do people think about my solution 'scale_params'? I thought that it
> was a way to make everybody happy, but I don't seem to be getting
> traction.
>
>
What would be the default value for "scale_param
On 17 April 2012 07:23, Andreas Mueller wrote:
>
> Btw I feel it is somewhat of a problem to undo what was done in the current
> master, as I would guess some people are already working with that.
I assume that people working with the master can expect this kind of
semantic shift to occur fr
what would be the semantic of scale_params?
shall we touch every estimator or assume scale_params=True if not
present as attribute?
Alex
On Tue, Apr 17, 2012 at 4:16 PM, Gael Varoquaux
wrote:
> On Tue, Apr 17, 2012 at 03:46:10PM +0200, Andreas Mueller wrote:
>> I agree that they show that scali
On 17.04.2012 16:16, Gael Varoquaux wrote:
> On Tue, Apr 17, 2012 at 03:46:10PM +0200, Andreas Mueller wrote:
>> I agree that they show that scaling C seems better.
>> BUT: I would not agree with Gael that scale_C=False is broken.
>> Even with few samples, it is very hard to actually generate the
On Tue, Apr 17, 2012 at 03:46:10PM +0200, Andreas Mueller wrote:
> I agree that they show that scaling C seems better.
> BUT: I would not agree with Gael that scale_C=False is broken.
> Even with few samples, it is very hard to actually generate the problem.
> You need to have a learning problem
On Tue, Apr 17, 2012 at 10:48 PM, Andreas Mueller
wrote:
> On 17.04.2012 15:45, Mathieu Blondel wrote:
>
> On Tue, Apr 17, 2012 at 10:31 PM, Olivier Grisel wrote:
>
>>
>> 1- use C and scale_C=False by default and document extensively the
>> importance of scale_C=True when doing model
On 17.04.2012 15:45, Mathieu Blondel wrote:
On Tue, Apr 17, 2012 at 10:31 PM, Olivier Grisel wrote:
1- use C and scale_C=False by default and document extensively the
importance of scale_C=True when doing model selection with small
number of s
On 17.04.2012 15:06, Alexandre Gramfort wrote:
> what's killing me is that andy's plot shows that scale_C is the way to
> go so it's not just me. Also libsvm/liblinear bindings are the only
> models that have a regularization parameter that depends on the
> numbers of samples. Either we stick to
On Tue, Apr 17, 2012 at 10:31 PM, Olivier Grisel
wrote:
>
> 1- use C and scale_C=False by default and document extensively the
> importance of scale_C=True when doing model selection with small
> number of samples. (I am ok for the ugly warning in the grid search
> class).
>
Setting scale_C to No
2012/4/17 Olivier Grisel:
> ...
>
> Has anybody tried to confirm that this is a libsvm / liblinear
> specific thing? How do shogun, svmlight and other non-libsvm SVM
> implementation deal with this?
As far as I can tell svm^light uses the same formulation as libsvm;
For svm^rank they changed it t
On 17 April 2012 06:31, Olivier Grisel wrote:
>
> 2- use alpha as in the rest of the other scikit-learn models and have
> the default value of alpha set to None or "auto" that will be set to
> `n_samples` in the fit method since `C=1` (unscaled) gives a good
> baseline in practice on normalized
On 17 April 2012 06:06, Alexandre Gramfort wrote:
> what's killing me is that andy's plot shows that scale_C is the way to
> go so it's not just me. Also libsvm/liblinear bindings are the only
> models that have a regularization parameter that depends on the
> numbers of samples.
Has anybody t
On Tue, Apr 17, 2012 at 10:07:38PM +0900, Mathieu Blondel wrote:
>We can rename scale_C to scale_penalty or scale_params and use this option
>wherever there's a dataset size-dependent option in the constructor...
Please, will you stop reading my mind. It's a bit disturbing. Especially
sinc
On Apr 17, 2012, at 15:53 , Alexandre Gramfort wrote:
>> I think just moving from a train set to a test set would be problematic for
>> small n_samples.
>
> what do you suggest?
>
I agree with your scale_C=None suggestion because it would (in theory) force
the user to become aware of what th
On Tue, Apr 17, 2012 at 02:39:33PM +0200, Alexandre Gramfort wrote:
> ok I give up… Let's move back to scale_C=None that spits a warning to
> strongly suggest users to make their choice.
We could do it, but it's broken. Basically this choice would be accepting
that in the small sample situation yo
On Tue, Apr 17, 2012 at 9:56 PM, Lars Buitinck wrote:
>
> I'm not very fond of adding estimator-specific heuristics to
> general-purpose modules...
>
We can rename scale_C to scale_penalty or scale_params and use this option
wherever there's a dataset size-dependent option in the constructor...
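A minimal sketch of how such a scaling flag could behave, for readers following the thread. The helper name `resolve_C` is hypothetical and is not scikit-learn API; it assumes that "scaling" means dividing C by the number of samples, and that an unset flag (`None`) only warns and falls back to the unscaled libsvm behaviour, as suggested elsewhere in this discussion.

```python
import warnings


def resolve_C(C, n_samples, scale_C=None):
    """Return the C value handed to the solver (hypothetical helper)."""
    if scale_C is None:
        # Assumption: an unset flag warns, then behaves like scale_C=False.
        warnings.warn(
            "scale_C is not set; using the unscaled libsvm convention. "
            "Set scale_C explicitly to silence this warning.",
            UserWarning,
        )
        scale_C = False
    if scale_C:
        # Assumption: scaling divides C by the number of training samples.
        return C / float(n_samples)
    return C


print(resolve_C(1.0, 100, scale_C=True))   # scaled value
print(resolve_C(1.0, 100, scale_C=False))  # libsvm convention, unchanged
```

The same pattern would apply to any constructor option whose meaning depends on the dataset size, which is the case made for a generic name such as scale_params.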
what's killing me is that andy's plot shows that scale_C is the way to
go so it's not just me. Also libsvm/liblinear bindings are the only
models that have a regularization parameter that depends on the
numbers of samples. Either we stick to libsvm and we have an
inconsistent grid search + an incon
On 17/04/2012, Gael Varoquaux wrote:
> On Tue, Apr 17, 2012 at 02:56:13PM +0200, Lars Buitinck wrote:
>> >> > This way people who don't read the doc (the majority of the users)
>> >> > will not fall in the libsvm-gives-different-results trap and will
>> >> > have
>> >> > the tools to not fall in t
On Tue, Apr 17, 2012 at 02:56:13PM +0200, Lars Buitinck wrote:
> >> > This way people who don't read the doc (the majority of the users)
> >> > will not fall in the libsvm-gives-different-results trap and will have
> >> > the tools to not fall in the statistical inconsistency trap if they
> >> > ma
> I'm not very fond of adding estimator-specific heuristics to
> general-purpose modules...
I agree…
it looks like a deadlock…
Alex
On 17 April 2012 14:28, Mathieu Blondel wrote:
> On Tue, Apr 17, 2012 at 9:17 PM, Andreas Mueller
> wrote:
>>
>> > This way people who don't read the doc (the majority of the users)
>> > will not fall in the libsvm-gives-different-results trap and will have
>> > the tools t
> I think just moving from a train set to a test set would be problematic for
> small n_samples.
what do you suggest?
Alex
I think just moving from a train set to a test set would be problematic for
small n_samples.
Vlad
On Apr 17, 2012, at 15:48 , Olivier Grisel wrote:
> On 17 April 2012 05:39, Gael Varoquaux wrote:
>> On Tue, Apr 17, 2012 at 03:35:26PM +0300, Dimitrios Pritsos wrote:
>>>If you would li
On Tue, Apr 17, 2012 at 05:48:14AM -0700, Olivier Grisel wrote:
> _it does not work_ => grid search / model selection does not work.
More generally, your value for C must change depending on the number of
samples that you have.
G
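A small sketch of the arithmetic behind this point. In the libsvm formulation the objective is 0.5*||w||^2 + C * sum_i loss_i, so the data term grows with the number of samples. Dividing by C * n_samples gives mean(loss) + alpha * 0.5*||w||^2 with alpha = 1 / (C * n_samples). The function names below are illustrative only and are not part of any library.

```python
import math


def per_sample_alpha(C, n_samples):
    """Regularization strength per sample implied by a libsvm-style C."""
    return 1.0 / (C * n_samples)


def rescale_C(C, n_from, n_to):
    """C that keeps per_sample_alpha constant when the sample count changes."""
    return C * n_from / n_to


# Example: C picked on cross-validation folds of 80 samples,
# then the model is refit on all 100 samples.
C_cv = 1.0
C_full = rescale_C(C_cv, n_from=80, n_to=100)

# Both settings imply the same per-sample regularization.
assert math.isclose(per_sample_alpha(C_cv, 80), per_sample_alpha(C_full, 100))
print(C_full)
```

With an unscaled C, the value selected on the folds is not the right one for the full training set, and the mismatch is largest when n_samples is small.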
---
On 17 April 2012 05:39, Gael Varoquaux wrote:
> On Tue, Apr 17, 2012 at 03:35:26PM +0300, Dimitrios Pritsos wrote:
>> If you would like the opinion of a user (i.e. me) I think this is the best
>> solution for intuitive use of the Lib. And having scale_C=False as
>> default.
>
> For small
>> > This way people who don't read the doc (the majority of the users)
>> > will not fall in the libsvm-gives-different-results trap and will have
>> > the tools to not fall in the statistical inconsistency trap if they
>> > make the effort to read the doc.
>>
>> + .5
>
> +1
>
> And we could add a
On Tue, Apr 17, 2012 at 03:35:26PM +0300, Dimitrios Pritsos wrote:
>If you would like the opinion of a user (i.e. me) I think this is the best
>solution for intuitive use of the Lib. And having scale_C=False as
>default.
For small number of samples, _it does not work_. Period, there is n
On 04/17/2012 03:28 PM, Mathieu Blondel wrote:
On Tue, Apr 17, 2012 at 9:17 PM, Andreas Mueller wrote:
> This way people who don't read the doc (the majority of the users)
> will not fall in the libsvm-gives-different-results trap and
will h
On Tue, Apr 17, 2012 at 9:17 PM, Andreas Mueller
wrote:
> > This way people who don't read the doc (the majority of the users)
> > will not fall in the libsvm-gives-different-results trap and will have
> > the tools to not fall in the statistical inconsistency trap if they
> > make the effort to r
On 17.04.2012 14:14, Olivier Grisel wrote:
> On 17 April 2012 02:45, Gael Varoquaux wrote:
>> @scikit-learn developers:
>>
>> Hum...
>> http://www.flickr.com/photos/scriptingnews/3503448168/sizes/o/in/photostream/
> hahaha
>
>> The situation is that the authors of libSVM have chosen a solu
On 17 April 2012 02:45, Gael Varoquaux wrote:
> @scikit-learn developers:
>
> Hum...
> http://www.flickr.com/photos/scriptingnews/3503448168/sizes/o/in/photostream/
hahaha
> The situation is that the authors of libSVM have chosen a solution that
> leads to inconsistent estimator with bad stat
On 17.04.2012 11:45, Gael Varoquaux wrote:
> @scikit-learn developers:
>
> Hum...
> http://www.flickr.com/photos/scriptingnews/3503448168/sizes/o/in/photostream/
My office mate just asked me whether that was the scikits users in front
and the developers in the back :-/
--
On Tue, Apr 17, 2012 at 11:45 AM, Gael Varoquaux wrote:
>
> On the one hand, we really cannot have C the way the libSVM guys have
> defined it, because parameter setting by cross-validation will not work.
> On the other hand, it is clear that people keep tripping ove
@scikit-learn developers:
Hum...
http://www.flickr.com/photos/scriptingnews/3503448168/sizes/o/in/photostream/
The situation is that the authors of libSVM have chosen a solution that
leads to an inconsistent estimator with bad statistical properties, but
works well on many datasets. I think it is wr
Hello G,
Yes, you are right: scale_C should be False for it to work as expected.
Great because I prefer to work with the latest version.
Thank you G
Dimitrios
On 04/17/2012 12:13 PM, Dimitrios Pritsos wrote:
>
> Ok I will do that now and I will let you know in 45 min
>
> On 04/17/2012 12:10
Ok I will do that now and I will let you know in 45 min
On 04/17/2012 12:10 PM, Gael Varoquaux wrote:
> On Tue, Apr 17, 2012 at 12:08:46PM +0300, Dimitrios Pritsos wrote:
>> I was running a test using SVC(C=1, kernel='linear') and I found that
>> for the latest version of sklearn the results are
On Tue, Apr 17, 2012 at 12:08:46PM +0300, Dimitrios Pritsos wrote:
> I was running a test using SVC(C=1, kernel='linear') and I found that
> for the latest version of sklearn the results are WRONG!
What does 'wrong' mean?
Something that changed in the scikit is that 'C' is scaled by the
num
Hello List,
I was running a test using SVC(C=1, kernel='linear') and I found that
for the latest version of sklearn the results are WRONG!
So I rolled back with git to this HEAD
commit 5c2a8696e3184fdb5e2ca5c55e61fe29ebd37fbb
Author: Andreas Mueller
Date: Mon Jan 23 21:13:55 2012 +0100