Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-04-10 Thread Andreas Mueller
On 04/01/2012 09:27 PM, Alexandre Gramfort wrote: Afaik, it was with an l1-penalized logistic. In my experience, l2-penalized models are less sensitive to the choice of the penalty parameter, and hinge loss (aka SVM) is less sensitive than l2 or logistic loss. > indeed. > >> I thin

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-04-01 Thread Alexandre Gramfort
>>> Afaik, it was with an l1-penalized logistic. In my experience, >>> l2-penalized models are less sensitive to the choice of the penalty >>> parameter, and hinge loss (aka SVM) is less sensitive than l2 or >>> logistic loss. indeed. > I think you need a dataset with n_features >> n_samples with ma

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-04-01 Thread Olivier Grisel
On 1 April 2012 at 16:38, Andreas wrote: > On 04/01/2012 04:34 PM, Gael Varoquaux wrote: >> On Sun, Apr 01, 2012 at 04:23:36PM +0200, Andreas wrote: >> >>> @Alex, could you maybe give the setting again where you had >>> issues doing grid search without scale_C? >>> >> Afaik, it was with a l1-penal

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-04-01 Thread Andreas
On 04/01/2012 04:34 PM, Gael Varoquaux wrote: > On Sun, Apr 01, 2012 at 04:23:36PM +0200, Andreas wrote: > >> @Alex, could you maybe give the setting again where you had >> issues doing grid search without scale_C? >> > Afaik, it was with a l1-penalized logistic. In my experience, > l2-pe

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-04-01 Thread Gael Varoquaux
On Sun, Apr 01, 2012 at 04:23:36PM +0200, Andreas wrote: > @Alex, could you maybe give the setting again where you had > issues doing grid search without scale_C? Afaik, it was with an l1-penalized logistic. In my experience, l2-penalized models are less sensitive to the choice of the penalty paramete

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-04-01 Thread Andreas
> Something that bothers me though, is that with libsvm, C=1 or C=10 > seems to be a reasonable default that works well both for datasets with > size n_samples=100 and n_samples=1 (by playing with the range of > datasets available in the scikit). On the other hand alpha would have > to be grid

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-22 Thread Paolo Losi
On Thu, Mar 22, 2012 at 2:11 AM, Olivier Grisel wrote: > Something that bothers me though, is that with libsvm, C=1 or C=10 > seems to be a reasonable default that works well both for datasets with > size n_samples=100 and n_samples=1 (by playing with the range of > datasets available in the sc

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-22 Thread Gael Varoquaux
On Thu, Mar 22, 2012 at 08:42:03AM +0100, Andreas wrote: > > It is also my gut feeling that dividing the regularization term by > > n_samples makes the optimal value *more* dependent on the dataset size > > rather than the opposite. That might be the reason why C is not scaled > > in the SVM literat

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-22 Thread Alexandre Gramfort
hi, 1/ I agree with Gael. When writing the maths you don't want to carry around at every line n_samples and for the sparse regression that produces papers with no n_samples scaling but implementations that do scale (e.g. R packages like glmnet for example) 2/ What you say Olivier is interesting a

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-22 Thread Andreas
On 03/22/2012 02:11 AM, Olivier Grisel wrote: > On 22 March 2012 at 01:09, David Warde-Farley wrote: > >> >>> That said, I agree with James that the docs should be much more >>> explicit about what is going on, and how what we have differs from >>> libsvm. >>> >> I think that r

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread Gael Varoquaux
On Wed, Mar 21, 2012 at 08:09:26PM -0400, David Warde-Farley wrote: > I think it's less about disagreeing with libsvm than disagreeing with the > notation of every textbook presentation I know of. I agree that libsvm is no > golden calf. But it is also the case for the lasso: the loss term is th

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread Mathieu Blondel
On Thu, Mar 22, 2012 at 9:09 AM, David Warde-Farley wrote: > In particular, doing 1 vs rest for logistic regression seems like > an odd choice when there is a perfectly good multiclass generalization of > logistic regression. Mathieu clarified to me last night how liblinear is > calculating "

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread Mathieu Blondel
On Thu, Mar 22, 2012 at 3:35 AM, James Bergstra wrote: > Also, isn't the feature normalization supposed to be done on a > fold-by-fold basis? If you're doing that, you have a different kernel > matrix in every fold anyway. Indeed, if you really want to be clean, you would need to do that bu
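As a rough illustration of the fold-by-fold normalization discussed in this message, the sketch below computes the scaling statistics on the training part of a fold only and applies them unchanged to the held-out part. It is plain Python with hypothetical helper names (fit_scaler, apply_scaler) and made-up toy data; it is not code from the thread or from sklearn.

```python
def fit_scaler(rows):
    """Return per-feature (mean, std) computed from the training rows only."""
    n = len(rows)
    stats = []
    for column in zip(*rows):
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        std = var ** 0.5 or 1.0  # fall back to 1.0 for constant features
        stats.append((mean, std))
    return stats


def apply_scaler(rows, stats):
    """Standardize rows with statistics that were fitted elsewhere."""
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


# Toy fold (assumed data): the second feature is constant in the training part.
train = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
test = [[3.0, 12.0]]

stats = fit_scaler(train)                # statistics come from the training part only
train_scaled = apply_scaler(train, stats)
test_scaled = apply_scaler(test, stats)  # held-out rows reuse the training statistics
```

Because the statistics are refitted in every fold, each fold effectively sees a slightly different feature scaling, which is the point made in the quoted question about the kernel matrix.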

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread Olivier Grisel
On 22 March 2012 at 01:09, David Warde-Farley wrote: > >> That said, I agree with James that the docs should be much more >> explicit about what is going on, and how what we have differs from >> libsvm. > > I think that renaming sklearn's scaled version of "C" is probably a start. > Using the name

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread David Warde-Farley
On 2012-03-21, at 7:25 PM, Gael Varoquaux wrote: > I'd like to stress that I don't think that following libsvm is much of a > goal per se. I understand that it makes the life of someone like James > easier, because he knows libsvm well and can relate to it. I think it's less about disagreeing wi

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread Gael Varoquaux
I've stayed quiet in this discussion because I was busy elsewhere. The good thing is that it has allowed me to hear the points of view of different people. Here is mine. First, the decision we took can be undone. It is not final, and the way that it should be taken is to make our users' lives easiest

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread James Bergstra
On Wed, Mar 21, 2012 at 6:46 AM, Olivier Grisel wrote: > On 21 March 2012 at 11:14, Mathieu Blondel wrote: >> On Mon, Mar 19, 2012 at 1:22 AM, Andreas wrote: >> >>> Are there any other options? >> >> Another solution is to perform cross-validation using non-scaled C >> values, select the best one

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread Olivier Grisel
On 21 March 2012 at 11:14, Mathieu Blondel wrote: > On Mon, Mar 19, 2012 at 1:22 AM, Andreas wrote: > >> Are there any other options? > > Another solution is to perform cross-validation using non-scaled C > values, select the best one and scale it before refitting with the > entire dataset (to tak

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-21 Thread Mathieu Blondel
On Mon, Mar 19, 2012 at 1:22 AM, Andreas wrote: > Are there any other options? Another solution is to perform cross-validation using non-scaled C values, select the best one and scale it before refitting with the entire dataset (to take into account that the entire dataset is bigger than a train
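The refit step described in this message can be sketched as follows. This is only an illustration under an assumption that the thread itself debates: that the product of the libsvm-style C and the number of training samples is the quantity that stays comparable across dataset sizes, which is the convention behind scale_C=True. The helper name rescale_C and the numbers are hypothetical.

```python
def rescale_C(best_C, n_train_fold, n_full):
    """Carry a libsvm-style C selected on a training fold over to the full dataset.

    Assumption: C * n_samples should stay constant (the scale_C=True convention).
    Other conventions would lead to a different correction, or to none at all.
    """
    return best_C * n_train_fold / float(n_full)


# Example: 5-fold cross-validation on 1000 samples trains on 800 samples per fold.
C_full = rescale_C(best_C=1.0, n_train_fold=800, n_full=1000)
```

Under this assumption the corrected value is an extrapolation from the fold size to the full dataset, not a newly cross-validated optimum.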

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-19 Thread James Bergstra
You all raise good points about why SVCa is a bad idea; I'm also against it now :) On Mon, Mar 19, 2012 at 7:09 AM, Olivier Grisel wrote: > On 18 March 2012 at 13:31, Lars Buitinck wrote: >> On 18 March 2012 at 21:10, Alexandre Gramfort wrote: Another minor variation:

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-19 Thread Olivier Grisel
On 18 March 2012 at 13:31, Lars Buitinck wrote: > On 18 March 2012 at 21:10, Alexandre Gramfort wrote: >>> Another minor variation: make a second libsvm wrapper constructor that >>> only uses alpha, never C. >>> e.g.: svm = SVCa(1e-3) >> >> I am -1 on that. >> >> that would p

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-19 Thread Olivier Grisel
On 18 March 2012 at 09:22, Andreas wrote: > On 03/18/2012 05:07 PM, James Bergstra wrote: >> On Sat, Mar 17, 2012 at 11:55 PM, Mathieu Blondel wrote: >> The alpha specified this way could (should?) have the same name and interpretation as the l2_regularization coefficient in the

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-18 Thread Lars Buitinck
On 18 March 2012 at 21:10, Alexandre Gramfort wrote: >> Another minor variation: make a second libsvm wrapper constructor that >> only uses alpha, never C. >> e.g.: svm = SVCa(1e-3) > > I am -1 on that. > > that would probably confuse users and won't prevent them from using an

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-18 Thread Alexandre Gramfort
> Another minor variation: make a second libsvm wrapper constructor that > only uses alpha, never C. > e.g.: svm = SVCa(1e-3) I am -1 on that. that would probably confuse users and won't prevent them from using an SVC with GridSearchCV with scale_C=False. Alex

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-18 Thread James Bergstra
On Sun, Mar 18, 2012 at 12:22 PM, Andreas wrote: > On 03/18/2012 05:07 PM, James Bergstra wrote: >> On Sat, Mar 17, 2012 at 11:55 PM, Mathieu Blondel   >> wrote: >> The alpha specified this way could (should?) have the same name and interpretation as the l2_regularization coefficient in

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-18 Thread Andreas
On 03/18/2012 05:07 PM, James Bergstra wrote: > On Sat, Mar 17, 2012 at 11:55 PM, Mathieu Blondel > wrote: > >>> The alpha specified this way could (should?) have the same name and >>> interpretation as the l2_regularization coefficient in the >>> SGDClassifier. >>> >> Would you conve

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-18 Thread James Bergstra
On Sun, Mar 18, 2012 at 9:42 AM, Alexandre Gramfort wrote: >> I agree that it's a good idea to correct C for sample size when moving >> from a sub-problem to the full thing.  I just wouldn't use the word >> "optimal" to describe the new value of C that you get this way - it's >> an extrapolation,

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-18 Thread James Bergstra
On Sat, Mar 17, 2012 at 11:55 PM, Mathieu Blondel wrote: >> The alpha specified this way could (should?) have the same name and >> interpretation as the l2_regularization coefficient in the >> SGDClassifier. > > Would you convert alpha into a C internal value or would you patch > libsvm / liblinea

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-18 Thread Alexandre Gramfort
> I agree that it's a good idea to correct C for sample size when moving > from a sub-problem to the full thing.  I just wouldn't use the word > "optimal" to describe the new value of C that you get this way - it's > an extrapolation, a good guess... possibly provably better than the > un-corrected

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-17 Thread Mathieu Blondel
On Sun, Mar 18, 2012 at 4:45 AM, James Bergstra wrote: > I agree that it's a good idea to correct C for sample size when moving > from a sub-problem to the full thing.  I just wouldn't use the word > "optimal" to describe the new value of C that you get this way - it's > an extrapolation, a good

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-17 Thread James Bergstra
On Sat, Mar 17, 2012 at 1:51 PM, Alexandre Gramfort wrote: >> This statement doesn't sound true. Generally hyper-parameters >> (especially ones to do with regularization) *do* depend on training >> set size, and not in such straightforward ways.  Data is never >> perfectly I.I.D. and sometimes it

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-17 Thread Alexandre Gramfort
> This statement doesn't sound true. Generally hyper-parameters > (especially ones to do with regularization) *do* depend on training > set size, and not in such straightforward ways.  Data is never > perfectly I.I.D. and sometimes it can be far from it. My impression > was that standard practice f

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-17 Thread James Bergstra
On Sat, Mar 17, 2012 at 4:44 AM, Alexandre Gramfort wrote: > without the scale_C the libsvm/liblinear bindings are the only models > whose hyperparameters > depend on the training set size. This statement doesn't sound true. Generally hyper-parameters (especially ones to do with regularization) *

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-17 Thread Alexandre Gramfort
hi guys, the scale_C is not released yet and not setting it in the current release raises a warning. But maybe we could be even more explicit to warn users. right now C is None by default and defaults to n_samples which amounts to the C=1 with scale_C=False which is the default behavior of libsvm

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-16 Thread Andreas Mueller
On 03/17/2012 01:55 AM, Lars Buitinck wrote: > On 17 March 2012 at 01:30, Andreas wrote: >> If we change the API, I would go for alpha as the current >> "scale_C=True" but optionally provide the "C", which behaves >> like the LibSVM parameter. > You mean we'd have two regular

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-16 Thread Lars Buitinck
On 17 March 2012 at 01:30, Andreas wrote: > If we change the API, I would go for alpha as the current > "scale_C=True" but optionally provide the "C", which behaves > like the LibSVM parameter. You mean we'd have two regularization parameters? I'd find that confusing. -- Lar

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-16 Thread Andreas
On 03/17/2012 12:40 AM, Olivier Grisel wrote: > On 16 March 2012 at 15:29, Andreas Müller wrote: > >> On 03/16/2012 11:12 PM, James Bergstra wrote: >> >>> I was also recently bit by this scale_C business. It looks like the >>> decision to make scale_C=True the un-changeable default has al

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-16 Thread Olivier Grisel
On 16 March 2012 at 15:29, Andreas Müller wrote: > On 03/16/2012 11:12 PM, James Bergstra wrote: >> I was also recently bit by this scale_C business. It looks like the >> decision to make scale_C=True the un-changeable default has already >> been made, but when this is done *PLEASE* make this abund

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-16 Thread Andreas Müller
On 03/16/2012 11:12 PM, James Bergstra wrote: > I was also recently bit by this scale_C business. It looks like the > decision to make scale_C=True the un-changeable default has already > been made, but when this is done *PLEASE* make this abundantly clear > in the documentation... my understanding

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-03-16 Thread James Bergstra
I was also recently bit by this scale_C business. It looks like the decision to make scale_C=True the un-changeable default has already been made, but when this is done *PLEASE* make this abundantly clear in the documentation... my understanding was that the C in the SVM equation means the thing th

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-02-16 Thread Vlad Niculae
Nitpick: shouldn't it be "set_libsvm_stdout(True)" or "enable_libsvm_stdout() / disable_libsvm_stdout()"? Vlad On Feb 15, 2012, at 13:30, Alexandre Gramfort wrote: >>> sklearn.svm.enable_libsvm_stdout(True) > > +1 too > > A

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-02-15 Thread Alexandre Gramfort
>> sklearn.svm.enable_libsvm_stdout(True) +1 too A

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-02-15 Thread David Warde-Farley
On 2012-02-15, at 3:16 AM, Olivier Grisel wrote: > sklearn.svm.enable_libsvm_stdout(True) > > WDYT? I'm +1 for what it's worth. David

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-02-15 Thread Olivier Grisel
2012/2/15 Alexandre Gramfort : >> Fabian: any idea on how to do that? > > use : > > > libsvm.set_verbosity_wrap(1) > libsvm_sparse.set_verbosity_wrap(1) Thanks. > in svm/base.py > > maybe we could add a verbose flag to SVM estimators. Unfortunately we cannot make it a per-estimator parameter as

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-02-15 Thread Alexandre Gramfort
> Fabian: any idea on how to do that? use : libsvm.set_verbosity_wrap(1) libsvm_sparse.set_verbosity_wrap(1) in svm/base.py maybe we could add a verbose flag to SVM estimators. Alex

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-02-14 Thread Olivier Grisel
2012/2/14 David Warde-Farley : > On Tue, Feb 14, 2012 at 10:54:34PM +0100, Alexandre Gramfort wrote: >> hi Ian, >> >> yes you're right, however the dev version should be up to date: >> >> http://scikit-learn.org/dev/modules/svm.html#svc >> >> see note on scale_C parameter. > > Relatedly, on the sub

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-02-14 Thread David Warde-Farley
On Tue, Feb 14, 2012 at 10:54:34PM +0100, Alexandre Gramfort wrote: > hi Ian, > > yes you're right, however the dev version should be up to date: > > http://scikit-learn.org/dev/modules/svm.html#svc > > see note on scale_C parameter. Relatedly, on the subject of comparing SVM implementations, i

Re: [Scikit-learn-general] SVC documentation inaccuracy

2012-02-14 Thread Alexandre Gramfort
hi Ian, yes you're right, however the dev version should be up to date: http://scikit-learn.org/dev/modules/svm.html#svc see note on scale_C parameter. Alex On Tue, Feb 14, 2012 at 10:49 PM, Ian Goodfellow wrote: > This page says that SVC minimizes > sum(square(w)) + C * sum(margins) > > I be

[Scikit-learn-general] SVC documentation inaccuracy

2012-02-14 Thread Ian Goodfellow
This page says that SVC minimizes: sum(square(w)) + C * sum(margins). I believe it actually minimizes: sum(square(w)) + C * mean(margins). This may seem like a pedantic distinction, but it's very important for being able to compare sklearn to other SVM implementations. Am I correct that SVC minimizes
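To make the distinction raised in this message concrete, here is a minimal sketch in plain Python. It is not sklearn's or libsvm's code: it evaluates the two objectives as written in the message, reading "margins" as the hinge losses max(0, 1 - y * f(x)), which is an assumption. The function names and the toy data are hypothetical. The two forms coincide only when the C of the sum form equals the C of the mean form divided by n_samples.

```python
def hinge_losses(w, b, X, y):
    """Hinge loss max(0, 1 - y_i * (w . x_i + b)) for each sample."""
    losses = []
    for x_i, y_i in zip(X, y):
        score = sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b
        losses.append(max(0.0, 1.0 - y_i * score))
    return losses


def objective_sum(w, b, X, y, C):
    """sum(square(w)) + C * sum(margins), as stated on the documentation page."""
    return sum(w_j ** 2 for w_j in w) + C * sum(hinge_losses(w, b, X, y))


def objective_mean(w, b, X, y, C):
    """sum(square(w)) + C * mean(margins), as suspected in the message."""
    losses = hinge_losses(w, b, X, y)
    return sum(w_j ** 2 for w_j in w) + C * sum(losses) / len(losses)


# Toy data and an arbitrary weight vector (assumed values, for illustration only).
X = [[1.0, 2.0], [2.0, 0.5], [-1.0, -1.5], [-0.5, 0.2]]
y = [1, 1, -1, -1]
w, b = [0.3, 0.1], 0.0
n_samples = len(X)
C = 10.0

# Same value once C is divided by n_samples in the sum form...
same = abs(objective_mean(w, b, X, y, C)
           - objective_sum(w, b, X, y, C / n_samples)) < 1e-12
# ...but a different value if the same C is plugged into both forms.
differ = abs(objective_mean(w, b, X, y, C)
             - objective_sum(w, b, X, y, C)) > 1e-6
```

This is why a C value tuned with one implementation cannot be copied to another without knowing which of the two forms each one minimizes.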