On 04/01/2012 09:27 PM, Alexandre Gramfort wrote:
> Afaik, it was with an l1-penalized logistic. In my experience,
> l2-penalized models are less sensitive to the choice of the penalty
> parameter, and the hinge loss (aka SVM) is less sensitive than the
> logistic loss.
indeed.
I think you need a dataset with n_features >> n_samples with many […]
On 1 April 2012 at 16:38, Andreas wrote:
> On 04/01/2012 04:34 PM, Gael Varoquaux wrote:
>> On Sun, Apr 01, 2012 at 04:23:36PM +0200, Andreas wrote:
>>
>>> @Alex, could you maybe give the setting again where you had
>>> issues doing grid search without scale_C?
>>>
>> Afaik, it was with an l1-penalized logistic. […]
On 04/01/2012 04:34 PM, Gael Varoquaux wrote:
> On Sun, Apr 01, 2012 at 04:23:36PM +0200, Andreas wrote:
>
>> @Alex, could you maybe give the setting again where you had
>> issues doing grid search without scale_C?
>>
> Afaik, it was with an l1-penalized logistic. In my experience,
> l2-penalized models are less sensitive to the choice of the penalty
> parameter […]
On Sun, Apr 01, 2012 at 04:23:36PM +0200, Andreas wrote:
> @Alex, could you maybe give the setting again where you had
> issues doing grid search without scale_C?
Afaik, it was with an l1-penalized logistic. In my experience,
l2-penalized models are less sensitive to the choice of the penalty
parameter, and the hinge loss (aka SVM) is less sensitive than the
logistic loss.
> Something that bothers me though, is that with libsvm, C=1 or C=10
> seems to be a reasonable default that works well both for datasets of
> size n_samples=100 and n_samples=1 (by playing with the range of
> datasets available in the scikit). On the other hand alpha would have
> to be grid searched. […]
On Thu, Mar 22, 2012 at 2:11 AM, Olivier Grisel wrote:
> Something that bothers me though, is that with libsvm, C=1 or C=10
> seems to be a reasonable default that works well both for datasets of
> size n_samples=100 and n_samples=1 (by playing with the range of
> datasets available in the scikit). […]
On Thu, Mar 22, 2012 at 08:42:03AM +0100, Andreas wrote:
> > It is also my gut feeling that dividing the regularization term by
> > n_samples makes the optimal value *more* dependent on the dataset size
> > rather than the opposite. That might be the reason why C is not scaled
> > in the SVM literature. […]
hi,
1/ I agree with Gael. When writing the maths you don't want to carry
n_samples around on every line, and for sparse regression that produces
papers with no n_samples scaling but implementations that do scale
(e.g. R packages like glmnet).
2/ What you say, Olivier, is interesting […]
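To make 1/ concrete, here is the mismatch written out (a sketch in the
thread's own notation; glmnet's documented gaussian objective does carry
the 1/n factor, while lasso papers usually do not):

    papers:  min_w  (1/2) * ||y - Xw||^2 + alpha * ||w||_1
    glmnet:  min_w  (1/(2*n_samples)) * ||y - Xw||^2 + lambda * ||w||_1

so a lambda tuned with glmnet corresponds to alpha = n_samples * lambda
in the paper formulation.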
On 03/22/2012 02:11 AM, Olivier Grisel wrote:
> On 22 March 2012 at 01:09, David Warde-Farley wrote:
>
>>> That said, I agree with James that the docs should be much more
>>> explicit about what is going on, and how what we have differs from
>>> libsvm.
>>>
>> I think that renaming sklearn's scaled version of "C" is probably
>> a start. […]
On Wed, Mar 21, 2012 at 08:09:26PM -0400, David Warde-Farley wrote:
> I think it's less about disagreeing with libsvm than disagreeing with the
> notation of every textbook presentation I know of. I agree that libsvm is no
> golden calf.
But it is also the case for the lasso: the loss term is the […]
On Thu, Mar 22, 2012 at 9:09 AM, David Warde-Farley wrote:
> In particular, doing 1 vs rest for logistic regression seems like
> an odd choice when there is a perfectly good multiclass generalization of
> logistic regression. Mathieu clarified to me last night how liblinear is
> calculating […]
On Thu, Mar 22, 2012 at 3:35 AM, James Bergstra wrote:
> Also, isn't the feature normalization supposed to be done on a
> fold-by-fold basis? If you're doing that, you have a different kernel
> matrix in every fold anyway.
Indeed, if you really want to be clean, you would need to do that,
but […]
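To illustrate the fold-by-fold normalization James describes, here is a
minimal sketch using a Pipeline so the scaler is re-fit inside each CV
split (modern scikit-learn names; the API at the time of this thread
differed):

    # Per-fold feature scaling: the scaler only ever sees the training
    # fold, so each fold effectively has its own kernel matrix.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=200, n_features=20, random_state=0)
    clf = make_pipeline(StandardScaler(), SVC(C=1.0))
    print(cross_val_score(clf, X, y, cv=5))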
On 22 March 2012 at 01:09, David Warde-Farley wrote:
>
>> That said, I agree with James that the docs should be much more
>> explicit about what is going on, and how what we have differs from
>> libsvm.
>
> I think that renaming sklearn's scaled version of "C" is probably a start.
> Using the name […]
On 2012-03-21, at 7:25 PM, Gael Varoquaux wrote:
> I'd like to stress that I don't think that following libsvm is much of a
> goal per se. I understand that it makes the life of someone like James
> easier, because he knows libsvm well and can relate to it.
I think it's less about disagreeing with libsvm than disagreeing with
the notation of every textbook presentation I know of. […]
I've stayed quiet in this discussion because I was busy elsewhere. The
good thing is that it has allowed me to hear the points of view of
different people. Here is mine.
First, the decision we took can be undone. It is not final, and the way
that it should be taken is to make our users' lives easiest […]
On Wed, Mar 21, 2012 at 6:46 AM, Olivier Grisel wrote:
> On 21 March 2012 at 11:14, Mathieu Blondel wrote:
>> On Mon, Mar 19, 2012 at 1:22 AM, Andreas wrote:
>>
>>> Are there any other options?
>>
>> Another solution is to perform cross-validation using non-scaled C
>> values, select the best one and scale it before refitting with the
>> entire dataset […]
On 21 March 2012 at 11:14, Mathieu Blondel wrote:
> On Mon, Mar 19, 2012 at 1:22 AM, Andreas wrote:
>
>> Are there any other options?
>
> Another solution is to perform cross-validation using non-scaled C
> values, select the best one and scale it before refitting with the
> entire dataset (to take into account that the entire dataset is
> bigger […]
On Mon, Mar 19, 2012 at 1:22 AM, Andreas wrote:
> Are there any other options?
Another solution is to perform cross-validation using non-scaled C
values, select the best one and scale it before refitting with the
entire dataset (to take into account that the entire dataset is bigger
than a training fold). […]
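A hedged sketch of Mathieu's suggestion (the rescaling rule assumes the
effective trade-off is C * n_samples; names like C_refit are
illustrative, not a sklearn API, and modern module paths are used):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import LinearSVC

    X, y = make_classification(n_samples=500, random_state=0)
    n_full = X.shape[0]
    cv = 5
    n_fold = n_full * (cv - 1) // cv  # samples seen by each CV fit

    # Grid-search the unscaled C on the CV folds.
    search = GridSearchCV(LinearSVC(), {"C": np.logspace(-3, 3, 7)}, cv=cv)
    search.fit(X, y)
    best_C = search.best_params_["C"]

    # Keep C * n_samples constant when refitting on the full dataset.
    C_refit = best_C * n_fold / n_full
    final_model = LinearSVC(C=C_refit).fit(X, y)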
You all raise good points about why SVCa is a bad idea; I'm also
against it now :)
On Mon, Mar 19, 2012 at 7:09 AM, Olivier Grisel wrote:
> On 18 March 2012 at 13:31, Lars Buitinck wrote:
>> On 18 March 2012 at 21:10, Alexandre Gramfort wrote:
Another minor variation: make a second libsvm wrapper constructor that
only uses alpha, never C.
e.g.: svm = SVCa(1e-3) […]
On 18 March 2012 at 13:31, Lars Buitinck wrote:
> On 18 March 2012 at 21:10, Alexandre Gramfort wrote:
>>> Another minor variation: make a second libsvm wrapper constructor that
>>> only uses alpha, never C.
>>> e.g.: svm = SVCa(1e-3)
>>
>> I am -1 on that.
>>
>> that would probably confuse users […]
On 18 March 2012 at 09:22, Andreas wrote:
> On 03/18/2012 05:07 PM, James Bergstra wrote:
>> On Sat, Mar 17, 2012 at 11:55 PM, Mathieu Blondel wrote:
>>
The alpha specified this way could (should?) have the same name and
interpretation as the l2_regularization coefficient in the
SGDClassifier. […]
On 18 March 2012 at 21:10, Alexandre Gramfort wrote:
>> Another minor variation: make a second libsvm wrapper constructor that
>> only uses alpha, never C.
>> e.g.: svm = SVCa(1e-3)
>
> I am -1 on that.
>
> that would probably confuse users and won't prevent them from using an
> SVC with GridSearchCV with scale_C=False. […]
> Another minor variation: make a second libsvm wrapper constructor that
> only uses alpha, never C.
> e.g.: svm = SVCa(1e-3)
I am -1 on that.
that would probably confuse users and won't prevent them from using an
SVC with GridSearchCV with scale_C=False.
Alex
On Sun, Mar 18, 2012 at 12:22 PM, Andreas wrote:
> On 03/18/2012 05:07 PM, James Bergstra wrote:
>> On Sat, Mar 17, 2012 at 11:55 PM, Mathieu Blondel wrote:
>>
The alpha specified this way could (should?) have the same name and
interpretation as the l2_regularization coefficient in the
SGDClassifier. […]
On 03/18/2012 05:07 PM, James Bergstra wrote:
> On Sat, Mar 17, 2012 at 11:55 PM, Mathieu Blondel wrote:
>
>>> The alpha specified this way could (should?) have the same name and
>>> interpretation as the l2_regularization coefficient in the
>>> SGDClassifier.
>>>
>> Would you convert alpha into a C internal value or would you patch
>> libsvm / liblinear […]
On Sun, Mar 18, 2012 at 9:42 AM, Alexandre Gramfort wrote:
>> I agree that it's a good idea to correct C for sample size when moving
>> from a sub-problem to the full thing. I just wouldn't use the word
>> "optimal" to describe the new value of C that you get this way - it's
>> an extrapolation, a good guess... […]
On Sat, Mar 17, 2012 at 11:55 PM, Mathieu Blondel wrote:
>> The alpha specified this way could (should?) have the same name and
>> interpretation as the l2_regularization coefficient in the
>> SGDClassifier.
>
> Would you convert alpha into a C internal value or would you patch
> libsvm / liblinear […]
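As a concrete reading of Mathieu's question, a sketch of the conversion
under the assumption that the intended correspondence is
alpha = 1 / (C * n_samples) (the helper names are mine, not sklearn API):

    def alpha_to_C(alpha, n_samples):
        # SGDClassifier-style alpha -> libsvm-style C
        return 1.0 / (alpha * n_samples)

    def C_to_alpha(C, n_samples):
        # libsvm-style C -> SGDClassifier-style alpha
        return 1.0 / (C * n_samples)

    print(alpha_to_C(1e-4, 10000))  # -> 1.0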
> I agree that it's a good idea to correct C for sample size when moving
> from a sub-problem to the full thing. I just wouldn't use the word
> "optimal" to describe the new value of C that you get this way - it's
> an extrapolation, a good guess... possibly provably better than the
> un-corrected […]
On Sun, Mar 18, 2012 at 4:45 AM, James Bergstra wrote:
> I agree that it's a good idea to correct C for sample size when moving
> from a sub-problem to the full thing. I just wouldn't use the word
> "optimal" to describe the new value of C that you get this way - it's
> an extrapolation, a good guess... […]
On Sat, Mar 17, 2012 at 1:51 PM, Alexandre Gramfort wrote:
>> This statement doesn't sound true. Generally hyper-parameters
>> (especially ones to do with regularization) *do* depend on training
>> set size, and not in such straightforward ways. Data is never
>> perfectly I.I.D. and sometimes it can be far from it. […]
> This statement doesn't sound true. Generally hyper-parameters
> (especially ones to do with regularization) *do* depend on training
> set size, and not in such straightforward ways. Data is never
> perfectly I.I.D. and sometimes it can be far from it. My impression
> was that standard practice […]
On Sat, Mar 17, 2012 at 4:44 AM, Alexandre Gramfort wrote:
> without the scale_C the libsvm/liblinear bindings are the only models
> whose hyperparameters depend on the training set size.
This statement doesn't sound true. Generally hyper-parameters
(especially ones to do with regularization) *do* depend on training
set size, and not in such straightforward ways. […]
hi guys,
the scale_C is not released yet, and not setting it in the current
release raises a warning. But maybe we could be even more explicit to
warn users.
right now C is None by default and defaults to n_samples, which amounts
to C=1 with scale_C=False, which is the default behavior of libsvm.
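A quick sketch of that default equivalence (scale_C and the C=None
default are the historical pre-release behavior discussed here, not the
modern API):

    # Scaled objective: (1/2)||w||^2 + (C / n_samples) * sum(losses).
    # C=None -> n_samples cancels the division, recovering libsvm's C=1.
    n_samples = 500
    C = n_samples                  # historical default: C=None -> n_samples
    effective_libsvm_C = C / n_samples
    print(effective_libsvm_C)      # -> 1.0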
On 03/17/2012 01:55 AM, Lars Buitinck wrote:
> On 17 March 2012 at 01:30, Andreas wrote:
>> If we change the API, I would go for alpha as the current
>> "scale_C=True" but optionally provide the "C", which behaves
>> like the LibSVM parameter.
> You mean we'd have two regularization parameters? […]
On 17 March 2012 at 01:30, Andreas wrote:
> If we change the API, I would go for alpha as the current
> "scale_C=True" but optionally provide the "C", which behaves
> like the LibSVM parameter.
You mean we'd have two regularization parameters? I'd find that confusing.
--
Lars
On 03/17/2012 12:40 AM, Olivier Grisel wrote:
> On 16 March 2012 at 15:29, Andreas Müller wrote:
>
>> On 03/16/2012 11:12 PM, James Bergstra wrote:
>>
>>> I was also recently bit by this scale_C business. It looks like the
>>> decision to make scale_C=True the un-changeable default has already
>>> been made […]
On 16 March 2012 at 15:29, Andreas Müller wrote:
> On 03/16/2012 11:12 PM, James Bergstra wrote:
>> I was also recently bit by this scale_C business. It looks like the
>> decision to make scale_C=True the un-changeable default has already
>> been made, but when this is done *PLEASE* make this abundantly
>> clear […]
On 03/16/2012 11:12 PM, James Bergstra wrote:
> I was also recently bit by this scale_C business. It looks like the
> decision to make scale_C=True the un-changeable default has already
> been made, but when this is done *PLEASE* make this abundantly clear
> in the documentation... my understanding was that the C in the SVM […]
I was also recently bit by this scale_C business. It looks like the
decision to make scale_C=True the un-changeable default has already
been made, but when this is done *PLEASE* make this abundantly clear
in the documentation... my understanding was that the C in the SVM
equation means the thing […]
Nitpick: shouldn't it be "set_libsvm_stdout(True)" or "enable_libsvm_stdout() /
disable_libsvm_stdout()"?
Vlad
On Feb 15, 2012, at 13:30, Alexandre Gramfort wrote:
>>> sklearn.svm.enable_libsvm_stdout(True)
>
> +1 too
>
> A
>> sklearn.svm.enable_libsvm_stdout(True)
+1 too
A
On 2012-02-15, at 3:16 AM, Olivier Grisel wrote:
> sklearn.svm.enable_libsvm_stdout(True)
>
> WDYT?
I'm +1 for what it's worth.
David
2012/2/15 Alexandre Gramfort:
>> Fabian: any idea on how to do that?
>
> use:
>
> libsvm.set_verbosity_wrap(1)
> libsvm_sparse.set_verbosity_wrap(1)
Thanks.
> in svm/base.py
>
> maybe we could add a verbose flag to SVM estimators.
Unfortunately we cannot make it a per-estimator parameter as […]
> Fabian: any idea on how to do that?
use:
libsvm.set_verbosity_wrap(1)
libsvm_sparse.set_verbosity_wrap(1)
in svm/base.py
maybe we could add a verbose flag to SVM estimators.
Alex
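For context, a sketch of how those calls would be used from user code
(module paths as at the time of this thread; later releases made these
wrappers private and added per-estimator verbose flags):

    # Flip libsvm's global verbosity switch in the Cython wrappers;
    # subsequent SVC/SVR fits print the optimizer's output to stdout.
    from sklearn.svm import libsvm, libsvm_sparse

    libsvm.set_verbosity_wrap(1)         # dense wrapper
    libsvm_sparse.set_verbosity_wrap(1)  # sparse wrapper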
2012/2/14 David Warde-Farley:
> On Tue, Feb 14, 2012 at 10:54:34PM +0100, Alexandre Gramfort wrote:
>> hi Ian,
>>
>> yes you're right, however the dev version should be up to date:
>>
>> http://scikit-learn.org/dev/modules/svm.html#svc
>>
>> see the note on the scale_C parameter.
>
> Relatedly, on the subject of comparing SVM implementations […]
On Tue, Feb 14, 2012 at 10:54:34PM +0100, Alexandre Gramfort wrote:
> hi Ian,
>
> yes you're right, however the dev version should be up to date:
>
> http://scikit-learn.org/dev/modules/svm.html#svc
>
> see the note on the scale_C parameter.
Relatedly, on the subject of comparing SVM implementations, […]
hi Ian,
yes you're right, however the dev version should be up to date:
http://scikit-learn.org/dev/modules/svm.html#svc
see the note on the scale_C parameter.
Alex
On Tue, Feb 14, 2012 at 10:49 PM, Ian Goodfellow wrote:
> This page says that SVC minimizes
> sum(square(w)) + C * sum(margins)
>
> I believe it actually minimizes […]
This page says that SVC minimizes
sum(square(w)) + C * sum(margins)
I believe it actually minimizes
sum(square(w)) + C * mean(margins)
This may seem like a pedantic distinction, but it's very important for
being able to compare sklearn to other svm implementations.
Am I correct that SVC minimizes the latter?
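For readers skimming the thread, the two candidate objectives side by
side, in Ian's notation with n = n_samples (my transcription):

    sum(square(w)) + C * sum(margins)     (libsvm convention)
    sum(square(w)) + C * mean(margins)    (= ... + (C/n) * sum(margins))

so a C used with the mean convention corresponds to a libsvm C of C/n,
which is exactly the conversion the scale_C discussion above is about.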