[Scikit-learn-general] Using an SVM with a chi-squared kernel

2012-09-06 Thread George Jung
Hi everybody, I have been trying to figure this out but it's unclear to me. I have a set of data, where each data point is represented by a histogram. I want to apply the chi-squared kernel, shown at http://scikit-learn.org/0.11/modules/kernel_approximation.html#additive-chi-squared-kernel for my
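As a rough illustration of the kernel the question refers to, here is a minimal pure-Python sketch of the additive chi-squared kernel, k(x, y) = sum_i 2 x_i y_i / (x_i + y_i), applied to a few made-up histograms. The function names and the sample data are hypothetical, and treating a bin that is empty in both histograms as contributing zero is an assumption of this sketch. In scikit-learn, a Gram matrix of this kind could presumably be handed to sklearn.svm.SVC(kernel="precomputed"), though the truncated message does not say which route the poster took.

```python
def additive_chi2(x, y):
    """Additive chi-squared kernel between two non-negative histograms."""
    total = 0.0
    for xi, yi in zip(x, y):
        # Assumption: a bin that is empty in both histograms contributes 0
        # (this avoids the 0/0 term).
        if xi + yi > 0:
            total += 2.0 * xi * yi / (xi + yi)
    return total


def gram_matrix(A, B):
    """Kernel value for every pair (a, b) with a in A and b in B."""
    return [[additive_chi2(a, b) for b in B] for a in A]


# Hypothetical, L1-normalised histograms (one row per data point).
histograms = [
    [0.5, 0.5, 0.0],
    [0.2, 0.3, 0.5],
    [0.0, 0.0, 1.0],
]

K = gram_matrix(histograms, histograms)
for row in K:
    print(row)
```

For L1-normalised histograms the diagonal entries equal 1, and two histograms with no overlapping bins get a kernel value of 0.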

Re: [Scikit-learn-general] Multi-class sparse data

2012-09-06 Thread Olivier Grisel
2012/9/6 Ark : > >> And how large in bytes? It seems that it should be small enough to be >> able to use sklearn.linear_model.LogisticRegression despite the data >> copy in memory. >> > > Right now it's not even 100M, but it will extend to 1G at least. Alright, have you tried sklearn.linear_model.L

Re: [Scikit-learn-general] Multi-class sparse data

2012-09-06 Thread Ark
> And how large in bytes? It seems that it should be small enough to be > able to use sklearn.linear_model.LogisticRegression despite the data > copy in memory. > Right now it's not even 100M, but it will extend to 1G at least.

Re: [Scikit-learn-general] ANN: scikit-learn 0.12

2012-09-06 Thread Yaroslav Halchenko
ok https://github.com/scikit-learn/scikit-learn/issues/1121 On Thu, 06 Sep 2012, Andreas Müller wrote: > Hi Yaroslav. > Thanks for investigating. > This is a bit unexpected as it seems to work on jenkins. > Maybe this is 32bit? > We could also try to make the test more robust. > Nelle has been

Re: [Scikit-learn-general] ANN: scikit-learn 0.12

2012-09-06 Thread Nelle Varoquaux
Hello ! We could also try to make the test more robust. > Nelle has been looking into the algorithms a bit lately, > maybe she has some ideas. > No idea: I haven't tackled this one yet. > > Otherwise just skip the test for the moment and we can open > an issue and try to fix it for the future.

Re: [Scikit-learn-general] ANN: scikit-learn 0.12

2012-09-06 Thread Andreas Müller
Hi Yaroslav. Thanks for investigating. This is a bit unexpected as it seems to work on jenkins. Maybe this is 32bit? We could also try to make the test more robust. Nelle has been looking into the algorithms a bit lately, maybe she has some ideas. Otherwise just skip the test for the moment and w

Re: [Scikit-learn-general] ANN: scikit-learn 0.12

2012-09-06 Thread Yaroslav Halchenko
I am consistently getting == ERROR: sklearn.cluster.tests.test_spectral.test_affinities -- Traceback (most recent call last): File "/usr/lib/pymodules/python2.6

Re: [Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Vlad Niculae
On Sep 6, 2012, at 18:08 , Mathieu Blondel wrote: > Hello, > > The Perceptron can be seen as an SGD algorithm optimizing the loss \sum_i > max{t - y_i w^T x_i, 0} where t=0. On the other hand, online SVM optimizes > the same loss but with t=1 (the advantage of setting t=1 rather than t=0 is >

Re: [Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Mathieu Blondel
Hello, The Perceptron can be seen as an SGD algorithm optimizing the loss \sum_i max{t - y_i w^T x_i, 0} where t=0. On the other hand, online SVM optimizes the same loss but with t=1 (the advantage of setting t=1 rather than t=0 is that it then upper-bounds the zero-one loss). You can check that o
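The loss in this message can be written out as a small Python sketch. It evaluates max{t - y w^T x, 0} for a single example with t=0 (perceptron) and t=1 (hinge loss of the online SVM) and compares both with the zero-one loss. The function names, the weight vector and the sample point are hypothetical and chosen only for illustration.

```python
def threshold_hinge(w, x, y, t):
    """Per-example loss max(t - y * <w, x>, 0); y must be +1 or -1."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    return max(t - margin, 0.0)


def zero_one(w, x, y):
    """1 if the example is misclassified (or lies on the boundary), else 0."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 if margin <= 0 else 0.0


# Hypothetical weight vector and a point that it misclassifies when y = -1.
w = [1.0, -1.0]
x = [0.5, 0.0]
y = -1

perceptron_loss = threshold_hinge(w, x, y, t=0.0)  # perceptron: t = 0
hinge_loss = threshold_hinge(w, x, y, t=1.0)       # online SVM: t = 1

print(perceptron_loss, hinge_loss, zero_one(w, x, y))
```

In this example the t=1 loss is at least as large as the zero-one loss, while the t=0 loss falls below it, which is the upper-bound property the message mentions.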

Re: [Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Jaidev Deshpande
On Thu, Sep 6, 2012 at 6:08 PM, Peter Prettenhofer wrote: > you can find the learning procedure in > https://github.com/jaidevd/scikit-learn/blob/master/sklearn/linear_model/sgd_fast.pyx#L377 > . > > This is a cython [1] extension module that gets translated into C code > (see sgd_fast.c) > > [1]

Re: [Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Jaidev Deshpande
On Thu, Sep 6, 2012 at 5:33 PM, Lars Buitinck wrote: > 2012/9/6 Jaidev Deshpande : >> I've been playing around with the Perceptron class in scikit-learn. I >> have a theoretical understanding of the perceptron algorithm. In >> sklearn it has been subclassed from the SGDClassifier class, very >> di

Re: [Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Peter Prettenhofer
you can find the learning procedure in https://github.com/jaidevd/scikit-learn/blob/master/sklearn/linear_model/sgd_fast.pyx#L377 . This is a cython [1] extension module that gets translated into C code (see sgd_fast.c) [1] http://cython.org/ 2012/9/6 Jaidev Deshpande : > On Thu, Sep 6, 2012 at

Re: [Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Jaidev Deshpande
On Thu, Sep 6, 2012 at 5:56 PM, Jaidev Deshpande wrote: > On Thu, Sep 6, 2012 at 5:50 PM, Vlad Niculae wrote: >> I think that the "tweaks" our implementation has are vital for real world >> use. However the perceptron is "textbook" and it would be nice to have a >> simple way to reproduce the s

Re: [Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Andreas Müller
Lars pointed out above the exact place where the update is applied. Imbalanced datasets mean one class is much more frequent than the other. - Original Message - From: "Jaidev Deshpande" To: [email protected] Sent: Thursday, 6 September 2012 13:26:09 Subject

Re: [Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Lars Buitinck
2012/9/6 Vlad Niculae : > I think that the "tweaks" our implementation has are vital for real world > use. However the perceptron is "textbook" and it would be nice to have a > simple way to reproduce the simple version. Is it just a question of init > parameters? Judging from the code, it seem

Re: [Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Jaidev Deshpande
On Thu, Sep 6, 2012 at 5:50 PM, Vlad Niculae wrote: > I think that the "tweaks" our implementation has are vital for real world > use. However the perceptron is "textbook" and it would be nice to have a > simple way to reproduce the simple version. Is it just a question of init > parameters? P

Re: [Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Vlad Niculae
I think that the "tweaks" our implementation has are vital for real world use. However the perceptron is "textbook" and it would be nice to have a simple way to reproduce the simple version. Is it just a question of init parameters? -- Vlad N. http://vene.ro

Re: [Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Lars Buitinck
2012/9/6 Jaidev Deshpande : > I've been playing around with the Perceptron class in scikit-learn. I > have a theoretical understanding of the perceptron algorithm. In > sklearn it has been subclassed from the SGDClassifier class, very > different from how I would have expected the perceptron to be

Re: [Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Peter Prettenhofer
2012/9/6 Andreas Müller : > Hi Jaidev. > I think it is ok to discuss on the list. I agree - if we come up with improvements we can open an issue and discuss code modifications in more detail. -- Peter Prettenhofer

Re: [Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Andreas Müller
Hi Jaidev. I think it is ok to discuss on the list. I didn't implement the Perceptron but I think it is basically "as simple" as the one on the Wikipedia page - efficiency and dealing with sparse / dense data make the code a bit longer, though ;) It is a stochastic gradient descent procedure (mean
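For comparison, the "textbook" perceptron discussed in this thread can be sketched in a few lines of plain Python. This is not scikit-learn's implementation (which lives in the Cython SGD code mentioned elsewhere in the thread); the function names, the learning rate, the epoch limit and the toy AND-style dataset are all assumptions made for illustration.

```python
def train_perceptron(X, y, n_epochs=100, lr=1.0):
    """Mistake-driven perceptron; labels in y must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(n_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            # Update only when the example is misclassified or on the boundary.
            if yi * score <= 0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: converged
            break
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the sign of the linear score."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score > 0 else -1


# Hypothetical, linearly separable toy data (logical AND with +/-1 labels).
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [-1, -1, -1, 1]

w, b = train_perceptron(X, y)
print([predict(w, b, xi) for xi in X])
```

Each update is one stochastic gradient step on the t=0 loss for a single misclassified example, which is the connection to SGD made in the message.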

[Scikit-learn-general] Conceptual questions about linear_model.perceptron

2012-09-06 Thread Jaidev Deshpande
Hello, I've been playing around with the Perceptron class in scikit-learn. I have a theoretical understanding of the perceptron algorithm. In sklearn it has been subclassed from the SGDClassifier class, very different from how I would have expected the perceptron to be implemented (I'd have though