2011/12/6 Andreas Mueller :
> On 12/06/2011 04:55 AM, Gael Varoquaux wrote:
>> On Mon, Dec 05, 2011 at 10:54:42PM +0100, Olivier Grisel wrote:
>>> - libsvm uses SMO (a dual solver) and supports non-linear kernels and
>>> has complexity ~ n_samples^3 hence cannot scale to large n_samples
>>> (e.g. m
On 12/06/2011 04:55 AM, Gael Varoquaux wrote:
> On Mon, Dec 05, 2011 at 10:54:42PM +0100, Olivier Grisel wrote:
>> - libsvm uses SMO (a dual solver) and supports non-linear kernels and
>> has complexity ~ n_samples^3 hence cannot scale to large n_samples
>> (e.g. more than 50k).
>> - liblinear uses
On Mon, Dec 05, 2011 at 10:54:42PM +0100, Olivier Grisel wrote:
> - libsvm uses SMO (a dual solver) and supports non-linear kernels and
> has complexity ~ n_samples^3 hence cannot scale to large n_samples
> (e.g. more than 50k).
> - liblinear uses some kind of fancy coordinate descent (primal or du
2011/12/5 Alexandre Gramfort :
> look at
>
> sklearn.multiclass
Indeed, these tools allow the user to build a meta-learner with any
multiclass logic on top of a binary classifier implementation (hence
both LinearSVC and SVC can be used as the underlying binary
classifier).
htt
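A minimal sketch of the sklearn.multiclass approach described above, assuming a recent scikit-learn release. The iris dataset, the RBF kernel, and the max_iter value are illustrative choices, not something taken from the thread.

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC, LinearSVC

# Small bundled 3-class dataset, used here only for illustration.
X, y = load_iris(return_X_y=True)

# Wrap a binary classifier so that one model is trained per class
# (one-vs-rest), regardless of the multiclass scheme the wrapped
# estimator would use on its own.
ovr_kernel = OneVsRestClassifier(SVC(kernel="rbf")).fit(X, y)
ovr_linear = OneVsRestClassifier(LinearSVC(max_iter=10000)).fit(X, y)

# One fitted binary estimator per class in each wrapper.
print(len(ovr_kernel.estimators_), len(ovr_linear.estimators_))
```

The same wrapper works with either SVC or LinearSVC because it relies only on the estimator's fit and decision-function interface.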
2011/12/5 Ian Goodfellow :
>
> ok, I was using LinearSVC, so I guess I am still not using the dense
> implementation.
>
> Is there a way to use one-against-rest rather than one-against-many
> classification with the SVC class?
What is one-against-many? SVC multiclass support comes directly from
th
look at
sklearn.multiclass
Alex
On Mon, Dec 5, 2011 at 10:37 PM, Ian Goodfellow
wrote:
> On Mon, Dec 5, 2011 at 4:24 PM, Olivier Grisel
> wrote:
>> 2011/12/5 Ian Goodfellow :
>>> On Fri, Dec 2, 2011 at 3:36 AM, Olivier Grisel
>>> wrote:
2011/12/2 Ian Goodfellow :
> On Fri, Oct 7, 2
On Mon, Dec 5, 2011 at 4:24 PM, Olivier Grisel wrote:
> 2011/12/5 Ian Goodfellow :
>> On Fri, Dec 2, 2011 at 3:36 AM, Olivier Grisel
>> wrote:
>>> 2011/12/2 Ian Goodfellow :
On Fri, Oct 7, 2011 at 5:14 AM, Olivier Grisel
wrote:
> 2011/10/7 Ian Goodfellow :
>> Thanks. Yes it d
2011/12/5 Ian Goodfellow :
> On Fri, Dec 2, 2011 at 3:36 AM, Olivier Grisel
> wrote:
>> 2011/12/2 Ian Goodfellow :
>>> On Fri, Oct 7, 2011 at 5:14 AM, Olivier Grisel
>>> wrote:
2011/10/7 Ian Goodfellow :
> Thanks. Yes it does appear that liblinear uses only a 64 bit dense format,
>
hello ian,
can you show a snippet of the code you use to train your svm?
and give us the dimensions of your problem?
Alex
On Mon, Dec 5, 2011 at 9:51 PM, Ian Goodfellow wrote:
> On Fri, Dec 2, 2011 at 3:36 AM, Olivier Grisel
> wrote:
>> 2011/12/2 Ian Goodfellow :
>>> On Fri, Oct 7, 2011 at 5:
On Fri, Dec 2, 2011 at 3:36 AM, Olivier Grisel wrote:
> 2011/12/2 Ian Goodfellow :
>> On Fri, Oct 7, 2011 at 5:14 AM, Olivier Grisel
>> wrote:
>>> 2011/10/7 Ian Goodfellow :
Thanks. Yes it does appear that liblinear uses only a 64 bit dense format,
so this memory usage is normal/caused
2011/12/2 Ian Goodfellow :
> On Fri, Oct 7, 2011 at 5:14 AM, Olivier Grisel
> wrote:
>> 2011/10/7 Ian Goodfellow :
>>> Thanks. Yes it does appear that liblinear uses only a 64 bit dense format,
>>> so this memory usage is normal/caused by the implementation of liblinear.
>>>
>>> You may want to u
On Fri, Oct 7, 2011 at 5:14 AM, Olivier Grisel wrote:
> 2011/10/7 Ian Goodfellow :
>> Thanks. Yes it does appear that liblinear uses only a 64 bit dense format,
>> so this memory usage is normal/caused by the implementation of liblinear.
>>
>> You may want to update the documentation hosted at thi
On Fri, Oct 7, 2011 at 5:07 AM, Olivier Grisel wrote:
> 2011/10/7 Ian Goodfellow :
>> I understand that LinearSVC is implemented using liblinear, which I thought
>> should work well with large datasets. However, when I pass LinearSVC.fit a
>> design matrix of size 40,000 x 14,400 (in float32 forma
2011/10/7 Ian Goodfellow :
> Thanks. Yes it does appear that liblinear uses only a 64 bit dense format,
> so this memory usage is normal/caused by the implementation of liblinear.
>
> You may want to update the documentation hosted at this site:
> http://scikit-learn.sourceforge.net/modules/svm.htm
2011/10/7 Mathieu Blondel :
> By the way, I suspect that the predict method is also sub-optimal
> because, since the support vectors and the coefficients are stored in
> numpy arrays or scipy matrices, predict has to make the conversion to
> liblinear's model structure at every call. This is the p
By the way, I suspect that the predict method is also sub-optimal
because, since the support vectors and the coefficients are stored in
numpy arrays or scipy matrices, predict has to make the conversion to
liblinear's model structure at every call. This is the price that we
currently pay for pickl
For dense data, I recommend SGDClassifier, or SVC if you want to use a kernel.
I'm thinking that in the mid-term we may want to ship our own Cython
implementation of liblinear (from what I saw, it didn't seem that hard
to implement).
Mathieu
---
2011/10/7 Gael Varoquaux :
> On Fri, Oct 07, 2011 at 08:44:53AM +, [email protected] wrote:
>> I just wanted to say that we have similar problems in our lab
>> which we "solved" by buying more RAM.
>> It would be great to have single precision implementations
>> of both SGDClassifier and Linear
2011/10/7 :
> We wrestled with exactly this issue for decision trees, so it's clear now that
> a general solution would be very beneficial to scikit-learn.
For liblinear it might be a bit complicated since both the C++ code
and the cython wrapper would have to be rewritten to generate the two
ver
On Fri, Oct 07, 2011 at 08:44:53AM +, [email protected] wrote:
> I just wanted to say that we have similar problems in our lab
> which we "solved" by buying more RAM.
> It would be great to have single precision implementations
> of both SGDClassifier and LinearSVC in scikits.learn.
SGDClassif
: [Scikit-learn-general] Memory consumption of LinearSVC.fit
> However I am pretty sure that it will force a copy of your data to be
> double precision (64bit).
As you suggested, this is the case for both LinearSVC and
SGDClassifier.
> If you install cython you can patch the
> source code to force single precision instead.
>
> We might want to add support
Thanks. Yes it does appear that liblinear uses only a 64 bit dense format,
so this memory usage is normal/caused by the implementation of liblinear.
You may want to update the documentation hosted at this site:
http://scikit-learn.sourceforge.net/modules/svm.html#
It has a section on "avoiding da
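To illustrate the dtype conversion discussed in this thread, here is a small NumPy-only sketch. It does not call liblinear or scikit-learn; it only shows that converting a float32 array to float64 allocates a new buffer twice the size, whereas an array that is already float64 is not copied by np.asarray. The array shape is arbitrary.

```python
import numpy as np

# A small single-precision design matrix (shape chosen arbitrarily).
X32 = np.ones((1000, 50), dtype=np.float32)

# Requesting double precision forces a new, larger buffer.
X64 = np.asarray(X32, dtype=np.float64)
print(X32.nbytes, X64.nbytes)          # the second is twice the first
print(np.shares_memory(X32, X64))      # False: the data was copied

# If the data is already float64, no further copy is made here.
X64_again = np.asarray(X64, dtype=np.float64)
print(np.shares_memory(X64, X64_again))  # True: same underlying buffer
```

Whether a given estimator makes additional internal copies beyond this conversion depends on its implementation and version.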
If your data is really dense, then you should try to use the
SGDClassifier model instead of LinearSVC. It has an implementation for
dense numpy arrays and hence will use about half as much memory as a
sparse representation.
However I am pretty sure that it will force a copy of your data to be
double precis
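A minimal sketch of the suggestion above, assuming a recent scikit-learn release: SGDClassifier with the hinge loss fits a linear SVM-style model directly on a dense NumPy array. The synthetic data and hyperparameters are illustrative assumptions, and whether the input is upcast to float64 internally depends on the installed version.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Synthetic dense data in single precision (illustrative only).
rng = np.random.RandomState(0)
X = rng.randn(200, 20).astype(np.float32)
w_true = rng.randn(20)
y = (X @ w_true > 0).astype(int)

# Hinge loss with an L2 penalty gives a linear SVM objective,
# optimized by stochastic gradient descent on the dense array.
clf = SGDClassifier(loss="hinge", penalty="l2", random_state=0).fit(X, y)
train_accuracy = clf.score(X, y)
print(clf.coef_.shape, train_accuracy)
```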
2011/10/7 Olivier Grisel :
>
> It would fix your issue though...
I meant: It would *not* fix your memory issue though...
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
2011/10/7 Ian Goodfellow :
> I understand that LinearSVC is implemented using liblinear, which I thought
> should work well with large datasets. However, when I pass LinearSVC.fit a
> design matrix of size 40,000 x 14,400 (in float32 format, so 2.3 gigabytes)
>
> it ends up using at least 8 additio
I don't know if it's relevant. But you should really try the newest version,
which is 0.9.
On Fri, Oct 7, 2011 at 10:52 AM, Ian Goodfellow wrote:
> I understand that LinearSVC is implemented using liblinear, which I thought
> should work well with large datasets. However, when I pass LinearSVC.fi
I understand that LinearSVC is implemented using liblinear, which I thought
should work well with large datasets. However, when I pass LinearSVC.fit a
design matrix of size 40,000 x 14,400 (in float32 format, so 2.3 gigabytes)
it ends up using at least 8 additional gigabytes of RAM!
I know that the
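The sizes quoted in this message can be checked with a short calculation. This is plain arithmetic, assuming 4 bytes per float32 value and 8 bytes per float64 value; it does not model any other buffers the solver may allocate.

```python
# Dimensions of the design matrix reported in the message.
n_samples, n_features = 40_000, 14_400

# Storage for the original single-precision array.
float32_bytes = n_samples * n_features * 4
# Storage for one double-precision copy of the same array.
float64_bytes = n_samples * n_features * 8

print(f"float32 input: {float32_bytes / 1e9:.2f} GB")
print(f"float64 copy:  {float64_bytes / 1e9:.2f} GB")
```

A single double-precision copy therefore accounts for roughly 4.6 GB of the additional memory reported; the remainder would have to come from other internal structures.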