Andy writes:
> I would venture that which one is better would depend on the nature of
> your data.
> Do you know the number of types beforehand? And do all types have 1000
> categories?
The number of types is defined; however, the number of categories keeps
increasing... but as I see it is un
On 11/06/2013 05:00 PM, Jim wrote:
>> Then you don't need a OneVsRestClassifier as OvR is the default
>> strategy for SGD. You do need to put a maximum on the number of
>> classes before you start learning, though.
> I see. Thank you for the advice. This was an initial novice iteration of the
> so
[Mistakenly posted as separate thread before, please ignore previous post]
I see. Thank you for the advice. This was an initial novice iteration of the
solution and needs improvement, of course. In terms of which, in order to
keep the behaviour of the classifier consistent, instead of a single
cla
> Then you don't need a OneVsRestClassifier as OvR is the default
> strategy for SGD. You do need to put a maximum on the number of
> classes before you start learning, though.
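A minimal sketch of the advice quoted above, assuming synthetic numeric data: the full set of class labels is declared once, on the first partial_fit call, even though the first batch only contains some of them. The names max_classes and all_classes, the batch sizes, and the feature count are illustrative assumptions, not taken from the thread.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)

# Assumption for illustration: an upper bound of 5 classes, labelled 0..4.
max_classes = 5
all_classes = np.arange(max_classes)

clf = SGDClassifier(random_state=0)

# First batch only contains classes 0 and 1, but every label that may
# ever appear is declared through the `classes` argument.
X1 = rng.randn(40, 8)
y1 = np.repeat([0, 1], 20)
clf.partial_fit(X1, y1, classes=all_classes)

# A later batch introduces class 2; `classes` can be omitted after the first call.
X2 = rng.randn(30, 8)
y2 = np.repeat([1, 2], 15)
clf.partial_fit(X2, y2)

# One-vs-rest: one row of weights per declared class.
n_binary_models = clf.coef_.shape[0]
print(n_binary_models, clf.classes_)
```

The weight matrix is allocated for all declared classes up front, which is why the set of labels has to be bounded before learning starts.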
I see. Thank you for the advice. This was an initial novice iteration of the
solution and needs improvement of course. In t
2013/11/6 Jim <[email protected]>:
> No, I am primarily working on multiclass classification with a constantly
> increasing number of classes
Then you don't need a OneVsRestClassifier as OvR is the default
strategy for SGD. You do need to put a maximum on the number of
classes before you start learni
>
> > 2) Will there be an interface for online learning in OneVsRestClassifier?
>
> If you care to implement it, then we're happy to accept a patch. Are
> you doing multi-label classification?
No, I am primarily working on multiclass classification with a constantly
increasing number of classes
--
2013/11/5 Jim <[email protected]>:
> 1) This might sound like a basic question, but when performing a partial_fit
> in SGD classifiers, does the new data to train on have to be in the
> categories that are already in the set? Or conversely, if I come across / want
> to add a few documents (in a docum
1) This might sound like a basic question, but when performing a partial_fit
in SGD classifiers, does the new data to train on have to be in the
categories that are already in the set? Or conversely, if I come across / want
to add a few documents (in a document classifier example) in a new categor
Olivier Grisel writes:
> I don't know if there is any maximum file size on the gists. Just try
> and if it fails use something else such as dropbox public folder or
> Amazon S3 if you have an account.
>
Adding all the files to https://github.com/abhirk/LinearSVC.
Thanks.
2012/7/31 Abhi :
> Abhi writes:
>
>>
>> Olivier Grisel writes:
>>
>
>> > Could you please try to come up with one or two minimalistic
>> > reproduction scripts for the ch2.fit_transform and LinearSVC.fit
>> > segfaults? Is it just that it is exhausting memory on your system? Are
>> > you running
Abhi writes:
>
> Olivier Grisel writes:
>
> > Could you please try to come up with one or two minimalistic
> > reproduction scripts for the ch2.fit_transform and LinearSVC.fit
> > segfaults? Is it just that it is exhausting memory on your system? Are
> > you running a 32bit or a 64bit OS? How
Olivier Grisel writes:
>
> 2012/7/25 Abhi :
> >
> > Hello,
> > Sorry for getting back late. I had originally experimented with different
> > classifiers including SGDClassifier; it seemed faster but much less accurate,
> > about 93% for 3 emails [and decreasing as the number of emails
2012/7/25 Abhi :
>
> Hello,
> Sorry for getting back late. I had originally experimented with different
> classifiers including SGDClassifier; it seemed faster but much less accurate,
> about 93% for 3 emails [and decreasing as the number of emails increases],
> but have not tried with the i
ers,
> Andy
>
> - Original Message -
> From: "Fred Mailhot" gmail.com>
> To: scikit-learn-general lists.sourceforge.net
> Sent: Saturday, 14 July 2012 22:14:51
> Subject: Re: [Scikit-learn-general] Online learning
>
> On 14 July 2012 04:22,
Mailhot"
To: [email protected]
Sent: Saturday, 14 July 2012 22:14:51
Subject: Re: [Scikit-learn-general] Online learning
On 14 July 2012 04:22, Olivier Grisel < [email protected] > wrote:
2012/7/13 Abhi < [email protected] >:
> Hell
On 14 July 2012 04:22, Olivier Grisel wrote:
> 2012/7/13 Abhi :
> > Hello,
> > My problem is to classify a set of 200k+ emails into approx. 2800 categories.
> > Currently the method I am using is calculating tfidf and using LinearSVC()
> > [with a good accuracy of 98%] for classification
On Sat, Jul 14, 2012 at 8:22 PM, Olivier Grisel wrote:
>
> LinearSVC is based on liblinear, which only implements batch
> optimization. Instead you can use SGDClassifier, which features a
> partial_fit method that you can call several consecutive times on
> chunks of data for incremental learning.
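The incremental pattern described in the quote above can be sketched as follows. This is an illustration under stated assumptions, not code from the thread: the data is synthetic, and make_chunk, centers, the chunk size, and the number of chunks are all hypothetical stand-ins for a real stream of vectorized emails.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(0)

# Assumption for illustration: 3 classes with well-separated synthetic centres.
n_classes, n_features = 3, 20
all_classes = np.arange(n_classes)
centers = rng.randn(n_classes, n_features) * 5


def make_chunk(n_samples):
    """Hypothetical helper: return one chunk of synthetic labelled data."""
    y = rng.randint(0, n_classes, size=n_samples)
    X = centers[y] + rng.randn(n_samples, n_features)
    return X, y


# loss="hinge" gives a linear SVM objective, optimized by SGD.
clf = SGDClassifier(loss="hinge", random_state=0)

# Feed the data chunk by chunk instead of holding it all in memory.
for _ in range(10):
    X_chunk, y_chunk = make_chunk(200)
    clf.partial_fit(X_chunk, y_chunk, classes=all_classes)

X_test, y_test = make_chunk(500)
accuracy = clf.score(X_test, y_test)
print(accuracy)
```

Each call to partial_fit updates the existing weights with one pass over the chunk, so memory use is bounded by the chunk size rather than by the full dataset.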
You
2012/7/13 Abhi :
> Hello,
> My problem is to classify a set of 200k+ emails into approx. 2800
> categories.
> Currently the method I am using is calculating tfidf and using LinearSVC()
> [with a good accuracy of 98%] for classification. The training time is ~30-60
> min [~16g of mem, and dou
Hello,
My problem is to classify a set of 200k+ emails into approx. 2800 categories.
Currently the method I am using is calculating tfidf and using LinearSVC()
[with a good accuracy of 98%] for classification. The training time is ~30-60
min [~16g of mem, and doubles every 75000 mails]. I wa
hi andy,
we've already had a lot of discussion on the online learning problem
and the partial_fit.
Maybe someone feels like summarizing but otherwise you should search the
mailing list archive for "online" and "partial_fit".
Alex
On Sun, Mar 11, 2012 at 2:02 PM, Andreas Mueller
wrote:
> Hi ever
Hi everybody.
There has been some talk about online learning API on the list
but I am not really sure what is meant by that. Could someone please
clarify what the applications are that you have in mind and what
features you would like to have?
In my mind, online learning is pretty close to the "pa