On Tue, Mar 20, 2012 at 5:10 PM, Andreas <[email protected]> wrote:
> On 03/20/2012 10:07 PM, James Bergstra wrote:
>> So recently I wrote this code:
>> https://github.com/jaberg/asgd/blob/early_stopping/asgd/linsvm.py
>>
>> My intent with this class was to provide a sklearn-like interface to
>> train linear SVMs, but which would have automatic selection logic to
>> handle various problem dimensions, which call for different
>> algorithms:
>> * if you have more features than examples, you should use a
>> gram-matrix algorithm,
>> * if you don't then you should use an sgd-type algorithm
>> * if you have more than two classes, you should use a larank-type
>> algorithm (I think?), but ...
>> * if you have to use a gram-matrix algorithm for efficiency then I
>> wonder if maybe you can't do larank so you should use a one-vs-all
>> approach (or one vs. one?).
>>
>> Anyway this code uses SVC in some cases, and uses @npinto's asgd code
>> in other cases, and uses some of my code in others... but I have a
>> feeling that I'm reinventing a wheel here, is there something in
>> sklearn that already does this type of thing?
>>
> Hi James.
> I am afraid not. There is no automatic choice between
> different algorithms, only between those included in LibLinear.
> I think that switches between primal and dual depending
> on the problem. I'm not sure how multi-class is handled there.
> But there are no "smart" ways to do SGD (yet) afaik.
>
> Could you please explain your different approaches?
> I would be very interested in what your choices would be
> for the different cases and why.

I'm not claiming any of these approaches as my own; I'm just trying to
pick between primal (e.g. http://leon.bottou.org/projects/sgd or
@npinto's asgd code) and dual approaches (e.g. libsvm).

They are both solving the same problem, so they should be
mathematically interchangeable solution algorithms, right? It's just
that for some problems one will be faster, and for other problems the
other will be. This choice doesn't seem to involve any judgement on the
user's part; it's just a matter of estimating how long each one will
take from variables such as the problem dimensions, and perhaps the
expected training set error rate.
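
To make "the same problem" concrete, here is a sketch of the two
formulations for the standard soft-margin linear SVM. I am assuming no
bias term and n examples (x_i, y_i) with y_i in {-1, +1}; the symbols w,
C and alpha are just my notation here, not names taken from any of the
code linked above.

```latex
% Primal: one variable per feature (w has d entries)
\min_{w \in \mathbb{R}^d} \;
  \frac{1}{2}\lVert w \rVert^2
  + C \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i\, w^\top x_i\bigr)

% Dual: one variable per example, coupled through the n-by-n Gram matrix
\max_{\alpha \in \mathbb{R}^n} \;
  \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
      \alpha_i \alpha_j\, y_i y_j\, x_i^\top x_j
\quad \text{subject to} \quad 0 \le \alpha_i \le C

% The two solutions are linked by
w = \sum_{i=1}^{n} \alpha_i\, y_i\, x_i
```

The primal has d unknowns and the dual has n, which is why the ratio of
features to examples looks like the natural first input to a timing
estimate.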

I'm also interested in extending this sort of heuristic to hide the
distinction between 2-class and multi-class, but it's admittedly
somewhat orthogonal.
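
As a rough illustration, the selection rules from my first message
could be written as a small function like the one below. This is only a
sketch of the heuristic, not the code in linsvm.py: the function name,
the SolverChoice tuple and the string labels are all made up for this
example, and the thresholds are the naive "more features than examples"
rule rather than a real running-time estimate.

```python
from collections import namedtuple

# Hypothetical result type: which solver family to use and how to
# handle the number of classes. The labels are illustrative only.
SolverChoice = namedtuple("SolverChoice", ["solver", "multiclass"])


def choose_linear_svm_solver(n_samples, n_features, n_classes):
    """Pick a solver family from the problem dimensions alone.

    Rules (taken from the bullet list earlier in the thread):
      * more features than examples -> gram-matrix (dual) algorithm
      * otherwise                   -> sgd-type (primal) algorithm
      * more than two classes       -> larank-type algorithm, unless the
        gram-matrix solver was chosen, in which case fall back to
        one-vs-rest (one-vs-one would be the other candidate)
    """
    if n_samples < 1 or n_features < 1:
        raise ValueError("need at least one example and one feature")
    if n_classes < 2:
        raise ValueError("need at least two classes")

    solver = "dual_gram" if n_features > n_samples else "primal_sgd"

    if n_classes == 2:
        multiclass = "binary"
    elif solver == "dual_gram":
        multiclass = "one_vs_rest"
    else:
        multiclass = "larank"

    return SolverChoice(solver, multiclass)
```

A caller would then dispatch on the returned labels, e.g.
choose_linear_svm_solver(100, 10000, 2) picks the gram-matrix solver
for a binary problem with far more features than examples.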

- James

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general