[ 
https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800248#action_12800248
 ] 

Ted Dunning commented on MAHOUT-228:
------------------------------------


We need a few things:

- a few functions should be separated out for more general utility

- the random vectorizer should be generalized a bit

- we need some real-world testing.  20 newsgroups would be a good test, as 
would rcv1.  Cloning the new svm package's tests would probably be the best 
short-term answer.

I, unfortunately, won't have time for a week or two to follow up.

As such, perhaps the best step is to commit this now.  It won't break anything.
 

> Need sequential logistic regression implementation using SGD techniques
> -----------------------------------------------------------------------
>
>                 Key: MAHOUT-228
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-228
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>            Reporter: Ted Dunning
>             Fix For: 0.3
>
>         Attachments: logP.csv, MAHOUT-228-3.patch, r.csv, sgd-derivation.pdf, 
> sgd-derivation.tex, sgd.csv
>
>
> Stochastic gradient descent (SGD) is often fast enough for highly scalable 
> learning (see Vowpal Wabbit, http://hunch.net/~vw/).
> I often need to have a logistic regression in Java as well, so that is a 
> reasonable place to start.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
