[ https://issues.apache.org/jira/browse/MAHOUT-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800248#action_12800248 ]
Ted Dunning commented on MAHOUT-228:
------------------------------------

We need a few things:

- a few functions should be separated out for more general utility
- the random vectorizer should be generalized a bit
- we need some real-world testing. 20 newsgroups would be a good test, as would RCV1. Cloning the new svm package's tests would probably be the best short-term answer.

I, unfortunately, won't have time to follow up for a week or two. As such, perhaps the best step is to commit this now. It won't break anything.

> Need sequential logistic regression implementation using SGD techniques
> -----------------------------------------------------------------------
>
> Key: MAHOUT-228
> URL: https://issues.apache.org/jira/browse/MAHOUT-228
> Project: Mahout
> Issue Type: New Feature
> Components: Classification
> Reporter: Ted Dunning
> Fix For: 0.3
>
> Attachments: logP.csv, MAHOUT-228-3.patch, r.csv, sgd-derivation.pdf, sgd-derivation.tex, sgd.csv
>
>
> Stochastic gradient descent (SGD) is often fast enough for highly scalable learning (see Vowpal Wabbit, http://hunch.net/~vw/).
> I often need to have a logistic regression in Java as well, so that is a reasonable place to start.