Oh, I think you have to use trunk/, not /trunk/. Maybe that helps.

2013/7/31 Kun Yang (JIRA) <[email protected]>
>
> [ https://issues.apache.org/jira/browse/MAHOUT-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725782#comment-13725782 ]
>
> Kun Yang commented on MAHOUT-1273:
> ----------------------------------
>
> I used the git command. The patch works on my machine.
> git diff HEAD^1 > ~/Downloads/PenalizedLinearRegression.patch
>
> > Single Pass Algorithm for Penalized Linear Regression with Cross Validation on MapReduce
> > ----------------------------------------------------------------------------------------
> >
> >                 Key: MAHOUT-1273
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-1273
> >             Project: Mahout
> >          Issue Type: New Feature
> >    Affects Versions: 0.9
> >            Reporter: Kun Yang
> >              Labels: documentation, features, patch, test
> >             Fix For: 0.9
> >
> >         Attachments: Algorithm and Numeric Stability.pdf, java files.pdf,
> >                      Manual and Example.pdf, PenalizedLinear.pdf,
> >                      PenalizedLinearRegression.patch
> >
> >   Original Estimate: 720h
> >  Remaining Estimate: 720h
> >
> > Penalized linear regression methods such as Lasso and Elastic-net are
> > widely used in machine learning, but there are no very efficient
> > scalable implementations on MapReduce.
> > The published distributed algorithms for solving this problem are
> > either iterative (which is not good for MapReduce; see Stephen Boyd's
> > paper) or approximate (what if we need exact solutions? See
> > Parallelized stochastic gradient descent). Another disadvantage of
> > these algorithms is that they cannot do cross validation in the
> > training phase, so they require a user-specified penalty parameter in
> > advance.
> > My ideas can train the model with cross validation in a single pass.
> > They are based on some simple observations.
> > The core algorithm is a modified version of coordinate descent (see J.
> > Friedman's paper). The authors of that paper implemented a very
> > efficient R package, "glmnet", which is the de facto standard for
> > penalized regression.
> > I have implemented the primitive version of this algorithm in Alpine Data Labs.
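
For readers unfamiliar with the coordinate descent mentioned in the issue, here is a minimal sketch in Python. It is not the code from the attached patch. It assumes the glmnet-style elastic-net objective (1/2n)*||y - X*b||^2 + lam*(alpha*||b||_1 + (1-alpha)/2*||b||^2), and it assumes that "single pass" could mean accumulating X^T X and X^T y once and then solving from those statistics alone; the issue text does not confirm either point. The names gram_statistics, coordinate_descent, lam and alpha are hypothetical, and there is no intercept or standardization.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator S(z, gamma) used in lasso-type updates."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def gram_statistics(rows, p):
    """One pass over (x, y) rows: accumulate X^T X, X^T y and the row count.

    Assumption: these sums are what a MapReduce job would emit; mappers
    could compute partial sums and a reducer could add them up.
    """
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    n = 0
    for x, y in rows:
        n += 1
        for j in range(p):
            xty[j] += x[j] * y
            for k in range(p):
                xtx[j][k] += x[j] * x[k]
    return xtx, xty, n


def coordinate_descent(xtx, xty, n, lam, alpha=1.0, sweeps=1000, tol=1e-10):
    """Cyclic coordinate descent on the elastic-net objective.

    Only the accumulated statistics are used, so the raw rows are never
    read again. alpha=1.0 gives the lasso, alpha=0.0 gives ridge.
    """
    p = len(xty)
    beta = [0.0] * p
    for _ in range(sweeps):
        max_change = 0.0
        for j in range(p):
            # Correlation of feature j with the residual that ignores beta[j].
            rho = xty[j] - sum(xtx[j][k] * beta[k] for k in range(p) if k != j)
            denom = xtx[j][j] / n + lam * (1.0 - alpha)
            new = soft_threshold(rho / n, lam * alpha) / denom if denom > 0 else 0.0
            max_change = max(max_change, abs(new - beta[j]))
            beta[j] = new
        if max_change < tol:
            break
    return beta


# Tiny made-up data set in which y = 2 * x[0] exactly.
rows = [((1.0, 0.0), 2.0), ((0.0, 1.0), 0.0), ((1.0, 1.0), 2.0), ((2.0, 1.0), 4.0)]
xtx, xty, n = gram_statistics(rows, 2)
beta_unpenalized = coordinate_descent(xtx, xty, n, lam=0.0)
beta_lasso = coordinate_descent(xtx, xty, n, lam=1.0)
beta_heavy = coordinate_descent(xtx, xty, n, lam=10.0)
```

Because the solver reads only the sums, a whole grid of lam values can be tried without touching the data again. Per-fold sums could also be accumulated in the same pass and added together to form each training set for cross validation; that, too, is an assumption about how the proposal might work, not a description of the patch.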
