Thanks, Ted. I read the paper and the code and got a rough idea of how the iteration goes. Thanks so much.
Given the data scale we have now, we are considering whether we could train on more data with logistic regression. For example, suppose we want to train a CTR-prediction model on the last 90 days of data: that is about 900M records after down-sampling, with on the order of 1000 feature dimensions. That would still be very slow on a single machine with the current SGD algorithm. I am wondering whether there is a parallel MapReduce algorithm I could use for logistic regression. According to the "Map-Reduce for Machine Learning on Multicore" paper, the original Newton-Raphson costs on the order of N*N*M/P per iteration (N features, M records, P cores), which is much slower than SGD on a single machine in a high-dimensional space. Could an algorithm like IRLS be parallelized, or is there some approximate algorithm that could be? (To make the question concrete, I put a toy sketch of the map-reduce decomposition I have in mind below the quoted thread.)

Thanks,
Stanley Xu

On Mon, Apr 25, 2011 at 11:58 PM, Ted Dunning <[email protected]> wrote:

> Paul K described in-memory algorithms in his dissertation. Mahout uses
> on-line algorithms which are not limited by memory size.
>
> The method used in Mahout is closer to what Bob Carpenter describes here:
> http://lingpipe.files.wordpress.com/2008/04/lazysgdregression.pdf
>
> The most important additions in Mahout are:
>
> a) confidence-weighted learning rates per term
>
> b) evolutionary tuning of hyper-parameters
>
> c) mixed ranking and regression
>
> d) grouped AUC
>
> On Mon, Apr 25, 2011 at 6:12 AM, Stanley Xu <[email protected]> wrote:
>
> > Dear All,
> >
> > I am trying to go through the Mahout SGD algorithm and to read "Logistic
> > Regression for Data Mining and High-Dimensional Classification" a little
> > bit. I am wondering which algorithm exactly is used in the SGD code?
> > There are quite a few algorithms mentioned in the paper, and it is a
> > little hard for me to match the algorithm to the code.
> >
> > Thanks in advance.
> >
> > Best wishes,
> > Stanley Xu
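P.S. Here is the sketch I mentioned above. It is not Mahout code and not a real Hadoop job, just a minimal self-contained Java toy (class, method names, and data are made up) showing how one batch-gradient iteration of logistic regression decomposes into a map-reduce style sum: each "mapper" computes the gradient contribution of its shard, a "reducer" adds the partial gradients, and the driver applies one update per pass.

import java.util.Arrays;
import java.util.stream.IntStream;

// Toy sketch only -- not Mahout code, not a real Hadoop job. One
// batch-gradient iteration of logistic regression as a map-reduce sum.
public class ParallelLrSketch {

    // sigmoid(w . x)
    static double predict(double[] w, double[] x) {
        double dot = 0.0;
        for (int j = 0; j < w.length; j++) dot += w[j] * x[j];
        return 1.0 / (1.0 + Math.exp(-dot));
    }

    // "Map" phase for one shard: sum of (p_i - y_i) * x_i over its records.
    static double[] partialGradient(double[] w, double[][] xs, double[] ys,
                                    int from, int to) {
        double[] g = new double[w.length];
        for (int i = from; i < to; i++) {
            double err = predict(w, xs[i]) - ys[i];
            for (int j = 0; j < w.length; j++) g[j] += err * xs[i][j];
        }
        return g;
    }

    public static void main(String[] args) {
        // Tiny made-up data; in the real setting each shard lives on its own node.
        double[][] xs = {{1, 0.5}, {1, -1.2}, {1, 2.0}, {1, -0.3}};
        double[] ys = {1, 0, 1, 0};
        double[] w = new double[2];
        double rate = 0.1;
        int shards = 2;
        int n = xs.length;

        for (int iter = 0; iter < 100; iter++) {
            // "Map": one partial gradient per shard (a parallel stream stands
            // in for the mappers of a MapReduce pass).
            double[][] partials = IntStream.range(0, shards).parallel()
                .mapToObj(s -> partialGradient(w, xs, ys,
                        s * n / shards, (s + 1) * n / shards))
                .toArray(double[][]::new);

            // "Reduce": add the partial gradients into one full-batch gradient.
            double[] g = new double[w.length];
            for (double[] p : partials)
                for (int j = 0; j < w.length; j++) g[j] += p[j];

            // Driver: one weight update per full pass over the data.
            for (int j = 0; j < w.length; j++) w[j] -= rate * g[j] / n;
        }
        System.out.println("weights = " + Arrays.toString(w));
    }
}

As far as I understand, IRLS would decompose the same way: each iteration only needs X'WX and X'Wz, and both are sums of per-record terms (at N*N cost per record for the Hessian), so mappers could accumulate those statistics and a reducer could sum them, leaving only the N-by-N solve on the driver. Is that a reasonable direction?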
