From what I understand, MAHOUT-371 tries to address the DistributedSVDRecommenderJob. Is it fully ready for use?
@Sebastian: The above recommender uses the DistributedLanczosSolver to achieve the SVD. So, should the distributed Matrix Factorization (MAHOUT-542) you were talking about be integrated with it instead? I am slightly confused...

On Fri, Nov 19, 2010 at 4:32 PM, Ted Dunning <[email protected]> wrote:

> On Fri, Nov 19, 2010 at 2:27 PM, Sebastian Schelter <[email protected]> wrote:
>
> > > Can I use the new LanczosSolver to achieve this?
> >
> > The paper "Large-scale Parallel Collaborative Filtering for the Netflix
> > Prize" says that you can't use Lanczos to factorize a rating matrix as
> > it is only partially specified. However someone with more mathematical
> > expertise than me should validate that statement, hope I didn't get that
> > wrong :)
>
> You correctly quoted the statement. But I don't think that the statement
> is entirely correct. The difference in practice isn't all that big a deal.
>
> > Ted is working on LatentFactorLogLinear models in MAHOUT-525 which can
> > be used for recommendations too and should be superior to the approach
> > of MAHOUT-542. They're not distributed but in the paper in which they
> > are described the authors state that they could train the 1M Movielens
> > Dataset in 7 minutes so they should be fast enough for your testcase.
>
> This is where I would push for recommendations. I have a preliminary
> implementation available on github, but I don't think it is ready to
> commit. It does do roughly what it is supposed to do (on one test) but I
> don't have enough runtime with it to have any level of confidence yet.
