I read through 1249 and had some initial questions before coming up with a
plan. I was looking through ParallelALSFactorizationJob.java and am assuming
this is the right place to make all the changes. To this end:

1) I was thinking of introducing the convergence training error as a
configuration parameter that replaces the number of iterations.

2) For the chunk of code below:

    for (int currentIteration = 0; currentIteration < numIterations; currentIteration++) {
      /* broadcast M, read A row-wise, recompute U row-wise */
      log.info("Recomputing U (iteration {}/{})", currentIteration, numIterations);
      runSolver(pathToUserRatings(), pathToU(currentIteration), pathToM(currentIteration - 1),
          currentIteration, "U", numItems);
      /* broadcast U, read A' row-wise, recompute M row-wise */
      log.info("Recomputing M (iteration {}/{})", currentIteration, numIterations);
      runSolver(pathToItemRatings(), pathToM(currentIteration), pathToU(currentIteration),
          currentIteration, "M", numUsers);
    }

I am proposing we have a while loop similar to the following, which keeps
iterating as long as the training error is still above the configured threshold:

    int currentIteration = 0;
    while (currentTrainingError > specifiedTrainingErrorForConvergence) {
      /* broadcast M, read A row-wise, recompute U row-wise */
      log.info("Recomputing U (iteration {})", currentIteration);
      runSolver(pathToUserRatings(), pathToU(currentIteration), pathToM(currentIteration - 1),
          currentIteration, "U", numItems);
      /* broadcast U, read A' row-wise, recompute M row-wise */
      log.info("Recomputing M (iteration {})", currentIteration);
      runSolver(pathToItemRatings(), pathToM(currentIteration), pathToU(currentIteration),
          currentIteration, "M", numUsers);
      /* currentTrainingError would have to be recomputed here */
      currentIteration++;
    }

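To make the stopping rule concrete, here is a minimal standalone sketch of it. It is not Mahout code: the class name, the method name, and the idea of passing one full ALS iteration as a callback that returns the training error are all assumptions made for illustration. It also keeps a cap on the number of iterations as a safeguard, since the error may never drop below the threshold.

```java
import java.util.function.DoubleSupplier;

public class ConvergenceLoopSketch {

  /**
   * Repeats iterationStep until the error it returns is at or below targetError,
   * or until maxIterations is reached. Returns the number of iterations run.
   * (Hypothetical helper, not part of ParallelALSFactorizationJob.)
   */
  static int runUntilConverged(DoubleSupplier iterationStep, double targetError, int maxIterations) {
    double currentError = Double.POSITIVE_INFINITY;
    int iteration = 0;
    while (iteration < maxIterations && currentError > targetError) {
      // One call stands in for "recompute U, recompute M, measure training error".
      currentError = iterationStep.getAsDouble();
      iteration++;
    }
    return iteration;
  }

  public static void main(String[] args) {
    // Stand-in for the real solver: the error simply halves on every call.
    double[] error = {1.0};
    int iterations = runUntilConverged(() -> error[0] /= 2, 0.1, 50);
    System.out.println("stopped after " + iterations + " iterations");
  }
}
```

In the real job the callback would be replaced by the two runSolver calls plus whatever step ends up computing the error.
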
However, I am wondering where or how I would compute the training error each
time. Would that happen inside runSolver, or would it be an artifact of
performing the solver computation? Pardon my ignorance on this. I also wanted
to get deeper insight into ALS. Is the following the best paper to read?
http://www.hpl.hp.com/personal/Robert_Schreiber/papers/2008%20AAIM%20Netflix/netflix_aaim08(submitted).pdf
Specifically, I am trying to understand where the training error comes into play
within the SVD computation.
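
For reference, one common definition of the training error for a factorization like this is the RMSE between the known ratings and the predictions given by the dot products of the corresponding rows of U and M. Whether this is the right measure for 1249 is an open question. A minimal standalone sketch, assuming plain dense arrays rather than Mahout's actual data structures (the class, method, and the {user, item, rating} triple layout are hypothetical):

```java
public class TrainingErrorSketch {

  /**
   * RMSE over the observed ratings only.
   * u[user] and m[item] are feature vectors of equal length;
   * each row of ratings is {userIndex, itemIndex, rating}.
   */
  static double rmse(double[][] u, double[][] m, double[][] ratings) {
    double sumSquaredError = 0;
    for (double[] r : ratings) {
      int user = (int) r[0];
      int item = (int) r[1];
      // Predicted rating is the dot product of the user and item feature vectors.
      double predicted = 0;
      for (int k = 0; k < u[user].length; k++) {
        predicted += u[user][k] * m[item][k];
      }
      double error = r[2] - predicted;
      sumSquaredError += error * error;
    }
    return Math.sqrt(sumSquaredError / ratings.length);
  }

  public static void main(String[] args) {
    double[][] u = {{1, 0}, {0, 1}};
    double[][] m = {{2, 0}, {0, 3}};
    double[][] ratings = {{0, 0, 4}, {1, 1, 3}};
    System.out.println("training RMSE = " + rmse(u, m, ratings));
  }
}
```

Computing this needs one extra pass over the ratings with both U and M available, which is why it is not obvious that it falls out of runSolver for free.
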
I would really appreciate some more insight as I explore and dig through the code.
Regards

> Date: Tue, 7 Jan 2014 09:11:17 +0100
> From: [email protected]
> To: [email protected]
> Subject: Re: JIRA issues 1248/1249
> 
> Hi Saikat,
> 
> I suggest starting with 1249, which is the easier task. The best way to
> proceed is by discussing on the mailing list. Have a look at the issue,
> propose a solution here and wait for our feedback.
> 
> Best,
> Sebastian
> 
> On 07.01.2014 04:27, Saikat Kanjilal wrote:
> > Sebastian et al,
> > After months of not having bandwidth to help out with coding tasks, I am
> > finally ready to help with the implementation of the above JIRA issues.
> > Before I begin, I wanted to make sure these improvements are still needed
> > for ALS; I am targeting to finish these by the 1.0 release. Also, if these
> > are relevant, should I just present a design/plan of implementation? I'd
> > love some initial guidance and thoughts around these tasks; feel free to
> > add them to the tickets themselves. Thanks in advance.
> > 
> 
                                          
