Hi Saikat,

I would suggest the following: compute the training error in the mapper that recomputes M in step 3, after the item vectors are recomputed. Find an efficient way to aggregate the errors from the mappers, e.g. via Hadoop's counters, and let the driver check for convergence.
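
A minimal sketch of the aggregation idea, in plain Java without any Hadoop dependency. Hadoop counters hold long values, so the squared error is scaled to fixed point before it is accumulated. The class name ErrorCounters, its methods, and the scale factor are hypothetical and only illustrate the arithmetic; they are not part of Mahout.

```java
/** Stand-in for a pair of Hadoop counters (which hold long values). Hypothetical helper, not Mahout code. */
class ErrorCounters {
  // Assumption: squared errors are stored in millionths, because counters are integral.
  static final long SCALE = 1_000_000L;

  private long scaledSquaredError; // plays the role of a "squared error" counter
  private long numRatings;         // plays the role of a "number of ratings" counter

  /** Called once per observed rating, e.g. in the mapper that recomputes M. */
  void add(double observed, double predicted) {
    double err = observed - predicted;
    scaledSquaredError += Math.round(err * err * SCALE);
    numRatings++;
  }

  /** Sums in the counters of another task, the way counters are summed across mappers. */
  void merge(ErrorCounters other) {
    scaledSquaredError += other.scaledSquaredError;
    numRatings += other.numRatings;
  }

  /** Driver side: turns the aggregated counters back into an RMSE. */
  double rmse() {
    if (numRatings == 0) {
      return 0.0;
    }
    return Math.sqrt((scaledSquaredError / (double) SCALE) / numRatings);
  }
}
```

The driver would read the merged totals after the job that recomputes M has finished and compare the resulting RMSE against the previous iteration.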

It's ok to give the users an option to specify a threshold for the convergence of the error, but we should provide a reasonable default.
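
A sketch of what such a driver-side check could look like, assuming that convergence means "the training error improved by less than the threshold" and that an upper bound on the number of iterations is kept as a safety net. AlsIteration, ConvergenceDriver and the default value are hypothetical names and numbers for illustration only.

```java
/** One full ALS iteration (recompute U, then M) that returns the training error afterwards. Hypothetical. */
interface AlsIteration {
  double run(int iteration);
}

class ConvergenceDriver {
  /** Hypothetical default; a sensible value depends on the rating scale. */
  static final double DEFAULT_THRESHOLD = 1e-3;

  /**
   * Runs iterations until the error improves by less than the threshold
   * or maxIterations is reached. Returns the number of iterations executed.
   */
  static int runUntilConverged(AlsIteration step, double threshold, int maxIterations) {
    double previousError = Double.POSITIVE_INFINITY;
    for (int i = 0; i < maxIterations; i++) {
      double error = step.run(i);
      // Stop when the improvement is too small (this also stops if the error grows).
      if (previousError - error < threshold) {
        return i + 1;
      }
      previousError = error;
    }
    return maxIterations;
  }
}
```

A user who does not set a threshold would get DEFAULT_THRESHOLD; the iteration cap guards against runs that never meet the threshold.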

Does this answer your questions?

Best,
Sebastian

On 01/16/2014 04:46 PM, Saikat Kanjilal wrote:
Sebastian, can I get some feedback on my plan outlined below? In the interim, I'm 
going to get started with a design and put it on the JIRA ticket.
Thanks

From: [email protected]
To: [email protected]
Subject: RE: JIRA issues 1248/1249
Date: Thu, 9 Jan 2014 21:47:06 -0800




Some more clarifications:
http://www.hpl.hp.com/personal/Robert_Schreiber/papers/2008%20AAIM%20Netflix/netflix_aaim08(submitted).pdf
It seems like we can pretty much follow the strategy below:
Step 1 Initialize matrix M by assigning the average rating as the first row, and 
small random numbers for the remaining entries.
Step 2 Fix M, solve U by minimizing the objective function (the sum of squared 
errors);
Step 3 Fix U, solve M by minimizing the objective function similarly;
Step 4 Repeat Steps 2 and 3 until a stopping criterion is satisfied.
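
The four steps above can be sketched for the simplest case, a rank-1 factorization, where each least-squares solve reduces to a scalar division. This is an illustration under assumptions, not the Mahout implementation: the class RankOneAls is hypothetical, NaN marks a missing rating, a plain L2 term lambda is used as a simplification, and with rank 1 the matrix M has only its first row, so there are no remaining entries to fill with random numbers.

```java
/** Rank-1 ALS on a small dense array; NaN marks a missing rating. Hypothetical illustration. */
class RankOneAls {

  /** Returns {u, m} such that ratings[i][j] is approximated by u[i] * m[j]. */
  static double[][] factorize(double[][] r, double lambda, double tolerance, int maxIterations) {
    int numUsers = r.length;
    int numItems = r[0].length;
    double[] u = new double[numUsers];
    double[] m = new double[numItems];

    // Step 1: initialize M with each item's average rating.
    for (int j = 0; j < numItems; j++) {
      double sum = 0;
      int n = 0;
      for (int i = 0; i < numUsers; i++) {
        if (!Double.isNaN(r[i][j])) { sum += r[i][j]; n++; }
      }
      m[j] = n == 0 ? 0 : sum / n;
    }

    double previous = Double.POSITIVE_INFINITY;
    for (int it = 0; it < maxIterations; it++) {
      // Step 2: fix M, solve each user's factor by regularized least squares.
      for (int i = 0; i < numUsers; i++) {
        double num = 0;
        double den = lambda;
        for (int j = 0; j < numItems; j++) {
          if (!Double.isNaN(r[i][j])) { num += r[i][j] * m[j]; den += m[j] * m[j]; }
        }
        u[i] = den == 0 ? 0 : num / den;
      }
      // Step 3: fix U, solve each item's factor the same way.
      for (int j = 0; j < numItems; j++) {
        double num = 0;
        double den = lambda;
        for (int i = 0; i < numUsers; i++) {
          if (!Double.isNaN(r[i][j])) { num += r[i][j] * u[i]; den += u[i] * u[i]; }
        }
        m[j] = den == 0 ? 0 : num / den;
      }
      // Step 4: stop once the training RMSE improves by less than the tolerance.
      double current = rmse(r, u, m);
      if (previous - current < tolerance) {
        break;
      }
      previous = current;
    }
    return new double[][] {u, m};
  }

  /** Root mean squared error over the observed (non-NaN) ratings. */
  static double rmse(double[][] r, double[] u, double[] m) {
    double sum = 0;
    int n = 0;
    for (int i = 0; i < r.length; i++) {
      for (int j = 0; j < r[i].length; j++) {
        if (!Double.isNaN(r[i][j])) {
          double err = r[i][j] - u[i] * m[j];
          sum += err * err;
          n++;
        }
      }
    }
    return n == 0 ? 0 : Math.sqrt(sum / n);
  }
}
```

In a higher-rank setting each scalar division becomes a small linear solve per user or item, but the alternation and the stopping check keep the same shape.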

The stopping criterion in this case is that the objective function has been 
minimized to within the specified RMSE limit.
Again, the RMSE limit would be specified as a configuration parameter instead of 
the number of iterations.
Sebastian et al., I would love to get some feedback on my approach.

From: [email protected]
To: [email protected]
Subject: RE: JIRA issues 1248/1249
Date: Wed, 8 Jan 2014 20:17:04 -0800

I read through 1249 and had some initial questions before coming up with a 
plan. I was looking through ParallelALSFactorizationJob.java and am 
assuming this is the right place to make all the changes. To this end:
1) I was thinking of introducing the convergence training error as another 
configuration parameter, to replace the number of iterations.
2) For the chunk of code below:
for (int currentIteration = 0; currentIteration < numIterations; currentIteration++) {
  /* broadcast M, read A row-wise, recompute U row-wise */
  log.info("Recomputing U (iteration {}/{})", currentIteration, numIterations);
  runSolver(pathToUserRatings(), pathToU(currentIteration), pathToM(currentIteration - 1),
      currentIteration, "U", numItems);
  /* broadcast U, read A' row-wise, recompute M row-wise */
  log.info("Recomputing M (iteration {}/{})", currentIteration, numIterations);
  runSolver(pathToItemRatings(), pathToM(currentIteration), pathToU(currentIteration),
      currentIteration, "M", numUsers);
}

I am proposing we have a while loop similar to the following:
/* keep iterating while the training error is still above the specified limit */
while (currentTrainingError > specifiedTrainingErrorForConvergence) {
  /* broadcast M, read A row-wise, recompute U row-wise */
  log.info("Recomputing U (iteration {}/{})", currentIteration, numIterations);
  runSolver(pathToUserRatings(), pathToU(currentIteration), pathToM(currentIteration - 1),
      currentIteration, "U", numItems);
  /* broadcast U, read A' row-wise, recompute M row-wise */
  log.info("Recomputing M (iteration {}/{})", currentIteration, numIterations);
  runSolver(pathToItemRatings(), pathToM(currentIteration), pathToU(currentIteration),
      currentIteration, "M", numUsers);
  currentIteration++;
}
However, I am wondering where or how I would compute the training error each 
time. Would that happen inside runSolver, or be an artifact of performing the 
solver computation? Pardon my ignorance on this. I also wanted to get deeper 
insight into ALS; is the following the best paper to read?
http://www.hpl.hp.com/personal/Robert_Schreiber/papers/2008%20AAIM%20Netflix/netflix_aaim08(submitted).pdf
Specifically, I am trying to understand where the training error comes into play 
within the SVD computation.
I would really appreciate some more insight as I explore and dig through the code.
Regards

Date: Tue, 7 Jan 2014 09:11:17 +0100
From: [email protected]
To: [email protected]
Subject: Re: JIRA issues 1248/1249

Hi Saikat,

I suggest starting with 1249, which is the easier task. The best way to
proceed is by discussing on the mailing list. Have a look at the issue,
propose a solution here and wait for our feedback.

Best,
Sebastian

On 07.01.2014 04:27, Saikat Kanjilal wrote:
Sebastian et al.,
After months of not having bandwidth to help out with coding 
tasks, I am finally ready to help with the implementation of the above JIRA 
issues. Before I begin, I wanted to make sure these improvements are still 
needed for ALS; I am targeting to finish these by the 1.0 release. Also, if 
these are relevant, should I just present a design/plan of implementation? I'd 
love some initial guidance and thoughts around these tasks; feel free to add 
them to the tickets themselves.
Thanks in advance.