Hi,

I’ve been working with LLR in Mahout for a while now, mostly using the 
SimilarityAnalysis.cooccurrencesIDSs function. I recently upgraded the Mahout 
libraries to 0.11, and subsequently also tried 0.12. In both cases the same 
program runs significantly slower (at least 3x based on initial analysis). 

Looking at the tasks more carefully and comparing 0.10 with 0.11 shows that the 
amount of shuffle in 0.11 is significantly higher, especially in the 
AtB step. This could be a reason for the drop in performance. 

However, I am working on Spark 1.2.0, so it's possible that this could be 
causing the problem. It works fine with Mahout 0.10. 

Any ideas why this might be happening?

Thank you,
Nikaash Puri
