Re: distributed similarity calculation for CF

bejoy ks Mon, 14 Nov 2011 10:48:55 -0800

Hey Chris
         It is well described in the book Mahout in Action. For item-based
distributed recommendations, at a high level, the job follows this sequence of
steps:
- Create the co-occurrence matrix
       Each entry counts the number of times two items appear together in
some user's preferences. The matrix is square and holds this count for every
pair of items.
- Create the user vector
       It is an n-dimensional vector with one dimension per item, holding
the user's preference for that item.
- Generate recommendations
       Multiply the co-occurrence matrix by the user vector (as a column
vector).
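
To make the three steps concrete, here is a minimal single-machine sketch in plain Python. It is not Mahout's code and not the actual Hadoop job flow; the function names and the toy data are made up for illustration, and the diagonal of the matrix (an item co-occurring with itself) is simply left out.

```python
from collections import defaultdict
from itertools import permutations


def cooccurrence_matrix(prefs):
    """Step 1: count, for every ordered item pair, the users who have both."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in prefs.values():
        for a, b in permutations(sorted(items), 2):
            counts[a][b] += 1
    return counts


def recommend(prefs, user, top_n=3):
    """Steps 2 and 3: build the user vector and multiply the matrix by it."""
    counts = cooccurrence_matrix(prefs)
    # Sparse user vector: item -> preference; items not listed count as 0.
    user_vector = prefs[user]
    scores = defaultdict(float)
    for item, row in counts.items():
        for other, count in row.items():
            if other in user_vector:
                scores[item] += count * user_vector[other]
    # Keep only items the user has not already expressed a preference for.
    candidates = [(i, s) for i, s in scores.items() if i not in user_vector]
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return candidates[:top_n]


# Hypothetical toy data: user -> {item: preference value}
prefs = {
    "u1": {"A": 5.0, "B": 3.0},
    "u2": {"A": 4.0, "B": 2.0, "C": 5.0},
    "u3": {"B": 4.0, "C": 3.0, "D": 1.0},
}

print(recommend(prefs, "u1"))  # [('C', 11.0), ('D', 3.0)]
```

For u1, item C scores 1*5.0 + 2*3.0 = 11.0 because C co-occurs once with A and twice with B. In a distributed setting the same counts and products would be produced by map and reduce phases rather than in-memory loops, but the arithmetic is the same.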


As I mentioned, for a detailed dive Mahout in Action is the best companion.

Hope it helps!

Regards
Bejoy K S

-----Original Message-----
From: Chris Schilling <[email protected]>
Date: Mon, 14 Nov 2011 18:24:57 
To: <[email protected]>
Subject: distributed similarity calculation for CF

Hi All,

I was just curious if the job flow for the distributed similarity calculation 
is documented anywhere.  What is the difference between calculating a 
similarity sequentially versus using distributed matrix operations on Hadoop?  
I am just looking for a high-level description of how to get from the user-item 
matrix to an item-item similarity score in map-reduce.

Thanks!
Chris
