Hey Chris
It is well described in the book Mahout in Action. For item-based
distributed recommendations, at a high level, it follows this sequence of
steps:
-Creates co-occurrence matrix
It counts the number of times two items appear together in some user's
preferences. The matrix is square and holds that count for every
pair of items.
-Creates User Vector
It is an n-dimensional vector with one dimension per item, holding the
user's preference for that item.
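-Generates recommendations
Multiply the co-occurrence matrix by the user vector (as a column vector).
The highest values in the result, for items the user has not already
expressed a preference for, are the recommendations.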
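To make the three steps above concrete, here is a rough single-process sketch in plain Python. It is not Mahout's code and not how the Hadoop jobs are actually laid out; it only shows the arithmetic. The toy data, the function names (cooccurrence, recommend) and the use of boolean preferences (1.0 if the user has the item) are my own assumptions for illustration, and the diagonal of the matrix is simply left out.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical toy data: user -> set of items the user has a preference for.
prefs = {
    "u1": {"A", "B", "C"},
    "u2": {"A", "B"},
    "u3": {"B", "C"},
}


def cooccurrence(prefs):
    """Step 1: count how often each ordered pair of items shares a user."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in prefs.values():
        # Every ordered pair (i, j) with i != j, so the matrix is symmetric.
        for i, j in permutations(sorted(items), 2):
            counts[i][j] += 1
    return counts


def recommend(prefs, user, top_n=3):
    """Steps 2 and 3: build the user vector and multiply matrix by vector."""
    counts = cooccurrence(prefs)
    user_items = prefs[user]
    # Step 2: the user vector is implicit here. Its entry for item j is 1.0
    # if the user has item j and 0.0 otherwise (assumed boolean preferences).
    scores = defaultdict(float)
    # Step 3: row i of the matrix dotted with the user vector.
    for i, row in counts.items():
        for j, count in row.items():
            if j in user_items:
                scores[i] += count * 1.0
    # Keep only items the user does not already have, best score first.
    ranked = sorted(
        ((item, score) for item, score in scores.items() if item not in user_items),
        key=lambda pair: (-pair[1], pair[0]),
    )
    return ranked[:top_n]


if __name__ == "__main__":
    print(recommend(prefs, "u2"))
```

For u2, who has A and B, the only candidate is C, and its score is the A-C count plus the B-C count. On Hadoop the same multiplication is spread over several MapReduce jobs, but the result per user is this same product.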
As I mentioned, for a detailed dive, Mahout in Action is the best companion.
Hope it helps!
Regards
Bejoy K S
-----Original Message-----
From: Chris Schilling <[email protected]>
Date: Mon, 14 Nov 2011 18:24:57
To: <[email protected]>
Subject: distributed similarity calculation for CF
Hi All,
I was just curious if the job flow for the distributed similarity calculation
is documented anywhere. What is the difference between calculating a
similarity sequentially versus using distributed matrix operations on Hadoop?
I am just looking for a high-level description of how to get from the user-item
matrix to an item-item similarity score in map-reduce.
Thanks!
Chris