Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hama Wiki" for change 
notification.

The following page has been changed by udanax:
http://wiki.apache.org/hama/TraditionalCollaborativeFiltering

------------------------------------------------------------------------------
+ [[TableOfContents(4)]]
+ ----
+ 
+ == Abstract ==
+ 
+ Collaborative filtering is an important personalization method in recommender 
systems for internet commerce. Basing traditional collaborative filtering (TCF) 
on absolute item ratings is problematic, since users find it difficult to give 
an accurate absolute rating for an item, and different users' ratings follow 
different distributions. This tutorial shows how to use Hama to calculate TCF.
+ 
+ == Implementation ==
+ 
+ === Build a user by item matrix, Set entries from the raw data ===
+ 
+ ...
+ 
+ === Get the pairs of all row key combinations w/o repetition ===
+ Similarity is symmetric, so we don't have to recalculate the same pair in reversed order.[[BR]]
+ ~-ex) similar(UserA, UserB) == similar(UserB, UserA).-~
+ 
+ In this case, it is going to return {{1, 2}, {1, 3}, {2, 3}} by discarding 
{{2, 1}, {3, 1}, {3, 2}} from the full set of ordered pairs. Since there are 
mC2 = m(m-1)/2 combinations (m: number of keys), the work can be balanced so 
that each reducer receives about mC2 / N values (N: number of reducers). 
Something like:
+ 
+ {{{
+ partition(index i, index j, int N) { // pair (i, j), 1 <= i < j <= m; N is num reducers
+  // find the data per reducer, rounded up
+  int dataPerRed = ceil(mC2 / N); // assuming m is known, mC2 = m * (m - 1) / 2
+  int prev_sum = 0;
+  // calculate the total combinations contributed by previous indexes
+  for (k = 1; k < i; k++) {
+   prev_sum += m - k; // index k pairs with k+1 .. m
+  }
+  prev_sum += j - i; // position of (i, j) among the pairs of index i
+  return (prev_sum - 1) / dataPerRed; // integer division, reducer in 0 .. N-1
+ }
+ }}}
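
The partitioning idea above can be sketched in plain Java to check that the pairs are spread evenly. This is only a sketch under assumptions: the class and method names (`PairPartitioner`, `rank`, `partition`, `pairs`) are hypothetical and not part of Hama or Hadoop, row keys are assumed to be the integers 1..m, and each reducer is given a contiguous block of pairs.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: enumerate unordered row-key pairs and spread them over n reducers. */
final class PairPartitioner {

    /** Number of unordered pairs among m keys: mC2 = m(m-1)/2. */
    static long pairCount(int m) {
        return (long) m * (m - 1) / 2;
    }

    /** 1-based position of pair (i, j), 1 <= i < j <= m, in lexicographic order. */
    static long rank(int i, int j, int m) {
        // Indexes 1..i-1 contribute (m-1) + (m-2) + ... + (m-i+1) pairs.
        long before = (long) (i - 1) * m - (long) i * (i - 1) / 2;
        return before + (j - i);
    }

    /** Reducer index in 0..n-1 for pair (i, j); blocks are contiguous. */
    static int partition(int i, int j, int m, int n) {
        // ceil(mC2 / n), kept at 1 or more to avoid dividing by zero.
        long perReducer = Math.max(1, (pairCount(m) + n - 1) / n);
        return (int) ((rank(i, j, m) - 1) / perReducer);
    }

    /** All pairs (i, j) with i < j; m = 3 gives {1,2}, {1,3}, {2,3}. */
    static List<int[]> pairs(int m) {
        List<int[]> out = new ArrayList<>();
        for (int i = 1; i <= m; i++) {
            for (int j = i + 1; j <= m; j++) {
                out.add(new int[] {i, j});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int m = 5, n = 3; // 10 pairs over 3 reducers
        for (int[] p : pairs(m)) {
            System.out.println("(" + p[0] + ", " + p[1] + ") -> reducer "
                    + partition(p[0], p[1], m, n));
        }
    }
}
```

The closed form in `rank` replaces the loop over previous indexes, so the partition is computed in constant time per pair. With m = 5 and n = 3, the first two reducers receive four pairs each and the last one receives two.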
+ 
+ === |a|·|b|cos(q) calculation ===
+ 
+ ...
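
As an illustration of the quantity named in this heading: the dot product of two rating vectors equals |a|·|b|cos(q), so dividing it by the two norms gives the cosine of the angle between the users. The sketch below assumes that cosine similarity is what this step is meant to compute and that each user's row is available as a dense `double[]`; the class `CosineSimilarity` is hypothetical and does not use the Hama matrix API.

```java
/** Sketch: cosine similarity between two users' rating vectors. */
final class CosineSimilarity {

    /** Dot product a.b, which equals |a| * |b| * cos(q). */
    static double dot(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) {
            sum += a[k] * b[k];
        }
        return sum;
    }

    /** Euclidean norm |a|. */
    static double norm(double[] a) {
        return Math.sqrt(dot(a, a));
    }

    /** cos(q) = a.b / (|a| * |b|); returns 0 when either vector is all zeros. */
    static double cosine(double[] a, double[] b) {
        double denom = norm(a) * norm(b);
        return denom == 0.0 ? 0.0 : dot(a, b) / denom;
    }

    public static void main(String[] args) {
        double[] userA = {2, 5, 1, 4}; // made-up ratings, for illustration only
        double[] userB = {4, 3, 5, 3};
        System.out.println("similar(UserA, UserB) = " + cosine(userA, userB));
    }
}
```

Because `cosine(a, b)` equals `cosine(b, a)`, only one of each pair of row keys needs to be evaluated.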
+ 
+ === Collect the similarity result of the two users ===
+ 
+ ...
+ 
+ == Pseudo code for TCF with Hama ==
+ 
+ 
  {{{
  import java.math.BigInteger;
  
@@ -26, +70 @@

      }
  
      // 2. Get the pair set of all row key combinations
-     //  So, we don't have to recalculate the same value pair with reversed 
order.
-     //  ex) similar(UserA, UserB) == similar(UserB, UserA)
-     //  In this case, it is going to return {{1, 2}, {1, 3}, {2, 3}}
-     //   by discarding {{2, 1}, {3, 1}, {3, 2}} from the full possible 
combination.
      Combination x = new Combination(data.length, 2);
      
      // 3. |a|·|b|cos(q) calculation
