Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/NewsPersonalizationSystem

------------------------------------------------------------------------------
  == User clustering - MinHash ==
   * Input: User and his clicked stories
-   * ~+S+~,,u,, = {s^u^,,1,, , s^u^,,2,, , ... , s^u^,,m}
+   * ~+S+~,,u,, = {s^u^,,1,, , s^u^,,2,, , ... , s^u^,,m,,}
-   * User similarity = | S,,u1,, ∩ S,,u2,, | / |S,,u1,, ∪ S,,u2,, |
+   * User similarity = | S,,u1,, ∩ S,,u2,, | / | S,,u1,, ∪ S,,u2,, |
   * Output: User clusters.
    * Similar users belong to same cluster

  === MinHash ===
   * Randomly permute the universe of clicked stories
-   * {s^u^,,1,, , s^u^,,2,, , ... , s^u^,,m} = {s^'^,,1,, , s^'^,,2,, , ... , s^'^,,m,,}
+   * {s^u^,,1,, , s^u^,,2,, , ... , s^u^,,m,,} = {s^'^,,1,, , s^'^,,2,, , ... , s^'^,,m,,}
   * MH(u) = min(s^u^,,j,,) min defined by permutation
-   * P{MH(u,,1,,) = MH(u,,2,,)} = | S,,u1,, ∩ S,,u2,, | / |S,,u1,, ∪ S,,u2,, |
+   * P{MH(u,,1,,) = MH(u,,2,,)} = | S,,u1,, ∩ S,,u2,, | / | S,,u1,, ∪ S,,u2,, |
   * Pseudo-random permutation
    * Compute hash for each story and treat hash-value as permutation index
   * Treat !MinHash value as !ClusterId

@@ -82, +82 @@
   * For each story si store the covisitation counts with other stories c(si, sj )
   * Candidate story: sk
   * User history: s1, …, sn
-  * score (si, sj ) = c(si, sj )/∑m c(si, sm )
+  * score (si, sj) = c(si, sj)/∑m c(si, sm)
-  * total_score(sk) = ∑n score(sk, sn )
+  * total_score(sk) = ∑n score(sk, sn)
  ----
  = References =
   * Google News Personalization: Scalable Online Collaborative Filtering
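
The two schemes quoted in the diff above (MinHash user clustering and covisitation scoring) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the wiki page's or the referenced paper's implementation: all function names are hypothetical, SHA-256 stands in for whatever hash the page has in mind, click histories are assumed to be non-empty sets of story-id strings, and covisitation counts are assumed to live in a nested dict `covisits[si][sj] = c(si, sj)`.

```python
import hashlib
from collections import defaultdict


def story_hash(story_id, seed):
    """Hash of a story, used as its index in a pseudo-random permutation.

    Assumption: SHA-256 over "seed:story_id"; each seed gives a different permutation.
    """
    digest = hashlib.sha256(f"{seed}:{story_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def min_hash(clicked, seed=0):
    """MH(u): the clicked story that comes first under the seeded permutation.

    Raises ValueError if `clicked` is empty.
    """
    return min(clicked, key=lambda s: story_hash(s, seed))


def jaccard(a, b):
    """User similarity |Su1 ∩ Su2| / |Su1 ∪ Su2|."""
    return len(a & b) / len(a | b)


def estimate_similarity(a, b, num_seeds=200):
    """Fraction of permutations on which the two MinHash values agree.

    This approximates P{MH(u1) = MH(u2)}, i.e. the Jaccard similarity.
    """
    matches = sum(min_hash(a, s) == min_hash(b, s) for s in range(num_seeds))
    return matches / num_seeds


def cluster_users(clicks, seed=0):
    """Treat each user's MinHash value as the ClusterId."""
    clusters = defaultdict(list)
    for user, stories in clicks.items():
        clusters[min_hash(stories, seed)].append(user)
    return dict(clusters)


def total_score(candidate, history, covisits):
    """total_score(sk) = sum over history sn of c(sk, sn) / sum_m c(sk, sm)."""
    row = covisits.get(candidate, {})
    denom = sum(row.values())
    if denom == 0:
        return 0.0  # candidate has no recorded covisitations
    return sum(row.get(s, 0) for s in history) / denom
```

With a single permutation, two users share a cluster with probability equal to their Jaccard similarity, so one hash gives fairly coarse clusters. Using several seeds, as `estimate_similarity` does, trades extra computation for a tighter estimate.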