Dear Wiki user, You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.
The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/NewsPersonalizationSystem

------------------------------------------------------------------------------
  == Initial Contributors ==
   * [:udanax:Edward Yoon] (R&D center, NHN corp.)

- == Algorithm Overview ==
+ = Algorithm Overview =
   * Obtain a list of candidate stories
   * For each story:
@@ -23, +23 @@
   * May not be clustered
   * Rely on co-visitation to generate recommendations
  ----
- == NPS Architecture ==
+ = NPS Architecture =
  {{{
   +----------------------------+
@@ -49, +49 @@
  +-------------------------------------------------------------------------+
  }}}
  ----
- == Clustering Algorithms ==
+ = Clustering Algorithms =
- === User clustering - MinHash ===
+ == User clustering - MinHash ==
   * Input: User and his clicked stories
   * ~+S+~,,u,, = {s^u^,,1,, , s^u^,,2,, , ... , s^u^,,m,,}
   * User similarity = | S,,u1,, ∩ S,,u2,, | / | S,,u1,, ∪ S,,u2,, |
   * Output: User clusters.
   * Similar users belong to same cluster

- ==== MinHash ====
+ === MinHash ===
   * Randomly permute the universe of clicked stories
   * {s^u^,,1,, , s^u^,,2,, , ... , s^u^,,m,,} = {s^'^,,1,, , s^'^,,2,, , ... , s^'^,,m,,}
@@ -68, +68 @@
   * Treat !MinHash value as !ClusterId
   * Probabilistic clustering

- === Clustering - PLSI Algorithm ===
+ == Clustering - PLSI Algorithm ==
   * Learning (done offline)
   * ML estimation
@@ -77, +77 @@
   * P[zj|u]'s lead to a soft clustering of users
   * Runtime: we only use P[zj|u]'s and ignore P[s|zj]'s

- === Covisitation count ===
+ == Covisitation count ==
   * For each story si store the covisitation counts with other stories c(si, sj)
   * Candidate story: sk
@@ -85, +85 @@
   * score(si, sj) = c(si, sj) / Σm c(si, sm)
   * total_score(sk) = Σn score(sk, sn)
  ----
- == References ==
+ = References =
   * Google News Personalization: Scalable Online Collaborative Filtering
   * Bigtable paper: OSDI 2006
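
The MinHash and covisitation bullets in the diff above can be illustrated with a small Python sketch. It is not taken from the wiki page or from any NPS code. The function names (`jaccard`, `minhash`, `collision_rate`, `covisit_total_score`) and the nested-dict layout of the covisitation counts are hypothetical. It also assumes that the sum in total_score(sk) runs over the stories in the user's click history, which the visible part of the diff does not state.

```python
from itertools import permutations


def jaccard(a, b):
    """User similarity: |S_u1 ∩ S_u2| / |S_u1 ∪ S_u2|."""
    return len(a & b) / len(a | b)


def minhash(clicked, permutation):
    """Return the clicked story that comes first in the permuted order.

    `permutation` is one ordering of the story universe; every story in
    `clicked` must appear in it. The result is used as the cluster id.
    """
    rank = {story: i for i, story in enumerate(permutation)}
    return min(clicked, key=rank.__getitem__)


def collision_rate(a, b):
    """Fraction of all orderings of a ∪ b under which both users get the
    same MinHash value. Exhaustive, so only usable for tiny examples."""
    perms = list(permutations(sorted(a | b)))
    same = sum(minhash(a, p) == minhash(b, p) for p in perms)
    return same / len(perms)


def covisit_total_score(counts, candidate, history):
    """total_score(sk) = sum_n score(sk, sn), with
    score(si, sj) = c(si, sj) / sum_m c(si, sm).

    `counts[si][sj]` holds c(si, sj) (assumed layout); `history` is the
    assumed list of stories sn the user has already clicked.
    """
    row = counts.get(candidate, {})
    total = sum(row.values())
    if total == 0:
        return 0.0
    return sum(row.get(s, 0) / total for s in history)


if __name__ == "__main__":
    u1 = {"a", "b", "c"}
    u2 = {"b", "c", "d"}
    print(jaccard(u1, u2))
    print(collision_rate(u1, u2))

    counts = {"s1": {"s2": 3, "s3": 1}, "s2": {"s1": 3}}
    print(covisit_total_score(counts, "s1", ["s2"]))
```

In this toy case the collision rate over all permutations equals the Jaccard similarity, which is why treating the MinHash value as a cluster id gives a probabilistic clustering: the more similar two users are, the more likely they land in the same cluster. A real deployment would draw a few random permutations (or hash functions) rather than enumerate all of them.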