The simplest way to cluster users is to take the output of PreparePreferenceMatrixJob, which creates a DistributedRowMatrix (DRM) of all user prefs. The rows are users, the columns are items, and the values are preference values. Cluster the rows to get user clusters. Transpose that matrix and cluster its rows to get item clusters--nifty.

On Sep 17, 2013, at 1:41 PM, "Martin, Nick" <[email protected]> wrote:
Hi all,

I'm looking for the best way to get user clusters from my recommendation output. The idea is that I have my recommended items for users (user, item, score) based on their preferences, but I want to see how the users were clustered together (and their similarity) so I can run some other analytics on those clusters.

I found some discussion on this here (http://lucene.472066.n3.nabble.com/Turning-Preference-Files-Into-Vectors-td640035.html), but I'm not sure whether any updates have been made since that thread that would make this a bit easier. If not, is what's discussed in the thread my best approach?

Hope that makes sense...

Thanks,
Nick
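
For anyone who wants to see the row-clustering idea from the reply on a small scale, here is a minimal sketch in plain Python. It is not Mahout code and does not read the PreparePreferenceMatrixJob output. The toy matrix, the helper names (`kmeans`, `transpose`, `squared_distance`), the choice of k-means with squared Euclidean distance, and the first-k-rows initialisation are all assumptions made for illustration; a real job would run one of Mahout's clustering drivers over the DRM.

```python
# Illustrative sketch only: plain Python, not Mahout code.
# Rows = users, columns = items, values = preference values (0 = no preference).

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, iterations=20):
    """Cluster the rows of a dense matrix; returns one cluster index per row."""
    # Assumption for reproducibility: use the first k rows as starting centroids.
    centroids = [list(r) for r in rows[:k]]
    assignments = [0] * len(rows)
    for _ in range(iterations):
        # Assign each row to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        # Recompute each centroid as the mean of its member rows.
        for c in range(k):
            members = [row for row, a in zip(rows, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments

def transpose(matrix):
    """Swap rows and columns, so items become the rows."""
    return [list(col) for col in zip(*matrix)]

# Hypothetical toy data: 4 users x 3 items.
prefs = [
    [5.0, 4.0, 0.0],  # user 0
    [0.0, 1.0, 5.0],  # user 1
    [4.0, 5.0, 0.0],  # user 2
    [0.0, 0.0, 4.0],  # user 3
]

user_clusters = kmeans(prefs, k=2)             # cluster the rows: users
item_clusters = kmeans(transpose(prefs), k=2)  # transpose, then cluster: items

print("user clusters:", user_clusters)
print("item clusters:", item_clusters)
```

On this toy data, users 0 and 2 end up in one cluster and users 1 and 3 in the other, and the same function applied to the transposed matrix groups items 0 and 1 together. The distance measure is the main design choice to revisit on real, sparse preference data.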
