If you just need a similarity metric, you don't need a full recommender; similarity is only one component of one. If you treat each movie as a 'user' and each genre as an 'item', you can use a UserSimilarity implementation to compute the similarity between any two movies. You don't need anything more than that.
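
For illustration only, here is a minimal library-free sketch of that kind of similarity computation. It assumes the Tanimoto coefficient suggested earlier in the thread and the 0/1 genre vectors described in the quoted message. The class and method names (GenreSimilarity, tanimoto) are hypothetical and not part of Mahout; inside Mahout you would plug a similarity implementation such as TanimotoCoefficientSimilarity into a DataModel rather than write this by hand.

```java
import java.util.Locale;

// Hypothetical helper, not a Mahout class: Tanimoto (Jaccard) similarity
// between two movies represented as 0/1 genre vectors.
final class GenreSimilarity {

    // Returns |A and B| / |A or B|, where A and B are the sets of genres
    // flagged in each vector. Two movies with no genres at all are
    // treated as having similarity 0.0 (an assumption of this sketch).
    static double tanimoto(int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Genre vectors must have the same length");
        }
        int both = 0;   // genres present in both movies
        int either = 0; // genres present in at least one movie
        for (int i = 0; i < a.length; i++) {
            boolean inA = a[i] != 0;
            boolean inB = b[i] != 0;
            if (inA && inB) {
                both++;
            }
            if (inA || inB) {
                either++;
            }
        }
        return either == 0 ? 0.0 : (double) both / either;
    }

    public static void main(String[] args) {
        int[] movie1 = {1, 0, 0, 1, 0};
        int[] movie2 = {1, 1, 0, 1, 0};
        // The two movies share 2 genres out of 3 distinct genres overall.
        System.out.println(String.format(Locale.ROOT, "%.3f", tanimoto(movie1, movie2)));
    }
}
```

Because the vectors are binary, only the presence or absence of a genre matters, which is why a set-overlap measure like this fits better than one that expects graded preference values.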

On Thu, May 10, 2012 at 7:29 AM, Daniel Quach <[email protected]> wrote:

> Well, actually, I wanted to represent each movie with a vector
>
> [1, 0, 0, 1, 0]
>
> Where each column represents an explicit genre, a 1 indicating that the
> movie has that genre while a 0 indicates it is not (a crude representation,
> I'm sure)
>
> I wanted to implement an item based recommender that uses these vectors to
> compute similarity between items.
>
> I think I figured it out, I could represent vector data as preferences
> where instead of user ID's, it would be column indices. Then load that into
> a DataModel for use with the ItemSimilarity object. The
> ItemBasedRecommender could load the DataModel with userID's while using
> this ItemSimilarity object for calculating similarities.
>
> This could possibly be a poor choice from an efficiency, accuracy, and
> machine learning standpoint, I am not an expert on the subject at all.
>
> On May 8, 2012, at 12:58 AM, Sean Owen wrote:
>
> > So you have already decided, for each movie, whether it's in or not in each
> > genre? And then you want to create a "profile" -- assuming you mean some
> > kind of meta-genre?
> >
> > This isn't a recommender problem; it's just a clustering problem. I'd use
> > the Tanimoto similarity.
> > You could run the clustering-based recommender just to build the clusters.
> > You wouldn't use it for recommendations.
> >
> > On Tue, May 8, 2012 at 8:53 AM, Daniel Quach <[email protected]> wrote:
> >
> >> Suppose that I want to give each movie a profile based on the genres each
> >> contains.
> >>
> >> For naive and simplistic purposes, let's pretend that each movie has a
> >> vector where each column is a genre, a 1 in that column indicates that the
> >> movie contains that genre, 0 otherwise.
> >>
> >> How would I feed such data into an Item-based Recommender? I want this
> >> recommender to use these vectors for calculating similarity for
> >> recommendations, which in turn is used for preference estimation (just as
> >> described in section 4.4.1 of the Mahout in Action book)
> >>
> >> The example in the book is not immediately clear to me. The sample code
> >> does not mention the format of the data being used in creating the
> >> ItemSimilarity object.
