Hey Sean,

Thank you for the detailed reply. Interesting points. I think I have addressed some of them in my subsequent emails.

You bring up the case where all the users hate the same item. What about the case where very few (a single?) similar users love a place? In that case, is this really a better recommendation than the popular vote? Where is the middle ground? I think it's an interesting point.

I'll see how the SVD performs.

On Feb 18, 2011, at 11:20 PM, Sean Owen wrote:

> User-user similarity is based on these counts? That sounds a bit like
> the Tanimoto / Jaccard coefficient. See TanimotoCoefficientSimilarity.
> Yes you can use that though log-likelihood is probably a more
> sophisticated choice.
>
> Recommending an item that occurs most in the neighborhood? Sure you
> can make it work that way. It probably works "OK" in practice though
> you can see possible problems with it. What if everyone in the
> neighborhood hates an item? This would recommend it highly. It's also
> throwing away the degree of similarity to the user who likes an item.
>
> The conventional wisdom in recommenders is that you want to fight the
> tendency to always recommend well-known items. People probably already
> know about the well-known items even if they've not rated them yet. It
> also makes the recommendations less personalized in a sense -- the
> recommendation result approaches the one you'd get by just
> recommending the globally most-preferred items.
>
> If your goal is to fight sparseness, start looking at SVD-based
> methods. This is really the point of SVDs, to "summarize" a very
> high-dimensional user-item matrix in a much lower-dimensional "user
> group" - "item group" matrix. Maybe you don't have enough information
> to recommend Bauhaus to Joan, a teenage goth, but, the SVD lets you
> sort of draw conclusions like "gothy teens like Peter Murphy's
> albums". That is, the summary is much less sparse and so works better
> for recommendation for users/items with little connection to the rest
> of the matrix otherwise.
>
>
> On Sat, Feb 19, 2011 at 2:43 AM, Chris Schilling <[email protected]> wrote:
>> Hello again,
>>
>> Very simple question here: I am also testing the user-user cf in mahout.
>> So, once I define my user neighborhood, is it possible to select the
>> recommendations from that based on the number of preferences per item rather
>> than a weighted average? Basically, I'd like to recommend the items with
>> the most preferences. It would be simple to implement, so I was curious if
>> this was already possible. I understand that in this case, the counts
>> become dependent on the size of the neighborhood. This is something I'd want
>> to use for testing.
>>
>> Thanks
>> Chris
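
To make the two scoring schemes in this thread concrete, here is a minimal sketch in plain Python. It is not Mahout code: the function names (tanimoto, neighborhood, recommend) and the toy data are made up for illustration, and preferences are assumed to be boolean (a user either has an item or not). It contrasts counting preferences in the neighborhood with weighting each preference by the neighbor's Tanimoto similarity.

```python
from collections import defaultdict


def tanimoto(a, b):
    """Tanimoto / Jaccard coefficient between two sets of item IDs."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def neighborhood(prefs, user, n):
    """Return the n users most similar to `user` as (similarity, neighbor) pairs."""
    sims = [(tanimoto(prefs[user], items), other)
            for other, items in prefs.items() if other != user]
    # Highest similarity first; ties broken by name so the order is deterministic.
    sims.sort(key=lambda pair: (-pair[0], pair[1]))
    return sims[:n]


def recommend(prefs, user, n, weighted):
    """Rank items the user lacks, scored over the user's n nearest neighbors.

    weighted=False: score is the number of neighbors who have the item.
    weighted=True:  score is the sum of those neighbors' similarities.
    """
    scores = defaultdict(float)
    for sim, other in neighborhood(prefs, user, n):
        for item in prefs[other] - prefs[user]:
            scores[item] += sim if weighted else 1.0
    return sorted(scores, key=lambda item: (-scores[item], item))


# Toy data (hypothetical): u1 is very similar to chris, u2 and u3 only loosely.
prefs = {
    "chris": {"a", "b", "c"},
    "u1": {"a", "b", "c", "x"},
    "u2": {"a", "y"},
    "u3": {"b", "y"},
}

by_count = recommend(prefs, "chris", n=3, weighted=False)
by_similarity = recommend(prefs, "chris", n=3, weighted=True)
print(by_count)       # ['y', 'x']
print(by_similarity)  # ['x', 'y']
```

With this data the count-based ranking puts "y" first because two loosely similar neighbors have it, while the similarity-weighted ranking puts "x" first because the single closest neighbor has it. That is the "single similar user versus popular vote" trade-off raised above; which ordering is better is exactly the open question, and the sketch does not settle it.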
