A 1.0 similarity doesn't usually mean that the users have rated exactly the same items. It means that, over the items they both rated, there is a perfect linear relationship between their ratings. So in general, no, I wouldn't discount such pairs of users.
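To make that concrete, here is a minimal sketch in plain Python. It is not Taste's actual code: the function name, the user dictionaries and the ratings are all made up for illustration, and it assumes a Pearson-style correlation computed only over co-rated items. It shows two users with a 1.0 similarity where one of them still has an item the other has not seen.

```python
from math import sqrt

def pearson_on_co_rated(a, b):
    """Pearson correlation over the items both users rated.

    a, b: dicts mapping item id -> rating (hypothetical data layout).
    Returns None when the correlation is undefined.
    """
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return None  # not enough overlap to correlate
    xs = [a[i] for i in common]
    ys = [b[i] for i in common]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant rating vector has no defined correlation
    return cov / sqrt(var_x * var_y)

# Made-up ratings: B rates every shared item exactly 2 points higher than A,
# and B has also rated item4, which A has never seen.
user_a = {"item1": 1.0, "item2": 2.0, "item3": 3.0}
user_b = {"item1": 3.0, "item2": 4.0, "item3": 5.0, "item4": 4.5}

similarity = pearson_on_co_rated(user_a, user_b)
# Items the neighbour could still contribute as recommendations for A.
candidates = sorted(set(user_b) - set(user_a))
print(similarity, candidates)
```

So a perfectly correlated neighbour can still bring in new items, which is why I wouldn't filter such users out in the rated-preference case.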
It is pretty rare, of course. I think it is less rare in your case, since you are simply using a 'boolean' seen/not-seen value for each item, plus something like the Tanimoto similarity metric. In that case, a 1.0 similarity actually does mean they have seen the same items. If that is an issue for you, we could work out some reasonable way to inject this observation into the framework.

I think Otis is already doing a form of 'dimension reduction' here, by removing items from consideration that don't add much information. That's kind of the same thing in this simplified scenario, where we don't really have rating vectors, just seen/not-seen values. Indeed, it speeds things up a lot. I am under the impression Otis needs real-time recommendations in his case.

On Mon, Jun 1, 2009 at 10:06 PM, Otis Gospodnetic <[email protected]> wrote:
>
> Hello,
>
> I was stepping through Taste and noticed that users with 1.0 similarity to
> the target user end up in that user's neighbourhood. 1.0 similarity between
> users means users are exactly the same, so is there a point in collecting
> them? Since they are exactly the same as the target user, we can't really
> get any new items to recommend from them. Is this correct?
>
> It's probably not a frequent case to have users with identical item
> preferences, but imagine a case where you are computing recommendations from
> top 10 most similar users and those 10 most similar users happen to be all
> perfectly similar users, thus yielding no recommendations.
>
> Thoughts?
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
