Re: A question regarding GenericUserBasedRecommender

2010-08-13 Thread Yanir Seroussi
I agree with both of you (Ted and Sean) regarding the generation of
real-world recommendations, and I also think that it makes sense to use a
static neighbourhood when generating recommendations, because it will be too
slow to estimate the target user's rating of all the items in the system.

I still think that it's worthwhile to compare rating errors of different
algorithms on all items, because in general it is likely that the more
accurate algorithm will generate better recommendations. Comparing the
actual generated recommendations, while very important, is not as simple as
comparing rating errors. As Ted said, you need to worry about
the ordering of the top few items, and you typically don't want all the
recommended items to be too similar to each other or too obvious. However,
capturing all these qualities in a single metric is quite hard, if not
impossible.
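
For concreteness, this is roughly the kind of rating-error comparison I have in
mind (a minimal sketch against Mahout's Taste evaluator API; the ratings file,
the neighbourhood size of 20, and the 90/10 train/test split are placeholders):

import java.io.File;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class RatingErrorComparison {
  public static void main(String[] args) throws Exception {
    DataModel model = new FileDataModel(new File("ratings.csv"));  // placeholder file
    RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
    RecommenderBuilder userBased = new RecommenderBuilder() {
      public Recommender buildRecommender(DataModel model) throws TasteException {
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        UserNeighborhood neighborhood = new NearestNUserNeighborhood(20, similarity, model);
        return new GenericUserBasedRecommender(model, neighborhood, similarity);
      }
    };
    // Train on 90% of each user's ratings, test on the held-out 10%; lower is better.
    double mae = evaluator.evaluate(userBased, null, model, 0.9, 1.0);
    System.out.println("user-based MAE: " + mae);
    // Repeat with other RecommenderBuilders (item-based, slope-one, ...) to compare.
  }
}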

In any case, I think that the current behaviour should be documented, either
in the javadocs or in the wiki or both. I'm happy to update the
documentation if it's okay with you.

On Fri, Aug 13, 2010 at 06:09, Ted Dunning <ted.dunn...@gmail.com> wrote:

 Focussing on rating error is also problematic in that it causes us to worry
 about being correct about the estimated ratings for items that will *never*
 be shown to a user.

 In my mind, the only thing that matters in a practical system is the
 ordering of the top few items and the rough composition of the top few
 hundred items.  Everything else makes no difference at all for real-world
 recommendations.

 On Thu, Aug 12, 2010 at 12:53 PM, Sean Owen <sro...@gmail.com> wrote:

  I agree with your reading of what the Herlocker paper is saying. The
  paper is focused on producing one estimated rating, not
  recommendations. While those tasks are related -- recommendations are
  those with the highest estimated ratings -- translating what's in
  Herlocker directly to a recommendation algorithm is a significant
  jump.
 



Re: A question regarding GenericUserBasedRecommender

2010-08-12 Thread Ted Dunning
Focussing on rating error is also problematic in that it causes us to worry
about being correct about the estimated ratings for items that will *never*
be shown to a user.

In my mind, the only thing that matters in a practical system is the
ordering of the top few items and the rough composition of the top few
hundred items.  Everything else makes no difference at all for real-world
recommendations.
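
(A rough sketch of measuring exactly that, i.e. the quality of the top of the
list rather than rating error, assuming Mahout's IR-style precision/recall
evaluator GenericRecommenderIRStatsEvaluator; the ratings file, the
neighbourhood size of 20, and the cut-off of 10 are placeholders:)

import java.io.File;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.IRStatistics;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.eval.GenericRecommenderIRStatsEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class TopNEvaluation {
  public static void main(String[] args) throws Exception {
    DataModel model = new FileDataModel(new File("ratings.csv"));  // placeholder file
    RecommenderIRStatsEvaluator evaluator = new GenericRecommenderIRStatsEvaluator();
    RecommenderBuilder builder = new RecommenderBuilder() {
      public Recommender buildRecommender(DataModel model) throws TasteException {
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        return new GenericUserBasedRecommender(model,
            new NearestNUserNeighborhood(20, similarity, model), similarity);
      }
    };
    // Hold out each user's top-rated items and see how many reappear in the top 10.
    IRStatistics stats = evaluator.evaluate(builder, null, model, null, 10,
        GenericRecommenderIRStatsEvaluator.CHOOSE_THRESHOLD, 1.0);
    System.out.println("precision@10 = " + stats.getPrecision()
        + ", recall@10 = " + stats.getRecall());
  }
}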

On Thu, Aug 12, 2010 at 12:53 PM, Sean Owen <sro...@gmail.com> wrote:

 I agree with your reading of what the Herlocker paper is saying. The
 paper is focused on producing one estimated rating, not
 recommendations. While those tasks are related -- recommendations are
 those with the highest estimated ratings -- translating what's in
 Herlocker directly to a recommendation algorithm is a significant
 jump.



Re: A question regarding GenericUserBasedRecommender

2010-08-10 Thread Yanir Seroussi
On Tue, Aug 10, 2010 at 15:54, Sean Owen <sro...@gmail.com> wrote:

 On Mon, Aug 9, 2010 at 6:46 PM, Yanir Seroussi <yanir.serou...@gmail.com> wrote:
  As I see it, there are two separate tasks here. The first one is the
  recommendation task, where it makes sense to take the N most similar users
  and generate recommendations based on their preferences. The second one is
  the rating prediction task (or preference estimation in Mahout terms), where
  given a target user and a target item the recommender generates a prediction
  of the rating for the item. In the way

 They are related tasks. Estimating preferences is *a* way to achieve
 the further goal of recommending items. It is how just about all
 algorithms work, so I'd say that in practice estimating preference is
 a sub-task of recommendation, and really the core of it.


Agreed.


  GenericUserBasedRecommender.estimatePreference() is implemented now, it
  can only generate predictions for very few items unless the neighbourhood
  size is very large.

 I don't think this is necessarily true. Sparse data can mean users are
 hard to connect to other users, or that those connections yield few
 candidates for recommendations, sure. A large neighborhood doesn't
 help this though. If there isn't 1 nearby user, there aren't 100
 either.

 But, crudely, I think your data has to be really really sparse
 before you frequently can't recommend anything at all.

 I don't believe it's a question of implementation. This is simply what
 user-based recommendation is.


As it is now, GenericUserBasedRecommender.estimatePreference() can only
estimate preferences of items that were rated by users in the static
neighbourhood. Thus, a larger neighbourhood is likely to contain more
neighbours that rated unknown items. For example, if we take only one
neighbour, we can only estimate preferences of items that this neighbour
rated.
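
To make the current behaviour easy to reproduce, here is a minimal sketch along
the lines of the wiki example (the ratings file and the user/item IDs are
placeholders; with n = 1, only items rated by that single neighbour get an
estimate, and everything else comes back as NaN):

import java.io.File;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class CoverageDemo {
  public static void main(String[] args) throws Exception {
    DataModel model = new FileDataModel(new File("ratings.csv"));  // placeholder file
    UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
    // The neighbourhood is computed once per user, independently of the target item.
    UserNeighborhood neighborhood = new NearestNUserNeighborhood(1, similarity, model);
    Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);
    // Only items rated by the single nearest neighbour can be estimated here.
    float estimate = recommender.estimatePreference(1L, 123L);  // placeholder user/item IDs
    System.out.println(Float.isNaN(estimate) ? "no estimate (NaN)" : "estimate: " + estimate);
  }
}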




  To make this concrete, I'll give an example. Suppose we choose N=20, and we
  want to predict the rating of the target user for the movie The Godfather.
  This movie was rated by many users in the dataset, but unfortunately it
  wasn't rated by the 20 users that are most similar to the target user. Thus,
  estimatePreference() will return NaN as the prediction.

 That's right. There would be no basis for estimating a preference.


  Now suppose that these top 20 users have a similarity of 1 to the target
  user, and that the next 20 users have a similarity of 0.99 to the target
  user, and they have all rated The Godfather. It would make sense to generate
  a rating prediction based on the next 20 users, and this prediction is
  actually likely to be pretty good.

 True, and if this were the case, I'd suggest that you want to define
 the neighborhood by a similarity threshold, not by size. Let everyone
 with similarity >= 0.9 or something be in the neighborhood.
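
(For reference, a threshold-based neighbourhood would look something like the
sketch below, assuming Mahout's ThresholdUserNeighborhood; the 0.9 cut-off, the
ratings file, and the user ID are only placeholders.)

import java.io.File;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.ThresholdUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class ThresholdNeighbourhoodDemo {
  public static void main(String[] args) throws Exception {
    DataModel model = new FileDataModel(new File("ratings.csv"));  // placeholder file
    UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
    // Everyone with similarity >= 0.9 is a neighbour, rather than a fixed top-N.
    UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.9, similarity, model);
    GenericUserBasedRecommender recommender =
        new GenericUserBasedRecommender(model, neighborhood, similarity);
    System.out.println(recommender.recommend(1L, 10));  // placeholder user ID, top 10
  }
}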


  As I mentioned in my original message, it is implied from Herlocker et al.
  (1999) that the number of neighbours shouldn't affect coverage. Another
  widely cited paper, by Adomavicius and Tuzhilin (2005, see
  http://ids.csom.umn.edu/faculty/gedas/papers/recommender-systems-survey-2005.pdf),
  actually states that explicitly: "That is, the value of the unknown rating
  r(c, s) for user c and item s is usually computed as an aggregate of the
  ratings of some other (usually the N most similar) users for the same item
  s: r(c, s) = aggr(c' in C^, r(c', s)) where C^ denotes the set of N users
  that are the most similar to user c and who have rated item s (N can range
  anywhere from 1 to the number of all users)."
 
  While Sean's approach is probably also okay, especially for the
  recommendation scenario, I think it's worthwhile documenting the fact that
  this behaviour is somewhat inconsistent with the literature.

 As (I think) you say, it's consistent with Herlocker -- this is just
 plain-vanilla user-based recommendation.

 The other approach you cite makes sense but gets a lot more expensive
 to compute. You need a new neighborhood for every candidate item. It
 is also, in my world view, not the plain-vanilla approach that has come to
 be called user-based recommendation. So, no, I do not find this
 inconsistent.


No, I'm not saying that it's consistent with Herlocker et al. They say in
section 6 that "The second technique is to pick the best n correlates for a
given n. This technique performs reasonably well, as it does not limit
prediction coverage." The fact that your coverage remains the same no matter
what n you choose means that the n users are dependent on the target item.
Also, in section 2 of the paper, they say that the second step in generating
a prediction is to "select a subset of users to use as a set of predictors
(possibly for a specific item)". The other neighbourhood selection technique
they experimented with was setting a similarity threshold, which did affect
coverage, as it is not item-specific. Putting all of this together strongly
suggests 

Re: A question regarding GenericUserBasedRecommender

2010-08-09 Thread Yanir Seroussi
I'll try to explain the problem again, but note that at this stage I'm
trying to establish a user-based collaborative recommendation baseline,
rather than getting the best performance I can get for a given dataset, so
while the suggestions are appreciated I'd like to focus on
GenericUserBasedRecommender with NearestNUserNeighborhood.

As I see it, there are two separate tasks here. The first one is the
recommendation task, where it makes sense to take the N most similar users
and generate recommendations based on their preferences. The second one is
the rating prediction task (or preference estimation in Mahout terms), where
given a target user and a target item the recommender generates a prediction
of the rating for the item. In the way
GenericUserBasedRecommender.estimatePreference() is implemented now, it
can only generate predictions for very few items unless the neighbourhood
size is very large.

To make this concrete, I'll give an example. Suppose we choose N=20, and we
want to predict the rating of the target user for the movie The Godfather.
This movie was rated by many users in the dataset, but unfortunately it
wasn't rated by the 20 users that are most similar to the target user. Thus,
estimatePreference() will return NaN as the prediction.
Now suppose that these top 20 users have a similarity of 1 to the target
user, and that the next 20 users have a similarity of 0.99 to the target
user, and they have all rated The Godfather. It would make sense to generate
a rating prediction based on the next 20 users, and this prediction is
actually likely to be pretty good.

As I mentioned in my original message, it is implied from Herlocker et al.
(1999) that the number of neighbours shouldn't affect coverage. Another
widely cited paper, by Adomavicius and Tuzhilin (2005, see
http://ids.csom.umn.edu/faculty/gedas/papers/recommender-systems-survey-2005.pdf),
actually states that explicitly: "That is, the value of the unknown rating
r(c, s) for user c and item s is usually computed as an aggregate of the
ratings of some other (usually the N most similar) users for the same item
s: r(c, s) = aggr(c' in C^, r(c', s)) where C^ denotes the set of N users
that are the most similar to user c and who have rated item s (N can range
anywhere from 1 to the number of all users)."
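
To illustrate the difference in code, here is a rough sketch of that item-aware
aggregation against Mahout's Taste interfaces. This is not how
GenericUserBasedRecommender works today; the helper name and the
similarity-weighted average (as the aggr function) are just one possible choice.

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.model.PreferenceArray;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public final class ItemAwareEstimate {

  // Hypothetical helper, not part of Mahout: estimate r(c, s) from the N users most
  // similar to c *among those who rated s*, using a similarity-weighted average as aggr.
  public static float estimate(DataModel model, UserSimilarity sim,
                               long userID, long itemID, int n) throws TasteException {
    PreferenceArray prefs = model.getPreferencesForItem(itemID);  // users who rated s
    List<double[]> scored = new ArrayList<double[]>();            // {similarity, rating}
    for (int i = 0; i < prefs.length(); i++) {
      long otherUserID = prefs.getUserID(i);
      if (otherUserID == userID) {
        continue;
      }
      double s = sim.userSimilarity(userID, otherUserID);
      if (!Double.isNaN(s)) {
        scored.add(new double[] { s, prefs.getValue(i) });
      }
    }
    if (scored.isEmpty()) {
      return Float.NaN;  // nobody comparable rated the item at all
    }
    Collections.sort(scored, new Comparator<double[]>() {
      public int compare(double[] a, double[] b) {
        return Double.compare(b[0], a[0]);  // most similar first
      }
    });
    double num = 0.0;
    double den = 0.0;
    for (double[] p : scored.subList(0, Math.min(n, scored.size()))) {
      num += p[0] * p[1];
      den += Math.abs(p[0]);
    }
    return den == 0.0 ? Float.NaN : (float) (num / den);
  }
}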

While Sean's approach is probably also okay, especially for the
recommendation scenario, I think it's worthwhile documenting the fact that
this behaviour is somewhat inconsistent with the literature.

On Tue, Aug 10, 2010 at 00:32, Sean Owen <sro...@gmail.com> wrote:

 This is expected behavior as far as I understand the algorithm. I
 don't see how a user-based recommender can estimate a preference by X
 for Y if nobody who rated Y is connected to X at all.

 You can use a PreferenceInferrer to fill in a lot of missing data, but
 I don't really recommend this for more than experimentation.
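
(For completeness, a minimal sketch of that idea, assuming Mahout's
AveragingPreferenceInferrer and that its constructor takes the DataModel; the
ratings file is a placeholder:)

import java.io.File;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.similarity.AveragingPreferenceInferrer;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class InferrerDemo {
  public static void main(String[] args) throws Exception {
    DataModel model = new FileDataModel(new File("ratings.csv"));  // placeholder file
    UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
    // Fill in a guessed rating (the user's average) for items a user hasn't rated,
    // so that more user pairs overlap; handy for experiments, costly at scale.
    similarity.setPreferenceInferrer(new AveragingPreferenceInferrer(model));
  }
}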

 The issue here is mostly that the user-item matrix is too sparse. And
 yes, there are loads of follow-up suggestions that tackle that,
 depending on your data, as Alex hinted at.

 On Mon, Aug 9, 2010 at 3:31 AM, Yanir Seroussi <yanir.serou...@gmail.com> wrote:
  Hi,
 
  The first example here
  (https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation)
  shows how to create a GenericUserBasedRecommender with a
  NearestNUserNeighborhood. My problem/question is that setting n to any small
  number seems to limit the coverage of the recommender, because the nearest n
  users are calculated without taking the target item into account.
  For example, given a user X and n = 10, if we want to estimatePreference()
  for an item Y that is not rated by any user in the neighbourhood, the
  prediction will be NaN. I don't think that this is what one would expect
  from a user-based nearest-neighbour recommender, as Herlocker et al. (1999),
  who are cited in the example page above, didn't mention any change in
  coverage based on the number of nearest neighbours.
  Am I doing something wrong, or is this the way it should be? I have a
  feeling it is not the way it should be, because then using small
  neighbourhood sizes makes no sense as it severely restricts the ability of
  the recommender to estimate preferences.

  Please note that I observed this behaviour in version 0.3, but it seems to
  be the same in the latest version.
 
  Cheers,
  Yanir