On Tue, May 24, 2011 at 11:31 AM, Uwe Reimann <[email protected]> wrote:
> Probably depends on how many data points were available before. I suspect
> e.g. the 5th data point having a greater impact than the 105th. Is there a
> lower limit (above 1) on the number of data points a user must have before
> recommendations make sense?
Right. There's not one answer to that. A few data points can be meaningful
enough to make recommendations from, though more is generally better.

> I did some testing of the different recommenders on a real data set from a
> bookmarking site. GenericBooleanPrefItemBasedRecommender did not work very
> well for me. It seemed to recommend the top links. Using
> GenericUserBasedRecommender worked way better (after some tweaking), which
> recommended links that actually fit my interests. Might need to do some
> more testing here.

Were you using "compatible" similarity implementations? Pearson is
meaningless on boolean data, and you would get poor results with it.
Or -- there is GenericItemBasedRecommender, which does use ratings, and
Pearson is fine with that implementation.

> (1) would include categories that should not be recommended; that's why
> (2) is being used to pick the recommendations from. (2) would contain the
> liked items of every user, which includes items that are disliked by other
> users. (3) is for filtering out items that the user has not rated, but has
> been presented before.

I see. Yes, it's entirely possible to compute user-user or item-item
similarity on one model, and then apply those similarities to a recommender
based on another model. (3) doesn't need a DataModel per se, but it does
need access to some list of previously-seen items in some form. Up to you.
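To make the "compatible similarity" point concrete: with boolean prefs every
"rating" is 1, so each user's rating vector has zero variance over co-rated
items and the Pearson denominator is zero -- the correlation is undefined.
A set-overlap measure like the Tanimoto coefficient (what Mahout's
TanimotoCoefficientSimilarity computes) is the right fit. This is just a
stdlib sketch of the math, not the Mahout API itself:

```java
import java.util.HashSet;
import java.util.Set;

public class TanimotoSketch {

    // Tanimoto coefficient: |A ∩ B| / |A ∪ B| over two users' item-ID sets.
    // Well-defined on boolean data, unlike Pearson, which divides by the
    // (zero) standard deviation of all-ones rating vectors.
    static double tanimoto(Set<Long> a, Set<Long> b) {
        Set<Long> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<Long> union = new HashSet<>(a);
        union.addAll(b);
        return union.isEmpty() ? 0.0
                : (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // Two users' bookmarked item IDs (presence = "liked")
        Set<Long> user1 = Set.of(1L, 2L, 3L, 4L);
        Set<Long> user2 = Set.of(3L, 4L, 5L, 6L);
        // 2 shared items out of 6 distinct items
        System.out.println(tanimoto(user1, user2));
    }
}
```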
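On (3) not needing a DataModel: all the filter needs is a membership test
against whatever record of previously-seen items you keep. In Mahout this is
the role an IDRescorer's isFiltered(id) plays when passed to recommend();
the sketch below reduces the idea to a plain set lookup over a hypothetical
candidate list (names here are illustrative, not Mahout's):

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class SeenFilterSketch {

    // Drop every candidate item the user has already been shown.
    // Only a membership test on "seen" is required -- no DataModel.
    static List<Long> filterSeen(List<Long> candidates, Set<Long> seen) {
        return candidates.stream()
                .filter(id -> !seen.contains(id))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Long> recommended = List.of(10L, 11L, 12L, 13L);
        Set<Long> alreadyShown = Set.of(11L, 13L);
        System.out.println(filterSeen(recommended, alreadyShown));
    }
}
```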
