I think it would be surprising for a recommender to return data it already
knows; the implicit contract is to return only predictions. That's how
real-world recommender systems appear to behave to the end user: Amazon
doesn't show you books you have already read, even though they may be some
of your all-time favorites.

That's how it's built now anyway, so I would prefer not to change it. You
can combine the results with data you already have, if that's what you
want, more easily than you can strip the data you already have out of the
results, if that's not what you want. You also run the risk of the top
items all being existing data points, in which case the recommender
provides no useful extra information.

You can create a RecommendedItem for each existing data point, mix these
with the recommendations, and sort.
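
Roughly, the mix-and-sort step could look like the sketch below. It is plain
Java with no Mahout dependency, so it runs on its own: ScoredItem is a
hypothetical stand-in for RecommendedItem, and the merge() method, its
parameters, and the tie-breaking rule are assumptions made for illustration,
not part of the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class MergeKnownAndPredicted {

    // Hypothetical stand-in for RecommendedItem: an item ID plus a value.
    static final class ScoredItem {
        final long itemId;
        final float value;

        ScoredItem(long itemId, float value) {
            this.itemId = itemId;
            this.value = value;
        }
    }

    // Wraps each known preference as a ScoredItem, mixes it with the
    // predicted items, sorts by value (highest first), and keeps the top N.
    static List<ScoredItem> merge(Map<Long, Float> knownPrefs,
                                  List<ScoredItem> predicted,
                                  int howMany) {
        List<ScoredItem> all = new ArrayList<>(predicted);
        for (Map.Entry<Long, Float> e : knownPrefs.entrySet()) {
            all.add(new ScoredItem(e.getKey(), e.getValue()));
        }
        // Descending by value; ties broken by ascending item ID (an
        // arbitrary choice, only to make the order deterministic).
        all.sort((a, b) -> {
            int byValue = Float.compare(b.value, a.value);
            return byValue != 0 ? byValue : Long.compare(a.itemId, b.itemId);
        });
        return new ArrayList<>(all.subList(0, Math.min(howMany, all.size())));
    }
}
```

In a real setup the known preferences would come from your data model and
the predicted list from .recommend(); the sort itself is a few lines, so no
custom CandidateItemsStrategy should be needed for this.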

If you don't have rating values, then you can't use a recommender built on
predicting ratings, since the ratings will all be 1, and your result is, as
you say, random. The answer is: don't do that! Either you don't use
ratings, and use the boolean versions, or you do use ratings (like your
decaying click value), and then you can use either.
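
For what it's worth, a decaying click value might be computed along these
lines. This is only a sketch: the exponential form, the half-life parameter,
and the names ClickDecay and decayedValue are assumptions for illustration,
since the thread does not specify how the decay is defined.

```java
final class ClickDecay {

    private static final double MILLIS_PER_DAY = 86_400_000.0;

    // Returns a pseudo-rating in (0, 1]: 1.0 for a click made right now,
    // halving every halfLifeDays. Clicks timestamped in the future are
    // treated as having age zero.
    static float decayedValue(long clickMillis, long nowMillis, double halfLifeDays) {
        double ageDays = Math.max(0L, nowMillis - clickMillis) / MILLIS_PER_DAY;
        return (float) Math.pow(0.5, ageDays / halfLifeDays);
    }
}
```

With a 30-day half-life, a click from 30 days ago gets half the weight of a
click made today, so the values are no longer all identical and a
ratings-based recommender has something to work with.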

On Fri, Jan 27, 2012 at 10:09 AM, Anatoliy Kats <[email protected]> wrote:

> So you're proposing that we separate the actions of estimating preferences
> for unknown items, and recommending items to users to click: the latter
> could include some items for which a preference has been expressed.  It's a
> good idea to think that way, thanks for the tip.  I would argue, though,
> that .recommend() is aimed at the latter task:  it predicts preferences,
> and sorts them, and returns the top N items.  It is a final step in a
> process that includes unknown preference estimation as an intermediate
> step.  This is built into Mahout as I see it, by separating .recommend()
> and .estimatePreference().  That's why I still think the most elegant
> solution is simply adding known preference values to the predicted ones to
> the set of possible recommendations.  AFAIK this is most easily
> accomplished by playing around with CandidateItemStrategies.  How would you
> go about it without having to write your own sorting function?
>
> About boolean recommenders:  Many of my users made no purchases, only
> clicks.  So, if I use a generic recommender, it will make random
> recommendations because my training data is essentially boolean.  Has
> anyone else run into this problem?  One solution I am about to try is
> letting the rating value of a click decay with time since the click was
> made.  I am not sure if the ratings will be different enough for
> GenericRecommender to work, and I am also not sure I am justified in
> reducing the item similarity between two items because two users clicked on
> them at different times.  Has anyone tried a solution based on a
> regularized normalization of some sort?
>
> Thanks.
>
>
