It stands to reason that if you click on an Amazon book description, you should be offered the chance to buy that book next time. Amazon certainly does this.

I see Sean's point: in its pure form, a recommender should only recommend unknown items. Certainly that's the behavior you need in a theoretical test framework. I think where we differ is that some people here build systems where ratings are computed from user behavior, and are therefore decoupled from the set of candidate items. I understand this is not Mahout's original purpose, and that it's difficult to build this support into Mahout in a principled way. But it would be helpful to some of us if Mahout had that capability.

I accomplished it by wrapping my own recommender class around Mahout's delegate, overriding estimatePreference(), and using a CandidateSimilarity that allows previously rated items. Perhaps a more acceptable solution is to add a post-recommendation processing step that combines the recommendation result with a set of userChoosableRecommendedItems and returns the top N items from that combined list. This design intrudes a lot less into Mahout's internals.
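
To make the post-processing idea concrete, here is a minimal sketch in plain Java. It deliberately uses no Mahout types: the class name PostRecommendationMerger, the method mergeTopN, and the representation of both inputs as item-ID-to-score maps are all assumptions made for illustration. It also assumes the two score sets are on a comparable scale and that an item appearing in both keeps its higher score, which may not suit every application.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Mahout.
public class PostRecommendationMerger {

    /**
     * Combines recommender output with a set of user-choosable items
     * (for example, items the user already clicked on) and returns the
     * IDs of the top n items by score.
     */
    public static List<Long> mergeTopN(Map<Long, Double> recommended,
                                       Map<Long, Double> userChoosable,
                                       int n) {
        // Start from the recommender's results, then fold in the extra items.
        // Assumption: an item present in both maps keeps its higher score.
        Map<Long, Double> combined = new HashMap<>(recommended);
        for (Map.Entry<Long, Double> e : userChoosable.entrySet()) {
            combined.merge(e.getKey(), e.getValue(), Double::max);
        }

        // Rank by score descending; break ties by item ID for a stable order.
        List<Map.Entry<Long, Double>> ranked = new ArrayList<>(combined.entrySet());
        ranked.sort((a, b) -> {
            int byScore = Double.compare(b.getValue(), a.getValue());
            return byScore != 0 ? byScore : Long.compare(a.getKey(), b.getKey());
        });

        // Keep at most n items; a negative n yields an empty list.
        int limit = Math.min(Math.max(n, 0), ranked.size());
        List<Long> topN = new ArrayList<>(limit);
        for (int i = 0; i < limit; i++) {
            topN.add(ranked.get(i).getKey());
        }
        return topN;
    }
}
```

Because the merge happens entirely after the recommender has returned, the recommender itself still excludes previously rated items and its internals are left alone.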

Would anyone else benefit from this addition?

On 01/29/2012 12:33 AM, Ted Dunning wrote:
Also, Lee, I think you have it backwards.  It is true that clicks are not
the same thing as preferences, but I don't think that I give a fig about
preferences since they are the internal mental state of the visitor.  I
know that some of the visitors are mental, but I don't really care since
that is their own business.  What I care about is encouraging certain
behaviors.

So I find that mental state estimations are the indirect way to model and
predict behaviors while directly modeling behaviors based on observed
behaviors is, well, more direct.

This is compounded by the fact that asking people to rate things invokes a
really complicated social interaction which does not directly sample the
internal mental state (i.e. the real preference) but instead measures
behavior that is a very complicated outcome of social pressures,
expectations and the internal mental state.  So using ratings boils down to
using one kind of behavior to estimate mental state that then is
hypothesized to result in the desired second kind of behavior.



On Sat, Jan 28, 2012 at 10:51 AM, Sean Owen <[email protected]> wrote:

It means *something* that a user clicked on one item and not 10,000 others.
You will learn things like that Star Wars and Star Trek are somehow related
from this data. I don't think that clicks are a bad input per se.

I agree that it's not obvious how to meaningfully translate user actions
into a linear scale. "1" per click and "10" for purchase or something is a
guess. I do think you will learn something from the data this way.

There is nothing conceptually wrong with mixing real data and estimated
data. If the results aren't looking right, it is not a problem with the
concept, but with the mapping of actions onto some rating scale. I think
it's hard to get that right, but it is not impossible to get it "good".
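
As a rough illustration of such a mapping, the sketch below accumulates a score per user-item pair from observed actions. The weights (1 per click, 10 per purchase) are the guess mentioned above. The class name ImplicitRatings, its methods, and the choice to sum weights over repeated actions are assumptions made for this example, not anything Mahout provides.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Mahout.
public class ImplicitRatings {

    /** Observed user actions with guessed weights on a linear scale. */
    public enum Action {
        CLICK(1.0f),
        PURCHASE(10.0f);

        final float weight;

        Action(float weight) {
            this.weight = weight;
        }
    }

    // Accumulated score per (user, item) pair, keyed as "userId:itemId".
    private final Map<String, Float> scores = new HashMap<>();

    /** Adds the action's weight to the running score for this pair. */
    public void record(long userId, long itemId, Action action) {
        // Assumption: repeated actions add up; a cap or decay may work better.
        scores.merge(key(userId, itemId), action.weight, Float::sum);
    }

    /** Returns the accumulated score, or 0 if nothing was observed. */
    public float score(long userId, long itemId) {
        return scores.getOrDefault(key(userId, itemId), 0.0f);
    }

    private static String key(long userId, long itemId) {
        return userId + ":" + itemId;
    }
}
```

The resulting scores could then be fed to a recommender as preference values. Whether the weights or the summing rule are any good is exactly the part that needs tuning against real results.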

On Sat, Jan 28, 2012 at 10:15 AM, Lee Carroll
<[email protected]> wrote:

I would argue, though, that .recommend() is aimed at the latter task:
No. I think the mismatch here is that you are using at best a wild guess
at a preference for the convenience of using a recommender and then in
the same breath expecting the recommender to "understand" that you are
not using preferences at all and actually have no idea what the user
preference is. You can't have it both ways :-)

A click-through on an item is not a measure of user preference for
that item. I know it's not what you want to hear (or better, what your
business users want to hear) but there it is.

We can pretend, or maybe even build a convincing narrative, that a
click is some sort of item association and use that as a proxy
preference, and we might even get some mileage out of it, but we should
not change the behaviour of .recommend() to hide its
shortcomings.

