You mean that there are two items with near-identical data, and one shows
up in recs and the other doesn't? I can make a general guess, and it comes
down to the fact that your similarity data isn't "transitive". This comes
up in a minor way and a major way.

The minor way is theoretical: similarity metrics aren't by nature
transitive, and your modification may be even less so. A, B, and C may each
be fairly similar pairwise, but the similarities don't necessarily obey a
"triangle inequality": high similarity between A and B, and between B and
C, doesn't guarantee high similarity between A and C. Still, a good metric
ought to be transitive-ish, given enough good data.
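
To make that concrete, here's a toy example (nothing Mahout-specific; the
vectors and numbers are made up purely for illustration): with plain cosine
similarity, A-B and B-C can both be fairly similar while A-C isn't similar
at all.

public class NonTransitiveDemo {

  // Plain cosine similarity between two vectors.
  static double cosine(double[] x, double[] y) {
    double dot = 0.0, nx = 0.0, ny = 0.0;
    for (int i = 0; i < x.length; i++) {
      dot += x[i] * y[i];
      nx += x[i] * x[i];
      ny += y[i] * y[i];
    }
    return dot / Math.sqrt(nx * ny);
  }

  public static void main(String[] args) {
    double[] a = {1, 0};
    double[] b = {1, 1};
    double[] c = {0, 1};
    System.out.println(cosine(a, b)); // ~0.71, fairly similar
    System.out.println(cosine(b, c)); // ~0.71, fairly similar
    System.out.println(cosine(a, c)); // 0.0, not similar at all
  }
}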

The major point is practical: it may be that A, B, and C are all pairwise
very similar, but your data lacks enough information about one of the
pairs. Maybe A and B are similar, and A and C are similar, and B and C
really are too, but the B-C similarity can't be computed, or can't be
computed well, due to sparsity or missing data.
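
For example (a sketch against the Taste API; "ratings.csv" and the item
IDs are hypothetical), the stock similarities signal "can't compute" by
returning NaN, and a NaN pair contributes nothing to a user's estimate:

import java.io.File;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

public class SparsityDemo {
  public static void main(String[] args) throws Exception {
    DataModel model = new FileDataModel(new File("ratings.csv"));
    ItemSimilarity similarity = new PearsonCorrelationSimilarity(model);

    // If no user has rated both items, Pearson correlation is undefined
    // over the empty overlap, and this comes back as Double.NaN.
    double bc = similarity.itemSimilarity(2L, 3L);
    System.out.println(bc);
  }
}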

So then: a user who rates C might only see A recommended, even though A
and B are intuitively nearly equally good. B is good too; the data may just
not be able to show it.
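
I don't know what your implementation looks like, but I imagine something
shaped roughly like this (a sketch only, written from memory of the API;
sameGroup() and the delegate are stand-ins for whatever your
doItemSimilarity() logic and fallback actually are):

import java.util.Collection;
import org.apache.mahout.cf.taste.common.Refreshable;
import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.similarity.AbstractItemSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

public class PinnedItemSimilarity extends AbstractItemSimilarity {

  private final ItemSimilarity delegate;

  public PinnedItemSimilarity(DataModel dataModel, ItemSimilarity delegate) {
    super(dataModel);
    this.delegate = delegate;
  }

  // Stand-in for your "certain cases" logic.
  private boolean sameGroup(long itemID1, long itemID2) {
    return false;
  }

  @Override
  public double itemSimilarity(long itemID1, long itemID2) throws TasteException {
    if (sameGroup(itemID1, itemID2)) {
      return 1.0;
    }
    // Every other pair is still at the mercy of the data, and can come
    // back NaN when it can't be computed.
    return delegate.itemSimilarity(itemID1, itemID2);
  }

  @Override
  public double[] itemSimilarities(long itemID1, long[] itemID2s) throws TasteException {
    double[] result = new double[itemID2s.length];
    for (int i = 0; i < itemID2s.length; i++) {
      result[i] = itemSimilarity(itemID1, itemID2s[i]);
    }
    return result;
  }

  @Override
  public void refresh(Collection<Refreshable> alreadyRefreshed) {
    delegate.refresh(alreadyRefreshed);
  }
}

Even if some pairs are pinned to 1.0, an item's estimated preference also
depends on its similarity to everything else the user has rated, and those
values can differ, or be uncomputable, between two "identical" items.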


On Sun, Jan 29, 2012 at 10:05 PM, Nick Jordan <[email protected]> wrote:

> Hey There,
>
> Quick question about the expected behavior when using a custom item
> similarity model that extends AbstractItemSimilarity within a
> KnnItemBasedRecommender.  Basically I have some logic in doItemSimilarity()
> that returns 1.0 in certain cases.
>
> When I then look at actual recommendations for a user, I notice that
> some of the items that should have perfect similarity based on the above
> logic have wildly varying estimated preferences.  I'm not sure whether
> this is expected behavior, but if it is, I'd like to understand why two
> items with perfect similarity would have different estimated preferences
> for the same user.  The user has not actually rated any of the items in
> question.
>
> Thanks in advance.
>
> Nick
>
