A is the user x item history matrix.  Each row is a user history.

A' is the transpose of A and has the shape item x user.

A' A is the item cooccurrence matrix, counted over users, and has the shape
item x item.  There is one row for each item, and each row scores the items
that cooccur with (and hence can be recommended alongside) the item
corresponding to the row.

(A' A) h, where h is a single user's history vector (shape item x 1), is a
sum of item-level recommendations, one for each item in that history.
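
To make the shapes concrete, here is a tiny numpy sketch.  The matrices,
sizes, and variable names are mine, purely for illustration:

    import numpy as np

    # A: user x item history matrix (3 users, 4 items), 1 where the user
    # has interacted with the item
    A = np.array([[1, 1, 0, 0],
                  [0, 1, 1, 0],
                  [1, 0, 1, 1]])

    # A'A: item x item cooccurrence counts
    cooc = A.T @ A

    # h: one user's history vector, shape (item,)
    h = np.array([1, 0, 0, 1])

    # (A'A) h: recommendation scores summed across the user's history
    scores = cooc @ h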

BUT remember that it is the shape of these computations that matters rather
than the explicit operations.  Thus,

     A could be session x item instead of user x item

     A' A could be weighted by IDF considerations

     A' A could be binarized to keep only item pairs with a large LLR
score.  In fact, if A is binary then A'A as computed using a literal matrix
multiply gives exactly the cooccurrence counts that the LLR score needs as
input.  Thus we could replace A'A with ( LLR(A'A) > 10 ) or some such (see
the sketch after this list).  The pattern of computation is still the same.
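
As an illustration, here is one way that LLR binarization could look in
numpy.  The helpers follow the usual 2x2 contingency-table G^2 formulation;
the names, the example matrix, and the single-pair form are mine:

    import numpy as np

    def xlogx(x):
        return 0.0 if x == 0 else x * np.log(x)

    def entropy(*counts):
        # unnormalized entropy of a list of counts
        return xlogx(sum(counts)) - sum(xlogx(c) for c in counts)

    def llr(k11, k12, k21, k22):
        # G^2 statistic for the 2x2 contingency table of two items
        row = entropy(k11 + k12, k21 + k22)
        col = entropy(k11 + k21, k12 + k22)
        mat = entropy(k11, k12, k21, k22)
        return 2.0 * max(0.0, row + col - mat)

    A = np.array([[1, 1, 0, 0],      # user x item, binary
                  [0, 1, 1, 0],
                  [1, 0, 1, 1]])
    n_users = A.shape[0]
    counts = A.sum(axis=0)           # users per item (diagonal of A'A)
    cooc = A.T @ A

    i, j = 0, 2
    k11 = cooc[i, j]                             # users with both i and j
    k12 = counts[i] - k11                        # i without j
    k21 = counts[j] - k11                        # j without i
    k22 = n_users - counts[i] - counts[j] + k11  # neither
    keep = llr(k11, k12, k21, k22) > 10          # the LLR(A'A) > 10 test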

Historically, there are several ways I have dealt with missing observations.

- for binary A, missing elements are most naturally 0.  This comes about
because we are counting the number of positive observations, and 0 is the
natural identity for counting.

- for situations with useful ratings, it is better to use one or more
binarized versions of A.

    In typical cases where positive ratings dominate the process, A can
contain 1 where there is a positive rating and 0 elsewhere, and we have the
normal binary case.

    In other cases, it is sometimes paradoxically useful to consider ALL
ratings as positive and then filter explicit negative ratings out of the
recommendation results (see the sketch after this list).  The reason this
works is that people often rate things negatively that are very close to
things that they like.  Things very dissimilar from what they like, they
never even look at, much less rate.  Thus, negative ratings may actually be
more useful as indicators of what a person likes than as indicators of what
they don't.  I call this effect the "love spurned" effect.

    Finally, you can keep separate A matrices that are each binary and which
correspond to positive and negative ratings respectively.  Then you proceed
with both computations and get positive and negative recommendations (also
sketched below).  Combining these recommendations has always been difficult
for me, largely because the negative recommendations are confounded by the
love spurned effect.
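
Sketches of those last two variants, again with illustrative matrices and
names of my own choosing:

    import numpy as np

    # All-ratings-as-positive variant: A is 1 wherever the user rated the
    # item at all; neg marks only the explicitly negative ratings
    A   = np.array([[1, 1, 0, 1],
                    [0, 1, 1, 0],
                    [1, 0, 1, 1]])
    neg = np.array([[0, 1, 0, 0],
                    [0, 0, 0, 0],
                    [0, 0, 1, 0]])

    user = 0
    scores = (A.T @ A) @ A[user]
    scores[neg[user] == 1] = 0   # filter explicit negatives from the results

    # Two-matrix variant: positive and negative histories kept separate
    Apos = A - neg               # ratings that were not explicitly negative
    Aneg = neg
    pos_scores = (Apos.T @ Apos) @ Apos[user]   # positive recommendations
    neg_scores = (Aneg.T @ Aneg) @ Aneg[user]   # negative recommendations
    # combining pos_scores and neg_scores is the hard part noted above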


On Wed, Sep 9, 2009 at 1:39 AM, Sean Owen <[email protected]> wrote:

> So maybe let's visit the matrix approach a bit here -- what is A'A? is
> this the similarity matrix? working backwards, seems like that's the
> thing to left-multiply with the user rating vector to get
> recommendations.
>
> The first question I have is how does this cope with missing elements?
> I understand the idea is to use a sparse representation, but, the
> nature of the computation means these elements will be treated as
> zero, which doesn't work.
>



-- 
Ted Dunning, CTO
DeepDyve
