Inline.

On Apr 3, 2013, at 6:15 PM, Pat Ferrel wrote:

> The non-symmetry of the [B'A] and the fact that it is calculated from two
> models leads me to a rather heavy handed approach at least for a first cut.
>
> Let me know if this seems right:
>
> //calculate the 'cross' co-occurrence matrix
> B = PreparePreferenceMatrixJob using user purchase prefs
> A = PreparePreferenceMatrixJob using user view prefs
> // note that users and items *must* be the same for A and B, their ids
> must map to the same things and this may be a challenge.
> B' = TransposeJob on B
> [B'A] = MatrixMultJob on B', A

Actually, only the user ids need to be the same. The whole point of this approach is that the item ids will be different for A and B. Even if you conceptually have something where the item ids could be the same, it is good to reserve the right for the set of ids for A to be much larger and more comprehensive than for B. This just means that we want to use a broad universe of indicators for things.

> Now in the standard recommender we get the magic with RowSimilarity and
> 'partial' multiplies. I haven't teased the partial multiplies apart but I
> suspect that since they use and rely on the output from RowSimilarityJob I'll
> need to rework this--please correct me if I'm wrong. Once I have [B'A] I need
> to:
> [B'A] * H_v, where H_v is the original user history vectors in A based on
> user's views. I think they need to be column vectors so H_v = A' so
> [B'A]A' = DistributedRowMatrix of recommendations by user.

H_v could be one of the original history vectors or a novel history vector. Makes no never-mind. But you are right about (B'A) A' being historical recommendations for everybody in A.

> I'm most interested in item similarity so I think the [B'A] needs
> RowSimilarityJob run on it but it is the columns I need to compare (???) so
> [B'A]' = rows of items with values that are views that lead to (co-occur
> with) purchases
> RowSimilarityJob on [B'A]' will calculate pairwise similarity of items and
> so will create a matrix of item similarities. Here I suppose I can apply any
> of the similarity classes.

Actually, I think that B'A has rows taken from the b-items and columns taken from the a-items. Each row of B'A consists of a row of indicators that lead to the recommendation of the item corresponding to that row. This is a very handy form that needn't be further transformed.

> Question:
> * Have I got the item similarity part right, do I need to compare columns of
> [B'A]?

I don't see why you would need to.
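
To make the shapes concrete, here is a toy sketch in plain Python. It is only an illustration under stated assumptions: small dense 0/1 lists stand in for the distributed matrices, the helper names (transpose, matmul) and the data are made up for the example and are not Mahout APIs, and the entries of B'A are raw co-occurrence counts with no similarity weighting or sparsification applied.

```python
# Toy illustration only: dense Python lists stand in for distributed matrices.
# transpose, matmul, and the data below are hypothetical, not Mahout APIs.

def transpose(m):
    """Return the transpose of a dense matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    """Return the dense matrix product x * y."""
    y_cols = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_cols]
            for row in x]

# Rows are the same 3 users in both matrices; only user ids must line up.
# A: views over 4 a-items. B: purchases over 2 b-items (a different id space).
A = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
B = [
    [1, 0],
    [1, 0],
    [0, 1],
]

# B'A: one row per b-item, one column per a-item (raw co-occurrence counts).
BtA = matmul(transpose(B), A)

# (B'A) A': one row per b-item, one column per user in A,
# i.e. historical recommendation scores for everybody in A.
recs = matmul(BtA, transpose(A))

# A novel history vector over the a-items works the same way.
h_v = [0, 1, 0, 0]
scores = [sum(w * v for w, v in zip(row, h_v)) for row in BtA]

print(BtA)     # 2 x 4: b-items by a-items
print(recs)    # 2 x 3: b-items by users
print(scores)  # one score per b-item
```

Each row of BtA is already the indicator row for one b-item, so nothing here needs to compare columns or transpose B'A again.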
