On Apr 4, 2013, at 5:17 PM, Pat Ferrel wrote:

> One issue with the method below is that the two source matrices would not 
> have values for all users or items (rows or columns). I do know the entire 
> user and item id space from a previous step so I know the # of rows including 
> blank ones and # of columns even though some are empty. Put another way the 
> Actual matrix (with empty rows or columns) may be larger than the number of 
> rows in the DistributedRowMatrix or unique item ids. However all ids in one 
> matrix will match the ids of the other matrix. 

As I mentioned in the other response I sent, only the user ids need to match.

Any item whose column in B is all zero cannot be recommended since we have 
never seen it.  The math won't change.  

Any item whose column in A is all zero cannot become an indicator.  That 
probably doesn't matter either since an item we have not yet seen is probably 
rare and in any case, we know nothing about it.

Any zero row of either A or B will cause that user's behavior to be ignored for 
the purposes of cross recommendation.  That is, again, as it should be since 
any user who has not participated in both behaviors cannot provide information 
about the linkage.
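
A minimal sketch of the zero-row point, in plain Python rather than Mahout. 
It assumes rows are users, columns are items, and that the cross-cooccurrence 
is formed as B'A; the tiny matrices and the helper names (transpose, matmul) 
are made up for illustration. Dropping every user with an all-zero row in 
either matrix leaves the product unchanged.

```python
def transpose(m):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Plain dense matrix product of two list-of-lists matrices."""
    yt = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in yt] for row in x]


# Hypothetical data: 4 users, 3 items in A, 2 items in B.
A = [
    [1, 0, 1],
    [0, 0, 0],  # user 1 has no A behavior
    [1, 0, 0],
    [0, 0, 1],
]
B = [
    [1, 0],
    [1, 0],
    [0, 0],  # user 2 has no B behavior
    [1, 0],
]

# Cross-cooccurrence over all users (assumed here to be B'A).
cross = matmul(transpose(B), A)

# Keep only users with at least one non-zero entry in BOTH matrices.
active = [u for u in range(len(A)) if any(A[u]) and any(B[u])]
cross_active = matmul(
    transpose([B[u] for u in active]),
    [A[u] for u in active],
)

# Users with a zero row in either matrix contributed nothing.
print(cross == cross_active)
```

The all-zero column of B (item 1) shows up as an all-zero row of the product, 
and the all-zero column of A (item 1) as an all-zero column, which matches the 
remarks above about items that cannot be recommended or act as indicators.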

> 
> AFAICT this should be OK for the TransposeJob and MatrixMultJob but I haven't 
> tested it. I will need to pass in the size of the matrices as the size of the 
> user and item space, Correct?

Yes.
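
A rough sketch of why the sizes have to be passed in explicitly, again in 
plain Python and not the Mahout API. The to_dense helper and the sample 
entries are hypothetical. Sizes inferred from the ids that actually occur 
would miss trailing empty rows and columns, so the known user and item counts 
are supplied instead.

```python
def to_dense(entries, num_rows, num_cols):
    """Build a dense matrix of the declared size from sparse (row, col) entries."""
    m = [[0] * num_cols for _ in range(num_rows)]
    for (r, c), v in entries.items():
        m[r][c] = v
    return m


# Hypothetical sparse data: only users 0 and 2 and items 0 and 1 appear.
entries = {(0, 0): 1, (2, 1): 1}

# Sizes known from the full id space (assumed values for illustration).
num_users, num_items = 5, 4

M = to_dense(entries, num_users, num_items)

# Sizes guessed from the data alone would be too small.
inferred = (
    max(r for r, _ in entries) + 1,
    max(c for _, c in entries) + 1,
)

print((len(M), len(M[0])), inferred)
```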
