BTW I have this working on trivial data and am in the process of measuring its
results on some real-world data. It does a lot with DistributedRowMatrix, so
I'll be interested to see how it performs with a larger data set.

Does anyone know of a public data set that provides things like views and 
purchases? 


On Apr 8, 2013, at 2:31 PM, Ted Dunning <[email protected]> wrote:

On Sat, Apr 6, 2013 at 3:26 PM, Pat Ferrel <[email protected]> wrote:

> I guess I don't understand this issue.
> 
> In my case both the item ids and user ids of the separate
> DistributedRowMatrix instances will match and I know the size for the entire
> space from a previous step where I create id maps. I suppose you are saying
> the m/r code would be super simple if a row of B' and a column of A could be
> processed together, which I understand as an optimal implementation.
> 

Well... rows of B and A should match, so it is columns of B' and rows of A
rather than the reverse.
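
To make the row-matching point concrete, here is a rough sketch in plain
Python (not Mahout code). The toy matrices and the helper names transpose,
matmul and outer_sum are made up for illustration. It assumes rows of both
A and B are indexed by the same user ids, and checks that B'A equals the
sum, over users, of the outer product of that user's row of B with the same
user's row of A. That is why a job that sees matching rows of B and A
together would be enough.

```python
# Hypothetical toy data: rows are users, columns are items.
# A: users x items of one kind, B: users x items of another kind.
A = [[1, 0],
     [1, 1],
     [0, 1]]
B = [[1, 1, 0],
     [0, 1, 1],
     [1, 0, 1]]

def transpose(m):
    """Return the transpose of a dense list-of-lists matrix."""
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    """Plain dense matrix product x * y."""
    yt = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in yt] for row in x]

def outer_sum(b_rows, a_rows):
    """Accumulate outer(B[u], A[u]) over users u, one matching row pair at a time."""
    acc = [[0] * len(a_rows[0]) for _ in range(len(b_rows[0]))]
    for b_row, a_row in zip(b_rows, a_rows):
        for i, bv in enumerate(b_row):
            if bv:  # skip zeros, as a sparse implementation would
                for j, av in enumerate(a_row):
                    acc[i][j] += bv * av
    return acc

direct = matmul(transpose(B), A)   # transpose, then multiply
by_user = outer_sum(B, A)          # process matching user rows together
print(direct == by_user)
```

Both routes give the same matrix; the second never needs B' materialized.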


> So calculating [B'A] seems like TransposeJob and MultiplyJob and does seem
> to work. You lose the ability to substitute different RowSimilarityJob
> measures. I assume this creates something like the co-occurrence similarity
> measure. But oh, well. Maybe I'll look at that later.
> 

Yes.  Exactly.


> I also see why you say the two matrices A and B don't have to have the
> same size since [B'A]H_v = [B'A]A' so the dimensions will work out as long
> as the users dimension is the same throughout.
> 

Yes.  All we need is for the user ids to match.
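
A quick shape check of that, as a sketch in plain Python rather than Mahout
code. The sizes (3 users, 2 items in A, 4 items in B) and the helper names
are assumptions chosen only for illustration. A and B have different item
counts, yet [B'A]A' is well defined because both share the user dimension.

```python
# Hypothetical sizes: 3 users, A has 2 item columns, B has 4 item columns.
A = [[1, 0],
     [0, 1],
     [1, 1]]
B = [[1, 0, 0, 1],
     [0, 1, 0, 0],
     [1, 1, 1, 0]]

def shape(m):
    """Return (rows, columns) of a dense list-of-lists matrix."""
    return (len(m), len(m[0]))

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    """Dense product; refuses operands whose inner dimensions differ."""
    if shape(x)[1] != shape(y)[0]:
        raise ValueError("inner dimensions differ: %s vs %s" % (shape(x), shape(y)))
    yt = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in yt] for row in x]

BtA = matmul(transpose(B), A)        # (items of B) x (items of A)
result = matmul(BtA, transpose(A))   # (items of B) x users

print(shape(BtA), shape(result))
```

If A and B had different numbers of user rows, the first matmul would raise,
which is the only constraint on the two matrices.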
