Here we go:

On Wed, Dec 10, 2014 at 9:01 PM, Debasish Das wrote:

> I added code to compute the topK products for each user and the topK users
> for each product in SPARK-3066.
> That is different from a row similarity calculation, since we need both the
> user and the product factors to calculate the topK recommendations.
> For (1) and (2) we are trying to answer similarUsers for a given user and
> similarProducts for a given product.
> similarProducts for a given product is straightforward to compute through
> columnSimilarities/DIMSUM when the product dimension is skinny.
> similarUsers for a given user will need a map-reduce implementation of row
> similarity, since the matrix is tall.
> I don't see a JIRA for that yet. Are there any good references for a
> map-reduce implementation of row similarity?
> On Wed, Dec 10, 2014 at 2:30 PM, Reza Zadeh wrote:
>> It's not so cheap to compute row similarities when there are many rows,
>> as it amounts to computing the outer product of a matrix A (i.e. computing
>> AA^T, which is expensive).
>> There is a JIRA to track handling (1) and (2) more efficiently than
>> computing all pairs:
>> On Wed, Dec 10, 2014 at 2:44 PM, Debasish Das wrote:
>>> Hi,
>>> It seems there are multiple places where we would like to compute row
>>> similarities (exact or approximate).
>>> Through RowMatrix columnSimilarities we can already compute the column
>>> similarities of a tall skinny matrix.
>>> Similarly, we should have an API in RowMatrix called rowSimilarities that
>>> computes similar rows in a map-reduce fashion. It would be useful for the
>>> following use cases:
>>> 1. Generate topK users for each user from matrix factorization model
>>> 2. Generate topK products for each product from matrix factorization
>>> model
>>> 3. Generate kernel matrix for use in spectral clustering
>>> 4. Generate kernel matrix for use in kernel regression/classification
>>> I am not sure whether good implementations of map-reduce row similarity
>>> already exist that we can use. Ideas like Fastfood and random kitchen
>>> sinks seem aimed at the classification use case, but user similarities
>>> also show up in recommendation, which is unsupervised.
>>> Is there a JIRA tracking this? If not, I can open one and we can discuss
>>> it further there.
>>> Thanks.
>>> Deb
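
For readers following the thread, here is a minimal sketch of what a
map-reduce style row similarity could look like. It is plain Python, not
Spark, and it is not the RowMatrix API: the function name row_similarities,
the dict-of-dicts input format and the top_k parameter are all assumptions
made for illustration. The idea is to invert the sparse rows into columns,
let each column emit the products a_ij * a_kj for every pair of rows (i, k)
that share it, and sum those products per pair. The sums are the nonzero
off-diagonal entries of AA^T, which are then normalized to cosine
similarities and cut to the top K per row.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def row_similarities(rows, top_k):
    """Cosine similarity between sparse rows, keeping the top_k per row.

    rows: dict mapping row_id -> dict mapping col_id -> value (hypothetical
    input format chosen for this sketch).
    Returns a dict mapping row_id -> list of (other_row_id, cosine), sorted
    by decreasing cosine. Rows that share no column with any other row are
    absent from the result.
    """
    # L2 norm of every row, needed to turn dot products into cosines.
    norms = {i: sqrt(sum(v * v for v in r.values())) for i, r in rows.items()}

    # "Map" step 1: invert rows into columns, col_id -> [(row_id, value)].
    by_col = defaultdict(list)
    for i, r in rows.items():
        for j, v in r.items():
            by_col[j].append((i, v))

    # "Map" step 2 and "reduce": each column emits ((i, k), a_ij * a_kj) for
    # every pair of rows it touches; summing per key gives the entries of AA^T.
    dots = defaultdict(float)
    for entries in by_col.values():
        for (i, a), (k, b) in combinations(sorted(entries), 2):
            dots[(i, k)] += a * b

    # Normalize to cosine similarity and record the pair for both rows.
    sims = defaultdict(list)
    for (i, k), d in dots.items():
        if norms[i] == 0.0 or norms[k] == 0.0:
            continue  # rows holding only explicit zeros have no direction
        c = d / (norms[i] * norms[k])
        sims[i].append((k, c))
        sims[k].append((i, c))

    # Keep only the top_k most similar rows for each row.
    return {i: sorted(s, key=lambda t: -t[1])[:top_k] for i, s in sims.items()}
```

Note that the pair emission is quadratic in the number of rows touching a
column, so a dense column is exactly the expensive case described above.
This sketch does no sampling or thresholding to reduce that cost; it only
skips pairs of rows that share no column.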
