[ 
https://issues.apache.org/jira/browse/SPARK-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242808#comment-14242808
 ] 

Sean Owen commented on SPARK-4675:
----------------------------------

The lower-dimensional space is of course smaller, which makes it faster and 
more efficient to work with, a real advantage at scale. But the main reason is 
that the original high-dimensional space is extremely sparse: standard 
similarity measures are undefined, or are 0, for most pairs. It's a symptom of 
the curse of dimensionality. 
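
To make the sparsity point concrete, here is a minimal sketch in plain Python 
(not Spark code). The ratings and the rank-2 factor vectors are invented for 
illustration, and the `cosine` helper is hypothetical. In the raw rating space, 
two products with no raters in common score 0 and a product with no ratings has 
no defined similarity, while dense latent vectors always yield a usable score.

```python
import math

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {index: value} dicts.

    Returns None when either vector has no entries, since the similarity
    is undefined in that case.
    """
    if not u or not v:
        return None
    # Only indices present in both vectors contribute to the dot product.
    dot = sum(val * v[i] for i, val in u.items() if i in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

# Hypothetical raw ratings: product -> {user_id: rating}.
ratings = {
    "A": {1: 5.0, 2: 3.0},
    "B": {3: 4.0, 4: 5.0},  # shares no raters with A
    "C": {},                # no ratings at all
}

# Hypothetical rank-2 latent factors for the same products (made-up values).
factors = {
    "A": [0.9, 0.1],
    "B": [0.8, 0.3],
    "C": [0.1, 0.9],
}

sparse_ab = cosine(ratings["A"], ratings["B"])  # 0.0: no overlapping users
sparse_ac = cosine(ratings["A"], ratings["C"])  # None: undefined
dense_ab = cosine(dict(enumerate(factors["A"])),
                  dict(enumerate(factors["B"])))

print(sparse_ab, sparse_ac, dense_ab)
```

Every pair of products gets a score in the latent space because every product 
has a value in every dimension, which is what makes neighbour lookups workable.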

> Find similar products and similar users in MatrixFactorizationModel
> -------------------------------------------------------------------
>
>                 Key: SPARK-4675
>                 URL: https://issues.apache.org/jira/browse/SPARK-4675
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Steven Bourke
>            Priority: Trivial
>              Labels: mllib, recommender
>
> Using the latent feature space that is learnt in MatrixFactorizationModel, I 
> have added two new functions to find similar products and similar users. A 
> user of the API can, for example, pass a product ID and get the closest 
> products. 
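
The lookup described in the issue can be sketched as a nearest-neighbour search 
over product feature vectors. The message does not show the patch's actual 
method names or its choice of similarity measure, so the function name 
`similar_products`, its parameters, and the use of cosine similarity are all 
assumptions made for illustration. This is plain Python, not the MLlib API.

```python
import heapq
import math

def cosine(a, b):
    """Cosine similarity of two equal-length dense vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def similar_products(product_features, product_id, num):
    """Return up to `num` (product_id, score) pairs closest to `product_id`.

    `product_features` maps product ID -> latent feature vector; it stands in
    for the product factors learnt by a matrix factorization model.
    (Hypothetical helper, not an existing Spark method.)
    """
    target = product_features[product_id]
    scored = (
        (cosine(target, vec), pid)
        for pid, vec in product_features.items()
        if pid != product_id  # skip the query product itself
    )
    return [(pid, score) for score, pid in heapq.nlargest(num, scored)]

# Made-up rank-2 feature vectors keyed by product ID.
features = {
    10: [1.0, 0.0],
    20: [0.9, 0.1],
    30: [0.0, 1.0],
}

closest = similar_products(features, 10, 2)
print(closest)
```

The same function applies unchanged to user feature vectors to find similar 
users. A linear scan is used here for clarity; it costs one similarity 
computation per candidate.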



