[ https://issues.apache.org/jira/browse/SPARK-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242808#comment-14242808 ]
Sean Owen commented on SPARK-4675:
----------------------------------
The lower-dimensional space is of course smaller, which makes it faster and
more efficient to work with, and at scale that is certainly an advantage. But
the real reason is that the original high-dimensional space is extremely
sparse: standard similarity measures are undefined, or 0, for most pairs. It's
a symptom of the curse of dimensionality.
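
To make the sparsity point concrete, here is a minimal sketch in plain Python. The user names, product IDs, ratings, and latent factor values are all invented for illustration, and `sparse_cosine` is a hypothetical helper, not part of MLlib. Two users who rated no products in common get a cosine similarity of exactly 0 in the raw rating space, and a user with no ratings has no defined similarity at all, while dense low-rank factors always give a usable score.

```python
import math

def sparse_cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {index: value} dicts.

    Returns None when either vector has zero norm (similarity undefined).
    """
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return None
    return dot / (norm_a * norm_b)

# Hypothetical raw ratings (product ID -> rating) over a large catalog.
alice = {101: 5.0, 2044: 4.0, 90210: 1.0}
bob = {7: 5.0, 3512: 4.0, 66001: 2.0}
carol = {}  # a user with no ratings yet

raw_alice_bob = sparse_cosine(alice, bob)      # no co-rated products
raw_alice_carol = sparse_cosine(alice, carol)  # zero-norm vector

# Hypothetical rank-3 latent factors for the same two users (values invented;
# in practice they would come from a trained factorization model).
alice_factors = {0: 0.9, 1: 0.1, 2: 0.3}
bob_factors = {0: 0.8, 1: 0.2, 2: 0.4}
latent_alice_bob = sparse_cosine(alice_factors, bob_factors)

print(raw_alice_bob, raw_alice_carol, latent_alice_bob)
```

In the raw space the first comparison carries no information and the second is undefined. In the latent space every coordinate is populated, so any pair of users can be compared.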
> Find similar products and similar users in MatrixFactorizationModel
> -------------------------------------------------------------------
>
> Key: SPARK-4675
> URL: https://issues.apache.org/jira/browse/SPARK-4675
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Reporter: Steven Bourke
> Priority: Trivial
> Labels: mllib, recommender
>
> Using the latent feature space that is learnt in MatrixFactorizationModel, I
> have added 2 new functions to find similar products and similar users. A user
> of the API can for example pass a product ID, and get the closest products.
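
As a rough illustration of the kind of lookup described above, the following plain-Python sketch ranks products by cosine similarity of their latent feature vectors. The function name `similar_products`, its parameters, the choice of cosine similarity, and the feature values are all assumptions made for this example. They are not taken from the patch attached to the issue.

```python
import math

def cosine(u, v):
    """Cosine similarity of two dense vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def similar_products(product_features, product_id, num):
    """Return the `num` products closest to `product_id` as (id, score) pairs.

    `product_features` maps product ID -> latent feature vector (hypothetical
    stand-in for the product factors learnt by a factorization model).
    """
    query = product_features[product_id]
    scored = [
        (other_id, cosine(query, features))
        for other_id, features in product_features.items()
        if other_id != product_id  # skip the query product itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:num]

# Invented rank-2 product factors.
product_features = {
    1: [0.9, 0.1],
    2: [0.8, 0.2],
    3: [0.1, 0.9],
    4: [-0.7, 0.1],
}

top = similar_products(product_features, 1, 2)
print(top)
```

This brute-force scan compares the query against every product, which is fine for a sketch. The same idea applies to similar users by swapping in user factors.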