[ https://issues.apache.org/jira/browse/SPARK-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15581359#comment-15581359 ]
Debasish Das commented on SPARK-4823:
-------------------------------------

We use it in multiple use cases internally but have not had time to refactor the PR into 3 smaller PRs. I will update the PR to 2.0.

> rowSimilarities
> ---------------
>
>                 Key: SPARK-4823
>                 URL: https://issues.apache.org/jira/browse/SPARK-4823
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>            Reporter: Reza Zadeh
>         Attachments: MovieLensSimilarity Comparisons.pdf, SparkMeetup2015-Experiments1.pdf, SparkMeetup2015-Experiments2.pdf
>
>
> RowMatrix has a columnSimilarities method to find cosine similarities between columns.
> A rowSimilarities method would be useful to find similarities between rows.
> This JIRA is to investigate which algorithms are suitable for such a method, better than brute-forcing it. Note that when there are many rows (> 10^6), it is unlikely that brute-force will be feasible, since the output will be of order 10^12.
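
To make the scale argument in the issue description concrete, here is a minimal pure-Python sketch of the brute-force approach. It is not Spark code and not the method proposed in the PR; the helper names (cosine, brute_force_row_similarities, num_pairs) are hypothetical and exist only for illustration. It compares every pair of rows, which is why the output size grows quadratically with the number of rows.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def brute_force_row_similarities(rows):
    """Hypothetical helper: cosine similarity for every unordered pair of rows (i < j)."""
    return {
        (i, j): cosine(rows[i], rows[j])
        for i, j in combinations(range(len(rows)), 2)
    }


def num_pairs(n):
    """Number of unordered row pairs that brute force has to evaluate."""
    return n * (n - 1) // 2


# Tiny illustrative matrix with three rows and two columns.
rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sims = brute_force_row_similarities(rows)

# Pair count for 10^6 rows: roughly 5 * 10^11 unordered pairs,
# i.e. on the order of 10^12 entries if both (i, j) and (j, i) are stored.
pairs_for_million_rows = num_pairs(10**6)
```

With three rows the sketch evaluates three pairs. With 10^6 rows the same loop would have to evaluate about 5 * 10^11 pairs, which matches the order-of-magnitude concern raised in the description.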