[
https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237074#comment-15237074
]
Nick Pentreath commented on SPARK-13857:
----------------------------------------
Do we want to support user-user and item-item similarity computation too? It's
expensive in general: with a small item set, one can broadcast the item
vectors or call {{columnSimilarities}} on a transposed {{RowMatrix}}, but
neither approach is feasible at large scale. Still, it's not necessarily more
expensive than computing top-k items for each user (depending on the numbers
of users and items involved). At the very least, if we offer user-item top-k,
is there a reason _not_ to offer item-item top-k similar items?
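
To make the small-item-set case concrete, here is a rough sketch in plain
Python of brute-force item-item top-k by cosine similarity. The function name
{{top_k_similar_items}} and the toy factor vectors are hypothetical and are
not part of any Spark API; in Spark, the item factors would presumably be
broadcast and the outer loop run per partition.

```python
import heapq
import math


def top_k_similar_items(item_factors, k):
    """Return, for each item id, its k most cosine-similar other items.

    item_factors: dict mapping item id -> list of floats (latent factors).
    Brute force, O(n^2 * rank), so only sensible for a small item set.
    """
    # Compute each vector's norm once rather than once per pair.
    norms = {i: math.sqrt(sum(x * x for x in v)) for i, v in item_factors.items()}
    result = {}
    for i, vi in item_factors.items():
        scored = []
        for j, vj in item_factors.items():
            # Skip the item itself and any zero vector (cosine undefined).
            if i == j or norms[i] == 0.0 or norms[j] == 0.0:
                continue
            dot = sum(a * b for a, b in zip(vi, vj))
            scored.append((dot / (norms[i] * norms[j]), j))
        # Keep only the k highest-scoring neighbours, best first.
        result[i] = [(j, s) for s, j in heapq.nlargest(k, scored)]
    return result


# Toy item factors (made up for illustration).
item_factors = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
}
top = top_k_similar_items(item_factors, k=1)
```

The cost grows quadratically with the number of items, which is why this only
works when the item set is small enough to hold on every executor.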
> Feature parity for ALS ML with MLLIB
> ------------------------------------
>
> Key: SPARK-13857
> URL: https://issues.apache.org/jira/browse/SPARK-13857
> Project: Spark
> Issue Type: Sub-task
> Components: ML
> Reporter: Nick Pentreath
> Assignee: Nick Pentreath
>
> Currently {{mllib.recommendation.MatrixFactorizationModel}} has methods
> {{recommendProducts/recommendUsers}} for recommending top K to a given user /
> item, as well as {{recommendProductsForUsers/recommendUsersForProducts}} to
> recommend top K across all users/items.
> Additionally, SPARK-10802 is for adding the ability to do
> {{recommendProductsForUsers}} for a subset of users (or vice versa).
> Look at exposing or porting (as appropriate) these methods to ALS in ML.
> Investigate if efficiency can be improved at the same time (see SPARK-11968).