[ 
https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237074#comment-15237074
 ] 

Nick Pentreath commented on SPARK-13857:
----------------------------------------

Do we want to support user-user and item-item similarity computation too? It 
is expensive in general: with a small item set one can broadcast the item 
vectors, or call {{columnSimilarities}} on a transposed {{RowMatrix}}, but 
neither approach is feasible at large scale. Still, it is not necessarily more 
expensive than computing top-k items for each user, depending on the numbers 
of users and items involved. At the very least, if we offer user-item top-k, 
is there a reason _not_ to offer item-item top-k similar items?
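
To make the cost comparison concrete, here is a minimal local sketch in plain 
Python, not Spark code and not a proposed API. The function names, the toy 
rank-2 factors, and the choice of cosine similarity for item-item scoring are 
all assumptions made for illustration. User-item top-k scores every (user, 
item) pair, while item-item top-k scores every (item, other item) pair, so 
which one is cheaper depends on the relative sizes of the two sets.

```python
import heapq
import math


def dot(a, b):
    """Dot product of two equal-length factor vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def top_k_items_for_users(user_factors, item_factors, k):
    """Hypothetical helper: scores |users| * |items| pairs by dot product."""
    return {
        u: heapq.nlargest(k, item_factors, key=lambda i: dot(uv, item_factors[i]))
        for u, uv in user_factors.items()
    }


def top_k_similar_items(item_factors, k):
    """Hypothetical helper: scores |items| * (|items| - 1) pairs by cosine."""
    return {
        i: heapq.nlargest(
            k,
            (j for j in item_factors if j != i),
            key=lambda j: cosine(iv, item_factors[j]),
        )
        for i, iv in item_factors.items()
    }


# Toy rank-2 latent factors (made-up values, not from a trained model).
user_factors = {"u1": [1.0, 0.0], "u2": [0.0, 1.0], "u3": [0.5, 1.0]}
item_factors = {"a": [0.9, 0.1], "b": [0.8, 0.3], "c": [0.1, 0.9]}

print(top_k_items_for_users(user_factors, item_factors, 2))
print(top_k_similar_items(item_factors, 1))
```

In a distributed setting the item-item loop corresponds to the "broadcast the 
item vectors" option mentioned above: it only works while the full item factor 
table fits in memory on each executor.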

> Feature parity for ALS ML with MLLIB
> ------------------------------------
>
>                 Key: SPARK-13857
>                 URL: https://issues.apache.org/jira/browse/SPARK-13857
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Nick Pentreath
>            Assignee: Nick Pentreath
>
> Currently {{mllib.recommendation.MatrixFactorizationModel}} has methods 
> {{recommendProducts/recommendUsers}} for recommending top K to a given user / 
> item, as well as {{recommendProductsForUsers/recommendUsersForProducts}} to 
> recommend top K across all users/items.
> Additionally, SPARK-10802 is for adding the ability to do 
> {{recommendProductsForUsers}} for a subset of users (or vice versa).
> Look at exposing or porting (as appropriate) these methods to ALS in ML. 
> Investigate if efficiency can be improved at the same time (see SPARK-11968).



