GitHub user srowen commented on the pull request:

    https://github.com/apache/spark/pull/1687#issuecomment-50775834
  
    Thanks @mengxr, I agree with all of that and will update the PR. `Rating` is 
a good solution; there's a redundant field, but very few objects are returned 
anyway. Sorry if I'm being dense, but which RDD should be set to 
`MEMORY_AND_DISK`: the `scored` RDD in my PR? And how would you set the 
partitions?
    
    Yes, if there were a `topByKey`, it would be natural to expose a small batch 
recommend feature here. There are other possible operations, like 
`mostSimilar`, but we can leave those for another PR after discussing what the 
metric should be (cosine similarity, for example).
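    
    To make the two ideas concrete, here is a minimal sketch in plain Scala, 
working on local collections rather than RDDs. The object name 
`RecommendSketch` and both helper signatures are hypothetical and only for 
illustration; they are not taken from the PR or from Spark's API. Returning 0.0 
for a zero-length vector is also an assumption, not a settled choice.

```scala
// Hypothetical helpers for illustration only; not part of the PR or of Spark.
object RecommendSketch {

  // Keep the k highest-scoring (item, score) pairs for each key.
  def topByKey[K, V](pairs: Seq[(K, (V, Double))], k: Int): Map[K, Seq[(V, Double)]] =
    pairs.groupBy(_._1).map { case (key, group) =>
      // Sort this key's pairs by descending score, then truncate to k.
      key -> group.map(_._2).sortBy(p => -p._2).take(k)
    }

  // Cosine similarity between two feature vectors of equal length.
  def cosineSimilarity(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same length")
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val normA = math.sqrt(a.map(x => x * x).sum)
    val normB = math.sqrt(b.map(x => x * x).sum)
    // Assumption: treat similarity with a zero vector as 0.0 instead of NaN.
    if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
  }
}
```

    A distributed version would need to avoid collecting all scores for a key 
in one place, for example by keeping a bounded priority queue per key while 
aggregating; the sketch above ignores that concern.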
    


