[GitHub] spark pull request: [MLLIB][SPARK-4675] Find similar products and ...

debasish83 Tue, 05 May 2015 07:38:53 -0700

Github user debasish83 commented on the pull request:

    https://github.com/apache/spark/pull/3536#issuecomment-99098372
  
    @MLnick Yes, that's what I did. I still have to convince users why they 
should use factor vectors :-) For user->item recommendation, convincing them is 
easy: I can show the ranking improvement obtained through ALS.
    
    @srowen Without a validation strategy, someone might propose running a 
different algorithm (KMeans on the raw feature space, followed by an 
(item->cluster) join (cluster->items)) and claim that their item->item results 
are better. How do we know whether the ALS-based flow or the KMeans-based flow 
produces better results? NNALS can also be thought of as a soft KMeans, so 
these flows are very similar.
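
    A minimal sketch of the KMeans-based flow mentioned above, assuming the 
cluster assignments have already been computed. The function name and the toy 
assignments are hypothetical and not taken from the pull request; the sketch 
only illustrates the (item->cluster) join (cluster->items) step.

```python
from collections import defaultdict


def similar_items_via_clusters(item_to_cluster):
    """Join (item -> cluster) with (cluster -> items).

    Each item's candidate neighbours are the other members of its cluster.
    """
    # Invert the assignment map to get cluster -> items.
    cluster_to_items = defaultdict(list)
    for item, cluster in item_to_cluster.items():
        cluster_to_items[cluster].append(item)

    # For every item, look up its cluster and drop the item itself.
    return {
        item: sorted(other for other in cluster_to_items[cluster] if other != item)
        for item, cluster in item_to_cluster.items()
    }


# Hypothetical assignments, e.g. the output of KMeans on raw item vectors.
assignments = {"a": 0, "b": 0, "c": 1, "d": 0}
neighbors = similar_items_via_clusters(assignments)
```

    Note that this flow yields an unranked set of neighbours per item, so 
comparing it against a ranked ALS-based list would still need an agreed 
validation metric.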
    
    I am focused on implicit feedback here because only then can we run either 
KMeans or similarity on the raw feature space. With explicit feedback, I agree 
that cosine similarity is not valid in the original feature space. But in most 
practical datasets, we are dealing with implicit feedback.
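
    As an illustration of similarity on the raw feature space, here is a small 
sketch of cosine similarity between item vectors built from implicit feedback. 
The interaction counts are made-up toy data, and the assumption is that each 
item is represented by its per-user interaction counts, with 0 meaning no 
interaction was observed.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy implicit-feedback data (hypothetical): rows are items, columns are users,
# values are interaction counts such as clicks or views.
raw_item_vectors = {
    "a": [1, 0, 2, 0],
    "b": [2, 0, 4, 0],
    "c": [0, 3, 0, 1],
}

sim_ab = cosine(raw_item_vectors["a"], raw_item_vectors["b"])  # same users, same proportions
sim_ac = cosine(raw_item_vectors["a"], raw_item_vectors["c"])  # no users in common
```

    The same function could be applied to ALS item factor vectors, which is 
what would make the two spaces directly comparable once a validation metric is 
chosen.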



