Github user debasish83 commented on the pull request:
https://github.com/apache/spark/pull/3536#issuecomment-99098372
@MLnick yes, that's what I did. I still have to convince users why they
should use factor vectors :-) For user->item recommendation, convincing is
easy: I can show the ranking improvement from ALS.
@srowen without a validation strategy, someone might propose running a
different algorithm (KMeans on the raw feature space, followed by an
(item->cluster) join (cluster->items)) and claim that their item->item
results are better. How do we know whether the ALS-based flow or the
KMeans-based flow produces better results? NNALS can also be thought of as a
soft KMeans, so these flows are very similar.
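To make the KMeans-based flow concrete, here is a minimal sketch of the
(item->cluster) join (cluster->items) step in plain Python. The function
name and the cluster assignments are hypothetical; a real run would take the
assignments from KMeans on the raw item features, and this is not the code
from the pull request.

```python
from collections import defaultdict


def cluster_neighbors(item_to_cluster):
    """Join (item -> cluster) with (cluster -> items) to get item -> items."""
    # Invert the assignment map: cluster -> list of member items.
    cluster_to_items = defaultdict(list)
    for item, cluster in item_to_cluster.items():
        cluster_to_items[cluster].append(item)
    # For each item, its candidates are the other members of its cluster.
    return {
        item: sorted(o for o in cluster_to_items[cluster] if o != item)
        for item, cluster in item_to_cluster.items()
    }


# Hypothetical assignments, as if produced by KMeans on raw item features.
assignments = {"a": 0, "b": 0, "c": 1, "d": 0}
neighbors = cluster_neighbors(assignments)
```

Note that this join yields an unranked candidate set per item, whereas
similarity on factor vectors yields a ranked list, so any validation
strategy would have to compare the two on common ground.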
I am focused on implicit feedback here because only then can we run either
KMeans or similarity on the raw feature space. With explicit feedback, I
agree that cosine similarity is not valid in the original feature space. But
in most practical datasets we are dealing with implicit feedback.
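As a rough illustration of similarity on the raw feature space, the sketch
below computes cosine similarity between items represented as sparse
user->weight maps. It assumes implicit feedback, so a missing entry is
treated as 0; the item and user ids and the weights are made up for the
example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {user: weight}."""
    # Only users present in both vectors contribute to the dot product.
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical implicit feedback: item -> {user: 1.0 if the user interacted}.
item_users = {
    "a": {"u1": 1.0, "u2": 1.0},
    "b": {"u1": 1.0, "u2": 1.0, "u3": 1.0},
    "c": {"u4": 1.0},
}
sim_ab = cosine(item_users["a"], item_users["b"])
sim_ac = cosine(item_users["a"], item_users["c"])
```

With explicit ratings the same computation would silently treat unrated
items as rated 0, which is the reason given above for restricting this to
implicit feedback.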