Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1964#issuecomment-53547074
@yu-iskw Sorry for the delay in code review! What do you expect users to do
with the distances?
For example, users can pick different distance measures in k-means. In that
case, we should hide the distance implementation from users and let them
specify the distance type by its string name. That way, we can easily extend
it to PySpark.
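
To make the first use case concrete, here is a minimal sketch of choosing a distance measure by string name during the k-means assignment step. It is plain Python, not MLlib or PySpark code; the names `_DISTANCES` and `closest_center` and the `distance` parameter are hypothetical and exist only for illustration.

```python
import math

# Hypothetical registry mapping a user-facing name to a distance function.
# The implementations stay private; callers only ever pass the string name.
_DISTANCES = {
    "euclidean": lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))),
    "manhattan": lambda a, b: sum(abs(x - y) for x, y in zip(a, b)),
}


def closest_center(point, centers, distance="euclidean"):
    """Return the index of the center nearest to `point` under the named measure."""
    try:
        measure = _DISTANCES[distance]
    except KeyError:
        raise ValueError(f"unknown distance type: {distance!r}")
    return min(range(len(centers)), key=lambda i: measure(point, centers[i]))


centers = [(3.0, 3.0), (0.0, 5.0)]
print(closest_center((0.0, 0.0), centers, distance="euclidean"))
print(closest_center((0.0, 0.0), centers, distance="manhattan"))
```

Because the caller passes only a string, the same argument could be forwarded unchanged from a Python wrapper to a JVM implementation, which is the portability benefit described above.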
Another use case is to let users compute various distance measures with
MLlib's vectors. We try to keep MLlib's linear algebra implementation
lightweight, given that there are many linear algebra libraries, e.g.,
Breeze. In this case, it may be useful to contribute a `dist(v1, v2, type)`
operator to Breeze and then call it in MLlib's algorithms. @dlwh
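
As a rough illustration of what a `dist(v1, v2, type)` operator could look like, here is a standalone Python sketch. It is not Breeze's or MLlib's API: the supported names (`"euclidean"`, `"manhattan"`, `"cosine"`) are assumptions, and the third parameter is called `kind` here only to avoid shadowing Python's built-in `type`.

```python
import math


def dist(v1, v2, kind="euclidean"):
    """Distance between two equal-length numeric sequences (illustrative only)."""
    if len(v1) != len(v2):
        raise ValueError("vectors must have the same length")
    if kind == "euclidean":
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))
    if kind == "manhattan":
        return sum(abs(a - b) for a, b in zip(v1, v2))
    if kind == "cosine":
        # Cosine distance is 1 minus cosine similarity; undefined for zero vectors.
        dot = sum(a * b for a, b in zip(v1, v2))
        norms = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
        if norms == 0:
            raise ValueError("cosine distance is undefined for a zero vector")
        return 1.0 - dot / norms
    raise ValueError(f"unknown distance type: {kind!r}")


print(dist([0.0, 0.0], [3.0, 4.0]))
print(dist([0.0, 0.0], [3.0, 4.0], kind="manhattan"))
```

Keeping the operator in the linear algebra library means an algorithm only needs a thin call such as `dist(point, center, kind)` and does not carry its own distance code.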