GitHub user mgaido91 opened a pull request:
https://github.com/apache/spark/pull/20396
[SPARK-23217][ML] Add cosine distance measure to ClusteringEvaluator
## What changes were proposed in this pull request?
This PR adds support for the cosine distance measure to
ClusteringEvaluator.
This makes it possible to evaluate clustering results that were produced
using the cosine distance, which was introduced in SPARK-22119.
In the corresponding JIRA, there is a design document for the algorithm
implemented here.
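
For readers who want a concrete picture of the metric, below is a minimal sketch of a silhouette score computed with the cosine distance. It is a naive O(n^2) reference in plain Python and is not the distributed algorithm described in the design document or the code in this PR. The function names `cosine_distance` and `silhouette_cosine` are hypothetical, and the sketch assumes there are no all-zero vectors and at least two clusters.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(a, b). Assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def silhouette_cosine(points, labels):
    """Mean silhouette coefficient over all points, using cosine distance."""
    # Group the points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            # Common convention: a singleton cluster contributes 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        # The distance from the point to itself is ~0, so summing over the
        # whole cluster and dividing by (size - 1) excludes it.
        a = sum(cosine_distance(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(cosine_distance(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Two tight angular clusters: one near the x axis, one near the y axis.
points = [(1.0, 0.0), (1.0, 0.1), (0.0, 1.0), (0.1, 1.0)]
good = silhouette_cosine(points, [0, 0, 1, 1])
bad = silhouette_cosine(points, [0, 1, 0, 1])
print(good, bad)
```

A well-separated assignment should score close to 1 and a mixed-up assignment should score below 0. Because the cosine distance depends only on direction, rescaling any input vector leaves the score unchanged. A reference value for comparison can be obtained from scikit-learn with `sklearn.metrics.silhouette_score(X, labels, metric="cosine")`.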
## How was this patch tested?
Added a unit test that compares the result with the one produced by
Python's scikit-learn.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mgaido91/spark SPARK-23217
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20396.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20396
----
commit 07a7fb0d19e8cbc7be4e9d4222b2ea24750714fd
Author: Marco Gaido <marcogaido91@...>
Date: 2018-01-25T14:59:10Z
[SPARK-23217][ML] Add cosine distance measure to ClusteringEvaluator
----