GitHub user mgaido91 opened a pull request:
https://github.com/apache/spark/pull/19668
[SPARK-22440][ML] Add Calinski-Harabasz index to ClusteringEvaluator
## What changes were proposed in this pull request?
sklearn provides two metrics for evaluating unsupervised clustering. One is
the silhouette score, which has already been added to ClusteringEvaluator;
the other is the Calinski-Harabasz index.
This PR adds the Calinski-Harabasz index in order to reach feature
parity with sklearn.
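
For reference, the index is the ratio of between-cluster dispersion to within-cluster dispersion, each normalized by its degrees of freedom: `(B / (k - 1)) / (W / (n - k))` for `n` points in `k` clusters. The following is only a minimal plain-Python sketch of that formula, not the implementation in this PR and not a Spark API; the function name `calinski_harabasz` and the choice to return infinity when `W` is zero are assumptions made for illustration.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Illustrative Calinski-Harabasz index; higher means better-separated clusters."""
    n = len(points)
    # Group points by their cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= number of clusters < number of points")

    def mean(rows):
        # Component-wise mean of a list of equal-length vectors.
        return [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        # Squared Euclidean distance.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0  # B: size-weighted squared distances of centers to the global mean
    within = 0.0   # W: squared distances of points to their own cluster center
    for members in clusters.values():
        center = mean(members)
        between += len(members) * sq_dist(center, overall)
        within += sum(sq_dist(p, center) for p in members)

    if within == 0.0:
        # Assumption of this sketch: treat zero within-cluster spread as infinite score.
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


score = calinski_harabasz([(0, 0), (0, 2), (10, 0), (10, 2)], [0, 0, 1, 1])
print(score)  # 50.0
```

In this toy example B = 100 and W = 4 with k = 2 and n = 4, so the index is (100 / 1) / (4 / 2) = 50.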
## How was this patch tested?
Added unit tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mgaido91/spark SPARK-22440
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19668.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19668
----
commit a4c4ff190235489c080fa69b62264fe73fbc4833
Author: Marco Gaido <[email protected]>
Date: 2017-11-03T17:54:12Z
initial impl
commit 95a6e4ec6743536a7a1c0f9effd3628cd2b1adef
Author: Marco Gaido <[email protected]>
Date: 2017-11-05T11:07:04Z
adding doc
commit ca7797f3c73678b316c7630639826952910f5b8e
Author: Marco Gaido <[email protected]>
Date: 2017-11-06T09:58:25Z
added ut
commit 8a8d016599b3e940934cc4a53d93cbfa081f6279
Author: Marco Gaido <[email protected]>
Date: 2017-11-06T09:58:48Z
fixes
commit 633b21d2762877c89486b9b9e2e2ec029830bfaa
Author: Marco Gaido <[email protected]>
Date: 2017-11-06T10:53:49Z
minor
----