[
https://issues.apache.org/jira/browse/MATH-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817399#comment-13817399
]
Thomas Neidhart commented on MATH-1031:
---------------------------------------
Yes indeed, I plan to commit this change soon.
By the way, there is also MATH-959, which proposes adding a hierarchical clusterer to CM.
I have already added a preliminary patch there that contains an optimal algorithm
for single-link clustering.
The remaining linkage methods are still to be implemented, or will be implemented
with a straightforward naive algorithm.
I mention it in case you are interested.
> Refactoring: Move variance calculation of a centroid cluster to its class
> -------------------------------------------------------------------------
>
> Key: MATH-1031
> URL: https://issues.apache.org/jira/browse/MATH-1031
> Project: Commons Math
> Issue Type: Improvement
> Affects Versions: 3.2
> Reporter: Thorsten Schäfer
> Priority: Minor
> Attachments: centroid.patch
>
>
> Users might be interested in assessing the quality of each cluster in the
> calculated clustering. This can be done by calculating the cluster's variance.
> The variance calculation is already performed in other places (e.g. for the
> MultiKMeans), but it is not available to end users.
> I propose adding this functionality to CentroidCluster. The one issue
> to consider is that the cluster does not know which distance measure was used
> to compute it. In the implementation, I chose to parameterize the method
> with a distance measure, which also lets users compare cluster quality under
> different distance measures. Alternatively, the distance measure could be
> stored as a field that is set by the clustering algorithm.
> In the patch I went with the first approach and also changed the two other
> places where the variance is calculated to use the new method.
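
To illustrate the first approach described above (passing the distance measure to the variance method), here is a minimal, self-contained sketch. It is not the code from centroid.patch and does not use the Commons Math API: the class name ClusterVarianceSketch, the Distance interface, and the variance method are hypothetical stand-ins. It also assumes that "variance" means the bias-corrected sample variance of the distances between each point and the cluster center, which may differ from what the patch computes.

```java
import java.util.List;

public class ClusterVarianceSketch {

    /** Hypothetical stand-in for a distance measure; not the Commons Math interface. */
    public interface Distance {
        double compute(double[] a, double[] b);
    }

    /** Euclidean (L2) distance. */
    public static final Distance EUCLIDEAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    };

    /** Manhattan (L1) distance. */
    public static final Distance MANHATTAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    };

    /**
     * Sample variance (divisor n - 1) of the point-to-center distances.
     * Assumption: this is the notion of cluster variance being discussed.
     * Returns NaN for an empty cluster and 0 for a single point.
     */
    public static double variance(double[] center, List<double[]> points, Distance measure) {
        int n = points.size();
        if (n == 0) {
            return Double.NaN;
        }
        if (n == 1) {
            return 0.0;
        }
        // First pass: mean distance to the center.
        double mean = 0.0;
        for (double[] p : points) {
            mean += measure.compute(p, center);
        }
        mean /= n;
        // Second pass: sum of squared deviations from the mean distance.
        double sumSq = 0.0;
        for (double[] p : points) {
            double dev = measure.compute(p, center) - mean;
            sumSq += dev * dev;
        }
        return sumSq / (n - 1);
    }
}
```

Because the measure is an argument rather than a field of the cluster, the same cluster can be scored under several measures, e.g. by calling variance(center, points, EUCLIDEAN) and variance(center, points, MANHATTAN) and comparing the results. The cost is that the caller must remember which measure the clustering algorithm actually used.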