I am able to run ClusteringUtils on KMeans successfully (I still need to check
the scenario you mentioned). However, I am getting an error from the TDigest
class:
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.collect.Queues.newArrayDeque()Ljava/util/ArrayDeque;
at org.apache.mahout.math.stats.GroupTree$1.<init>(GroupTree.java:171)
at org.apache.mahout.math.stats.GroupTree.iterator(GroupTree.java:169)
at org.apache.mahout.math.stats.GroupTree.access$300(GroupTree.java:14)
at org.apache.mahout.math.stats.GroupTree$2.iterator(GroupTree.java:317)
at org.apache.mahout.math.stats.TDigest.add(TDigest.java:105)
at org.apache.mahout.math.stats.TDigest.add(TDigest.java:88)
at org.apache.mahout.math.stats.TDigest.add(TDigest.java:76)
at org.apache.mahout.math.stats.OnlineSummarizer.add(OnlineSummarizer.java:57)
at org.apache.mahout.clustering.ClusteringUtils.summarizeClusterDistances(ClusteringUtils.java:65)
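
A NoSuchMethodError like this usually means the class was found at runtime but in a different version than the one the code was compiled against, so it may help to see which jar Queues is actually loaded from. Below is a minimal sketch using only JDK reflection; the class name WhichJar and the helper locationOf are my own (hypothetical) names, and I have not run this against a Mahout/Hadoop classpath.

```java
import java.security.CodeSource;

final class WhichJar {
    /** Returns where the named class was loaded from, as a printable string. */
    static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK core classes typically have no code source.
            return src == null ? "(bootstrap/JDK)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // The class named in the stack trace above.
        System.out.println(locationOf("com.google.common.collect.Queues"));
    }
}
```

Running it with the same classpath as the failing job should print the jar that supplies Queues, which can then be compared with the Guava version Mahout expects.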
A few days ago I saw a post where a user hit a similar issue in the TDigest
class. Ted suggested replacing the offending line with the code below:
stack = new ArrayDeque<GroupTree>();
Please let me know if that is the right fix.
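
For reference, here is a minimal sketch of what that kind of change looks like in isolation: the deque is built with the JDK constructor, so nothing depends on Guava's Queues.newArrayDeque() factory. Node is a hypothetical stand-in for GroupTree, and the in-order traversal is only an illustration of a stack-based iterator, not the actual GroupTree code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class ArrayDequeSketch {
    /** Hypothetical stand-in for GroupTree: a plain binary tree node. */
    static final class Node {
        final int value;
        final Node left;
        final Node right;

        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    /** In-order traversal driven by an explicit stack. */
    static List<Integer> inOrder(Node root) {
        // JDK constructor instead of Guava's Queues.newArrayDeque().
        Deque<Node> stack = new ArrayDeque<Node>();
        List<Integer> out = new ArrayList<Integer>();
        Node current = root;
        while (current != null || !stack.isEmpty()) {
            // Walk down the left spine, remembering the path.
            while (current != null) {
                stack.push(current);
                current = current.left;
            }
            current = stack.pop();
            out.add(current.value);
            current = current.right;
        }
        return out;
    }

    public static void main(String[] args) {
        Node root = new Node(2, new Node(1, null, null), new Node(3, null, null));
        System.out.println(inOrder(root)); // prints [1, 2, 3]
    }
}
```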
On Sun, Mar 9, 2014 at 3:18 PM, Suneel Marthi <[email protected]> wrote:
> You could call ClusterQualitySummarizer, which then calls ClusteringUtils to
> output the different metrics you specified.
> For an example, see the Streaming KMeans section in
> examples/bin/cluster-reuters.sh.
>
> It calls 'qualcluster' with options -i <tf-idf vectors generated from
> seq2sparse> -c <output of Kmeans> -o <output file generated with the
> metrics>
>
>
> I have not tried this on KMeans, and since the output format of KMeans is
> different from that of Streaming KMeans, this might just fall flat.
> It may also fail to read some of the clusters if they contain only a
> single clustered point; this is due to the new TDigest summarizer, which
> expects at least 2 points in order to calculate the max, quartiles and mean.
>
> On Sunday, March 9, 2014 4:19 AM, Bikash Gupta <[email protected]>
> wrote:
>
> Hi,
>
> I want to use ClusteringUtils on KMeans clusteredPoints to get
> summarizeClusterDistances, daviesBouldinIndex & dunnIndex.
>
> Is there any sample or example showing how to use these features?
> --
> Thanks & Regards
> Bikash Kumar Gupta
>
--
Thanks & Regards
Bikash Kumar Gupta