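Darn. You're the second person to report that this week. Change that line to what Ted suggested. The problem is a Guava incompatibility: Mahout calls Queues.newArrayDeque(), which is missing from the antiquated Guava version that Hadoop puts on the classpath, hence the NoSuchMethodError at runtime.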
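
In case it helps, here is a minimal standalone sketch of the idea behind Ted's fix: build the deque with the plain JDK constructor so nothing depends on which Guava wins on the classpath. The class name GuavaDequeCheck and both helper methods are made up for illustration and are not part of Mahout. The reflection check is only a rough diagnostic I am assuming is useful here; it reports whether the Guava that actually got loaded has the no-arg Queues.newArrayDeque().

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper class for illustration only; not part of Mahout.
public class GuavaDequeCheck {

    /** JDK-only stand-in for Guava's Queues.newArrayDeque(); needs no Guava at all. */
    public static <T> Deque<T> newStack() {
        return new ArrayDeque<T>();
    }

    /**
     * Returns true if the Guava visible on the classpath exposes the no-arg
     * Queues.newArrayDeque(). Returns false if Guava is absent or too old.
     */
    public static boolean guavaHasNewArrayDeque() {
        try {
            Class.forName("com.google.common.collect.Queues").getMethod("newArrayDeque");
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Deque<String> stack = newStack();
        stack.push("a");
        stack.push("b");
        System.out.println(stack.pop()); // prints "b" (last in, first out)
        System.out.println("Queues.newArrayDeque() available: " + guavaHasNewArrayDeque());
    }
}
```

If you run the check with the same classpath as your failing job and it reports false, the old Guava is the one being picked up, and the JDK constructor is the safe replacement.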

Sent from my iPhone

> On Mar 9, 2014, at 6:10 AM, Bikash Gupta <[email protected]> wrote:
>
> I can now run ClusteringUtils on KMeans output (I still need to check the
> scenario you mentioned). However, I am getting an error from the TDigest
> class:
>
> Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.collect.Queues.newArrayDeque()Ljava/util/ArrayDeque;
>     at org.apache.mahout.math.stats.GroupTree$1.<init>(GroupTree.java:171)
>     at org.apache.mahout.math.stats.GroupTree.iterator(GroupTree.java:169)
>     at org.apache.mahout.math.stats.GroupTree.access$300(GroupTree.java:14)
>     at org.apache.mahout.math.stats.GroupTree$2.iterator(GroupTree.java:317)
>     at org.apache.mahout.math.stats.TDigest.add(TDigest.java:105)
>     at org.apache.mahout.math.stats.TDigest.add(TDigest.java:88)
>     at org.apache.mahout.math.stats.TDigest.add(TDigest.java:76)
>     at org.apache.mahout.math.stats.OnlineSummarizer.add(OnlineSummarizer.java:57)
>     at org.apache.mahout.clustering.ClusteringUtils.summarizeClusterDistances(ClusteringUtils.java:65)
>
> A few days ago I saw a post where a user hit a similar issue with the
> TDigest class. Ted suggested replacing the line with the code below:
>
>     stack = new ArrayDeque<GroupTree>();
>
> Let me know if I am correct.
>
>> On Sun, Mar 9, 2014 at 3:18 PM, Suneel Marthi <[email protected]>
>> wrote:
>>
>> You could call ClusterQualitySummarizer, which then calls ClusteringUtils
>> to spew out the different metrics you specified.
>> For an example, see the Streaming KMeans section in
>> examples/bin/cluster-reuters.sh.
>>
>> It calls 'qualcluster' with the options
>>   -i <tf-idf vectors generated from seq2sparse>
>>   -c <output of Kmeans>
>>   -o <output file generated with the metrics>
>>
>> I have not tried this on KMeans, and since the output format of KMeans is
>> different from that of Streaming KMeans, this might just fall flat.
>> It may also fail to read some of the clusters if a cluster has only a
>> single clustered point. This is because the new TDigest summarizer expects
>> at least 2 points in order to calculate the max, quartiles, and mean.
>>
>> On Sunday, March 9, 2014 4:19 AM, Bikash Gupta <[email protected]>
>> wrote:
>>
>> Hi,
>>
>> I want to use ClusteringUtils on KMeans clusteredPoints to get
>> summarizeClusterDistances, daviesBouldinIndex, and dunnIndex.
>>
>> Is there any sample or example showing how to use these features?
>> --
>> Thanks & Regards
>> Bikash Kumar Gupta
>
> --
> Thanks & Regards
> Bikash Kumar Gupta
