No, a dictionary is not a mapping of 'crisp keywords' to clusters. A dictionary maps each keyword (term) to a unique integer ID.
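To make the term-to-ID idea concrete, here is a minimal Python sketch. It is not Mahout's code and it does not reproduce Mahout's on-disk dictionary format; the function names (`build_dictionary`, `invert`, `is_integer_field`) are hypothetical and exist only for illustration. It shows the mapping itself, and why a term such as "aajproperty.com" fails when a parser expects an integer, which is what the `Integer.parseInt` frames in the stack trace indicate.

```python
def build_dictionary(terms):
    """Assign each distinct term a unique integer ID, in first-seen order."""
    dictionary = {}
    for term in terms:
        if term not in dictionary:
            dictionary[term] = len(dictionary)
    return dictionary


def invert(dictionary):
    """Map integer ID back to term, e.g. to label the dimensions of a vector."""
    return {idx: term for term, idx in dictionary.items()}


def is_integer_field(text):
    """Return True if text parses as an integer, False otherwise.

    Python's int() raising ValueError plays the role that
    NumberFormatException plays for Integer.parseInt in Java.
    """
    try:
        int(text)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    terms = ["property", "rent", "property", "aajproperty.com"]
    dictionary = build_dictionary(terms)
    print(dictionary)
    print(invert(dictionary))
    # A bare keyword cannot stand in for an integer ID.
    print(is_integer_field("aajproperty.com"))
    print(is_integer_field("42"))
```

A file that lists only keywords carries no integer IDs at all, so any loader that reads an ID from it will hit a term where it expected a number.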
Again, it would be easier to help if you could outline the steps you followed to generate the clusters. It seems you may have missed a step; at the very least, look at the k-means example in examples/bin/cluster-reuters.sh for the correct sequence of steps.

On Thu, Jun 26, 2014 at 5:07 AM, venkata ramana <[email protected]> wrote:

> As per my understanding, the dictionary file contains crisp keywords which are
> related to the cluster. Please let me know if I am wrong.
>
> Thanks,
> Venkat
>
> On Thu, Jun 26, 2014 at 1:27 PM, Suneel Marthi <[email protected]> wrote:
>
> > It's clear from the stack trace that you have a String as key where an
> > integer was expected.
> > How did you go about building your clusters from the original input?
> >
> > On Thu, Jun 26, 2014 at 3:28 AM, venkata ramana <[email protected]> wrote:
> >
> > > Hi Mahout,
> > >
> > > I am trying to analyze my k-means clusters. I used the following command.
> > >
> > > mahout clusterdump -i /opt/49-classification/cluster-centroids -o /opt/49-classification/kmeans-cluster-output/clusteranalyze1.txt -p /opt/49-classification/kmeans-cluster-output/clusteredPoints/ -d /root/Desktop/final_feature_dictionaries.txt -dt text -e;
> > >
> > > I got the following error.
> > >
> > > hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
> > > SLF4J: Class path contains multiple SLF4J bindings.
> > > SLF4J: Found binding in [jar:file:/opt/Gouri_Sankar/mahout-distribution-0.8/mahout-examples-0.8-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > > SLF4J: Found binding in [jar:file:/opt/Gouri_Sankar/mahout-distribution-0.8/lib/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> > > SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
> > > Jun 26, 2014 12:43:40 PM org.slf4j.impl.JCLLoggerAdapter info
> > > INFO: Command line arguments:
> > > {--dictionary=[/root/Desktop/final_feature_dictionaries.txt],
> > > --dictionaryType=[text],
> > > --distanceMeasure=[org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure],
> > > --endPhase=[2147483647], --evaluate=null,
> > > --input=[/opt/49-classification/cluster-centroids],
> > > --output=[/opt/49-classification/kmeans-cluster-output/clusteranalyze1.txt],
> > > --outputFormat=[TEXT],
> > > --pointsDir=[/opt/49-classification/kmeans-cluster-output/clusteredPoints/],
> > > --startPhase=[0], --tempDir=[temp]}
> > > Exception in thread "main" java.lang.NumberFormatException: For input string: "aajproperty.com"
> > > at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> > > at java.lang.Integer.parseInt(Integer.java:492)
> > > at java.lang.Integer.parseInt(Integer.java:527)
> > > at org.apache.mahout.utils.vectors.VectorHelper.loadTermDictionary(VectorHelper.java:218)
> > >
> > > I have not used any numbers in my dictionary file. Could you please help me
> > > on this.
> > >
> > > Thanks,
> > > Venkat
