Hi all,

I'm new to Mahout and have been working through the MiA book. Lately I've been trying the NewsKMeansClustering example from Chapter 10, as it looks like a good starting point for my own work, but I've run into a problem just trying to run it and view the output.
I'm trying to view the output of running the Java file via the cluster dump utility, but all I get out of it is an empty text file. I'm using MiA-mahout-0.6 and mahout-distribution-0.6.

This is the process I went through to get to this point.

Get the Reuters data and put it into seqfiles. (I issue these commands to bin/mahout in the mahout-distribution-0.6 project.)

    mvn -e -q exec:java -Dexec.mainClass="org.apache.lucene.benchmark.utils.ExtractReuters" -Dexec.args="reuters/ reuters-extracted/"

    bin/mahout seqdirectory -c UTF-8 -i examples/reuters-extracted/ -o reuters-seqfiles

I then move the seq files manually (drag and drop) into the reuters-seqfiles folder of the MiA (0.6) project.

I then run the MiA example NewsKMeansClustering from Chapter 10, which results in a folder newsClusters being created and populated with various files (clusters folder, dictionary.file-0, centroids folder, etc.).

There don't appear to be any unusual errors in the console:

    2013-01-30 11:15:42.593 java[11011:1903] Unable to load realm info from SCDynamicStore
    SLF4J: The requested version 1.5.11 by your slf4j binding is not compatible with [1.6]
    SLF4J: See http://www.slf4j.org/codes.html#version_mismatch for further details.
    2013-01-30 11:15:45 JobClient [WARN] Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    . (same as above line)
    .
    .
    2013-01-30 11:16:55 NativeCodeLoader [WARN] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    2013-01-30 11:16:56 JobClient [WARN] Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    . (same as above line)
    .
    .

I then run the cluster dump command to create an output.txt file:

    ../mahout-distribution-0.6/bin/mahout clusterdump -s newsClusters/clusters/clusters-19/ -o output.txt -d newsClusters/dictionary.file-0 -dt sequencefile -n 10

but all this does is create an empty text file.

Any help would be much appreciated.

Thanks,
Chris
