See <https://builds.apache.org/job/Mahout-Examples-Classify-20News/83/changes>
Changes:

[tdunning] MAHOUT-1063 - Integer and real attributes are handled just as any numeric attribute.

------------------------------------------
[...truncated 5091 lines...]
[INFO] Installing <https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar> to /home/jenkins/.m2/repository/org/apache/mahout/mahout-examples/0.8-SNAPSHOT/mahout-examples-0.8-SNAPSHOT-job.jar
[INFO] Installing <https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-sources.jar> to /home/jenkins/.m2/repository/org/apache/mahout/mahout-examples/0.8-SNAPSHOT/mahout-examples-0.8-SNAPSHOT-sources.jar
[INFO] ------------------------------------------------------------------------
[INFO] Building Mahout Release Package
[INFO] task-segment: [clean, install]
[INFO] ------------------------------------------------------------------------
[INFO] [clean:clean {execution: default-clean}]
[INFO] [site:attach-descriptor {execution: default-attach-descriptor}]
[INFO] [assembly:single {execution: bin-assembly}]
[INFO] Assemblies have been skipped per configuration of the skipAssembly parameter.
[INFO] [assembly:single {execution: src-assembly}]
[INFO] Assemblies have been skipped per configuration of the skipAssembly parameter.
[INFO] [install:install {execution: default-install}]
[INFO] Installing <https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/distribution/pom.xml> to /home/jenkins/.m2/repository/org/apache/mahout/mahout-distribution/0.8-SNAPSHOT/mahout-distribution-0.8-SNAPSHOT.pom
[INFO]
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] ------------------------------------------------------------------------
[INFO] Apache Mahout ......................................... SUCCESS [2.527s]
[INFO] Mahout Build Tools .................................... SUCCESS [0.895s]
[INFO] Mahout Math ........................................... SUCCESS [20.374s]
[INFO] Mahout Core ........................................... SUCCESS [18.625s]
[INFO] Mahout Integration .................................... SUCCESS [12.782s]
[INFO] Mahout Examples ....................................... SUCCESS [12.266s]
[INFO] Mahout Release Package ................................ SUCCESS [0.111s]
[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1 minute 8 seconds
[INFO] Finished at: Sat Sep 01 15:05:50 UTC 2012
[INFO] Final Memory: 75M/427M
[INFO] ------------------------------------------------------------------------
[Mahout-Examples-Classify-20News] $ /bin/bash -xe /tmp/hudson8750828104431514924.sh
+ cd trunk
+ echo 3
+ ./examples/bin/classify-20newsgroups.sh
Please select a number to choose the corresponding task to run
1. cnaivebayes
2. naivebayes
3. sgd
4. clean -- cleans up the work area in /tmp/mahout-work-jenkins
ok. You chose 3 and we'll use sgd
creating work directory at /tmp/mahout-work-jenkins
Testing on /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/ with model: /tmp/news-group.model
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/dependency/slf4j-jcl-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
12/09/01 15:05:51 WARN driver.MahoutDriver: No org.apache.mahout.classifier.sgd.TestNewsGroups.props found on classpath, will use command-line arguments only
7532 test files
=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances          :       5657       75.1062%
Incorrectly Classified Instances        :       1875       24.8938%
Total Classified Instances              :       7532

=======================================================
Confusion Matrix
-------------------------------------------------------
a    b    c    d    e    f    g    h    i    j    k    l    m    n    o    p    q    r    s    t    <--Classified as
37   7    4    9    15   131  5    27   0    0    12   1    3    2    7    13   4    2    1    105  |  385  a = comp.sys.mac.hardware
2    243  0    15   27   66   3    0    0    7    2    3    1    3    2    1    2    3    0    14   |  394  b = comp.os.ms-windows.misc
0    0    325  1    0    2    1    3    1    1    3    1    3    7    1    2    1    5    0    7    |  364  c = talk.politics.guns
0    19   0    297  24   10   6    12   0    0    4    0    3    0    0    2    2    8    1    7    |  395  d = comp.windows.x
3    9    3    22   271  26   8    6    0    2    4    1    0    0    1    2    2    6    0    23   |  389  e = comp.graphics
0    16   4    6    7    302  1    7    0    1    0    0    0    0    3    2    2    1    0    40   |  392  f = comp.sys.ibm.pc.hardware
1    1    5    1    2    3    347  5    0    4    1    0    3    5    0    0    3    2    0    11   |  394  g = sci.space
0    0    0    0    2    16   1    351  0    0    1    0    0    0    1    5    3    1    0    9    |  390  h = misc.forsale
3    1    0    1    3    3    1    2    311  29   1    24   2    1    0    0    0    1    0    15   |  398  i = soc.religion.christian
0    0    5    0    0    3    1    2    15   259  1    14   6    1    1    0    5    2    1    3    |  319  j = alt.atheism
0    0    0    1    0    2    0    3    1    3    357  1    1    1    1    0    0    0    14   12   |  397  k = rec.sport.baseball
1    0    27   1    2    2    5    0    20   55   3    117  4    4    1    0    2    2    0    5    |  251  l = talk.religion.misc
0    0    11   1    1    0    0    2    3    18   4    0    321  5    2    2    0    1    0    5    |  376  m = talk.politics.mideast
0    0    126  1    0    1    4    1    3    8    0    1    2    153  0    0    4    3    0    3    |  310  n = talk.politics.misc
0    0    0    2    0    0    1    4    0    0    1    0    0    0    370  9    1    0    0    10   |  398  o = rec.motorcycles
0    0    4    1    3    3    2    10   1    1    4    0    0    4    15   309  2    0    0    37   |  396  p = rec.autos
1    0    6    5    3    10   3    8    5    7    4    1    4    2    5    4    261  0    0    67   |  396  q = sci.med
0    1    9    2    0    3    0    1    1    0    4    0    3    2    0    1    0    349  0    20   |  396  r = sci.crypt
1    0    5    0    0    1    1    3    0    2    19   0    2    1    3    0    1    0    357  3    |  399  s = rec.sport.hockey
0    1    0    0    7    24   8    11   0    1    2    0    0    0    3    5    2    9    0    320  |  393  t = sci.electronics

Avg. Log-likelihood: -1.1267589220554224 25%-ile: -1.607494009330453 75%-ile: -0.5907892203657235
12/09/01 15:06:15 INFO driver.MahoutDriver: Program took 23398 ms (Minutes: 0.3899666666666667)
+ echo 2
+ ./examples/bin/classify-20newsgroups.sh
Please select a number to choose the corresponding task to run
1. cnaivebayes
2. naivebayes
3. sgd
4. clean -- cleans up the work area in /tmp/mahout-work-jenkins
ok. You chose 2 and we'll use naivebayes
creating work directory at /tmp/mahout-work-jenkins
+ echo 'Preparing 20newsgroups data'
Preparing 20newsgroups data
+ rm -rf /tmp/mahout-work-jenkins/20news-all
+ mkdir /tmp/mahout-work-jenkins/20news-all
+ cp -R /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/alt.atheism /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/comp.graphics /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/comp.os.ms-windows.misc /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/comp.sys.ibm.pc.hardware /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/comp.sys.mac.hardware /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/comp.windows.x /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/misc.forsale /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/rec.autos /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/rec.motorcycles /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/rec.sport.baseball /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/rec.sport.hockey /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/sci.crypt /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/sci.electronics /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/sci.med /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/sci.space /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/soc.religion.christian /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/talk.politics.guns /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/talk.politics.mideast /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/talk.politics.misc /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/talk.religion.misc /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/alt.atheism /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/comp.graphics /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/comp.os.ms-windows.misc /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/comp.sys.ibm.pc.hardware /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/comp.sys.mac.hardware /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/comp.windows.x /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/misc.forsale /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/rec.autos /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/rec.motorcycles /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/rec.sport.baseball /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/rec.sport.hockey /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/sci.crypt /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/sci.electronics /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/sci.med /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/sci.space /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/soc.religion.christian /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/talk.politics.guns /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/talk.politics.mideast /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/talk.politics.misc /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/talk.religion.misc /tmp/mahout-work-jenkins/20news-all
+ echo 'Creating sequence files from 20newsgroups data'
Creating sequence files from 20newsgroups data
+ ./bin/mahout seqdirectory -i /tmp/mahout-work-jenkins/20news-all -o /tmp/mahout-work-jenkins/20news-seq
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/dependency/slf4j-jcl-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
12/09/01 15:06:26 INFO common.AbstractJob: Command line arguments: {--charset=[UTF-8], --chunkSize=[64], --endPhase=[2147483647], --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter], --input=[/tmp/mahout-work-jenkins/20news-all], --keyPrefix=[], --output=[/tmp/mahout-work-jenkins/20news-seq], --startPhase=[0], --tempDir=[temp]}
12/09/01 15:06:30 INFO driver.MahoutDriver: Program took 5132 ms (Minutes: 0.08553333333333334)
+ echo 'Converting sequence files to vectors'
Converting sequence files to vectors
+ ./bin/mahout seq2sparse -i /tmp/mahout-work-jenkins/20news-seq -o /tmp/mahout-work-jenkins/20news-vectors -lnorm -nv -wt tfidf
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/dependency/slf4j-jcl-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
12/09/01 15:06:31 INFO vectorizer.SparseVectorsFromSequenceFiles: Maximum n-gram size is: 1
12/09/01 15:06:31 INFO vectorizer.SparseVectorsFromSequenceFiles: Minimum LLR value: 1.0
12/09/01 15:06:31 INFO vectorizer.SparseVectorsFromSequenceFiles: Number of reduce tasks: 1
12/09/01 15:06:31 INFO common.HadoopUtil: Deleting /tmp/mahout-work-jenkins/20news-vectors/tokenized-documents
12/09/01 15:06:32 INFO input.FileInputFormat: Total input paths to process : 1
12/09/01 15:06:32 INFO mapred.JobClient: Running job: job_local_0001
12/09/01 15:06:33 INFO mapred.JobClient: map 0% reduce 0%
12/09/01 15:06:36 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/09/01 15:06:36 INFO mapred.LocalJobRunner:
12/09/01 15:06:36 INFO mapred.Task: Task attempt_local_0001_m_000000_0 is allowed to commit now
12/09/01 15:06:36 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_m_000000_0' to /tmp/mahout-work-jenkins/20news-vectors/tokenized-documents
12/09/01 15:06:38 INFO mapred.LocalJobRunner:
12/09/01 15:06:38 INFO mapred.LocalJobRunner:
12/09/01 15:06:38 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
12/09/01 15:06:39 INFO mapred.JobClient: map 100% reduce 0%
12/09/01 15:06:39 INFO mapred.JobClient: Job complete: job_local_0001
12/09/01 15:06:39 INFO mapred.JobClient: Counters: 8
12/09/01 15:06:39 INFO mapred.JobClient: File Output Format Counters
12/09/01 15:06:39 INFO mapred.JobClient: Bytes Written=27717956
12/09/01 15:06:39 INFO mapred.JobClient: File Input Format Counters
12/09/01 15:06:39 INFO mapred.JobClient: Bytes Read=36979301
12/09/01 15:06:39 INFO mapred.JobClient: FileSystemCounters
12/09/01 15:06:39 INFO mapred.JobClient: FILE_BYTES_READ=67768902
12/09/01 15:06:39 INFO mapred.JobClient: FILE_BYTES_WRITTEN=58780088
12/09/01 15:06:39 INFO mapred.JobClient: Map-Reduce Framework
12/09/01 15:06:39 INFO mapred.JobClient: Map input records=18846
12/09/01 15:06:39 INFO mapred.JobClient: Spilled Records=0
12/09/01 15:06:39 INFO mapred.JobClient: SPLIT_RAW_BYTES=113
12/09/01 15:06:39 INFO mapred.JobClient: Map output records=18846
12/09/01 15:06:39 INFO common.HadoopUtil: Deleting /tmp/mahout-work-jenkins/20news-vectors/wordcount
12/09/01 15:06:39 INFO input.FileInputFormat: Total input paths to process : 1
12/09/01 15:06:39 INFO mapred.JobClient: Running job: job_local_0002
12/09/01 15:06:39 INFO mapred.MapTask: io.sort.mb = 100
12/09/01 15:06:39 INFO mapred.MapTask: data buffer = 79691776/99614720
12/09/01 15:06:39 INFO mapred.MapTask: record buffer = 262144/327680
12/09/01 15:06:40 INFO mapred.MapTask: Spilling map output: record full = true
12/09/01 15:06:40 INFO mapred.MapTask: bufstart = 0; bufend = 3756019; bufvoid = 99614720
12/09/01 15:06:40 INFO mapred.MapTask: kvstart = 0; kvend = 262144; length = 327680
12/09/01 15:06:40 INFO mapred.JobClient: map 0% reduce 0%
12/09/01 15:06:41 INFO mapred.MapTask: Finished spill 0
12/09/01 15:06:41 INFO mapred.MapTask: Spilling map output: record full = true
12/09/01 15:06:41 INFO mapred.MapTask: bufstart = 3756019; bufend = 7655523; bufvoid = 99614720
12/09/01 15:06:41 INFO mapred.MapTask: kvstart = 262144; kvend = 196607; length = 327680
12/09/01 15:06:41 INFO mapred.MapTask: Finished spill 1
12/09/01 15:06:42 INFO mapred.MapTask: Spilling map output: record full = true
12/09/01 15:06:42 INFO mapred.MapTask: bufstart = 7655523; bufend = 11527995; bufvoid = 99614720
12/09/01 15:06:42 INFO mapred.MapTask: kvstart = 196607; kvend = 131070; length = 327680
12/09/01 15:06:42 INFO mapred.MapTask: Finished spill 2
12/09/01 15:06:42 INFO mapred.MapTask: Spilling map output: record full = true
12/09/01 15:06:42 INFO mapred.MapTask: bufstart = 11527995; bufend = 15409057; bufvoid = 99614720
12/09/01 15:06:42 INFO mapred.MapTask: kvstart = 131070; kvend = 65533; length = 327680
12/09/01 15:06:43 INFO mapred.MapTask: Finished spill 3
12/09/01 15:06:43 INFO mapred.MapTask: Spilling map output: record full = true
12/09/01 15:06:43 INFO mapred.MapTask: bufstart = 15409057; bufend = 19344187; bufvoid = 99614720
12/09/01 15:06:43 INFO mapred.MapTask: kvstart = 65533; kvend = 327677; length = 327680
12/09/01 15:06:43 INFO mapred.MapTask: Finished spill 4
12/09/01 15:06:43 INFO mapred.MapTask: Spilling map output: record full = true
12/09/01 15:06:43 INFO mapred.MapTask: bufstart = 19344187; bufend = 23225726; bufvoid = 99614720
12/09/01 15:06:43 INFO mapred.MapTask: kvstart = 327677; kvend = 262140; length = 327680
12/09/01 15:06:44 INFO mapred.MapTask: Finished spill 5
12/09/01 15:06:44 INFO mapred.MapTask: Spilling map output: record full = true
12/09/01 15:06:44 INFO mapred.MapTask: bufstart = 23225726; bufend = 27185085; bufvoid = 99614720
12/09/01 15:06:44 INFO mapred.MapTask: kvstart = 262140; kvend = 196603; length = 327680
12/09/01 15:06:44 INFO mapred.MapTask: Finished spill 6
12/09/01 15:06:45 INFO mapred.MapTask: Spilling map output: record full = true
12/09/01 15:06:45 INFO mapred.MapTask: bufstart = 27185085; bufend = 31023983; bufvoid = 99614720
12/09/01 15:06:45 INFO mapred.MapTask: kvstart = 196603; kvend = 131066; length = 327680
12/09/01 15:06:45 INFO mapred.MapTask: Finished spill 7
12/09/01 15:06:45 INFO mapred.LocalJobRunner:
12/09/01 15:06:45 INFO mapred.MapTask: Spilling map output: record full = true
12/09/01 15:06:45 INFO mapred.MapTask: bufstart = 31023983; bufend = 34976993; bufvoid = 99614720
12/09/01 15:06:45 INFO mapred.MapTask: kvstart = 131066; kvend = 65529; length = 327680
12/09/01 15:06:46 INFO mapred.MapTask: Finished spill 8
12/09/01 15:06:46 INFO mapred.MapTask: Spilling map output: record full = true
12/09/01 15:06:46 INFO mapred.MapTask: bufstart = 34976993; bufend = 38831936; bufvoid = 99614720
12/09/01 15:06:46 INFO mapred.MapTask: kvstart = 65529; kvend = 327673; length = 327680
12/09/01 15:06:46 INFO mapred.MapTask: Starting flush of map output
12/09/01 15:06:46 INFO mapred.JobClient: map 84% reduce 0%
12/09/01 15:06:46 INFO mapred.MapTask: Finished spill 9
12/09/01 15:06:46 INFO mapred.MapTask: Finished spill 10
12/09/01 15:06:46 WARN mapred.LocalJobRunner: job_local_0002
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find output/spill0.out in any of the configured local directories
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:429)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
    at org.apache.hadoop.mapred.MapOutputFile.getSpillFile(MapOutputFile.java:107)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1614)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1323)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:699)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
12/09/01 15:06:47 INFO mapred.JobClient: Job complete: job_local_0002
12/09/01 15:06:47 INFO mapred.JobClient: Counters: 11
12/09/01 15:06:47 INFO mapred.JobClient: File Input Format Counters
12/09/01 15:06:47 INFO mapred.JobClient: Bytes Read=23478272
12/09/01 15:06:47 INFO mapred.JobClient: FileSystemCounters
12/09/01 15:06:47 INFO mapred.JobClient: FILE_BYTES_READ=122036804
12/09/01 15:06:47 INFO mapred.JobClient: FILE_BYTES_WRITTEN=95117473
12/09/01 15:06:47 INFO mapred.JobClient: Map-Reduce Framework
12/09/01 15:06:47 INFO mapred.JobClient: Map output materialized bytes=0
12/09/01 15:06:47 INFO mapred.JobClient: Combine output records=292221
12/09/01 15:06:47 INFO mapred.JobClient: Map input records=15769
12/09/01 15:06:47 INFO mapred.JobClient: Spilled Records=292221
12/09/01 15:06:47 INFO mapred.JobClient: Map output bytes=33209568
12/09/01 15:06:47 INFO mapred.JobClient: SPLIT_RAW_BYTES=142
12/09/01 15:06:47 INFO mapred.JobClient: Map output records=2241900
12/09/01 15:06:47 INFO mapred.JobClient: Combine input records=2097146
Exception in thread "main" java.lang.IllegalStateException: Job failed!
    at org.apache.mahout.vectorizer.DictionaryVectorizer.startWordCounting(DictionaryVectorizer.java:360)
    at org.apache.mahout.vectorizer.DictionaryVectorizer.createTermFrequencyVectors(DictionaryVectorizer.java:171)
    at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:272)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:55)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
Build step 'Execute shell' marked build as failure
