See <https://builds.apache.org/job/Mahout-Examples-Classify-20News/156/changes>

Changes:

[ssc] one more fix

[ssc] small fixes and javadoc

[ssc] MAHOUT-1167 Parallel item similarity precomputation on a single machine

------------------------------------------
[...truncated 5172 lines...]
Mar 20, 2013 1:34:59 PM org.apache.hadoop.filecache.TrackerDistributedCacheManager downloadCacheObject
INFO: Creating model in /tmp/hadoop-jenkins/mapred/local/archive/-328114994203081503_1657971008_134367155/file/tmp/mahout-work-jenkins-work--99103836673674929 with rwxr-xr-x
Mar 20, 2013 1:34:59 PM org.apache.hadoop.filecache.TrackerDistributedCacheManager downloadCacheObject
INFO: Cached /tmp/mahout-work-jenkins/model as /tmp/hadoop-jenkins/mapred/local/archive/-328114994203081503_1657971008_134367155/file/tmp/mahout-work-jenkins/model
Mar 20, 2013 1:34:59 PM org.apache.hadoop.filecache.TrackerDistributedCacheManager localizePublicCacheObject
INFO: Cached /tmp/mahout-work-jenkins/model as /tmp/hadoop-jenkins/mapred/local/archive/-328114994203081503_1657971008_134367155/file/tmp/mahout-work-jenkins/model
Mar 20, 2013 1:34:59 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Running job: job_local_0001
Mar 20, 2013 1:34:59 PM org.apache.hadoop.util.ProcessTree isSetsidSupported
INFO: setsid exited with exit code 0
Mar 20, 2013 1:34:59 PM org.apache.hadoop.mapred.Task initialize
INFO: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1d3cdaa
Mar 20, 2013 1:34:59 PM org.apache.hadoop.io.compress.CodecPool getDecompressor
INFO: Got brand-new decompressor
Mar 20, 2013 1:35:00 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: map 0% reduce 0%
Mar 20, 2013 1:35:05 PM org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
INFO:
Mar 20, 2013 1:35:06 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: map 75% reduce 0%
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
INFO:
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0001_m_000000_0 is allowed to commit now
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter commitTask
INFO: Saved output of task 'attempt_local_0001_m_000000_0' to /tmp/mahout-work-jenkins/20news-testing
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
INFO:
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0001_m_000000_0' done.
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: map 100% reduce 0%
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Job complete: job_local_0001
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 12
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: File Output Format Counters
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: Bytes Written=1428017
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: File Input Format Counters
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: Bytes Read=8572070
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: FileSystemCounters
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_READ=78666904
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_WRITTEN=68409649
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: Map-Reduce Framework
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: Map input records=7524
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: Physical memory (bytes) snapshot=0
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: Spilled Records=0
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: Total committed heap usage (bytes)=119275520
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: CPU time spent (ms)=0
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: Virtual memory (bytes) snapshot=0
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: SPLIT_RAW_BYTES=127
Mar 20, 2013 1:35:07 PM org.apache.hadoop.mapred.Counters log
INFO: Map output records=7524
Mar 20, 2013 1:35:07 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Standard NB Results:
=======================================================
Summary
-------------------------------------------------------
Correctly Classified Instances   : 6837  90.8692%
Incorrectly Classified Instances :  687   9.1308%
Total Classified Instances       : 7524

=======================================================
Confusion Matrix
-------------------------------------------------------
   a    b    c    d    e    f    g    h    i    j    k    l    m    n    o    p    q    r    s    t  <--Classified as
 306    0    0    0    0    0    0    0    1    0    0    0    0    0    0    3    0    0   18    1  |  329  a = alt.atheism
   0  319    5   17    4   14    3    0    0    0    0    3    4    0    2    1    0    0    0    0  |  372  b = comp.graphics
   1   27  260   68   14   20    6    1    0    0    0    0    3    1    2    0    0    0    0    3  |  406  c = comp.os.ms-windows.misc
   0   12    4  328   26    5   14    1    0    0    0    0    8    0    0    0    0    0    0    0  |  398  d = comp.sys.ibm.pc.hardware
   0    6    2    3  341    1    4    1    0    0    0    1    2    0    0    0    0    0    0    0  |  361  e = comp.sys.mac.hardware
   0   14    4    3    4  352    4    0    0    1    0    1    0    0    0    0    0    0    0    0  |  383  f = comp.windows.x
   2    1    5   22   10    0  316   13    2    1    3    1    9    3    1    0    0    2    0    1  |  392  g = misc.forsale
   0    1    0    1    2    2    8  393    3    1    0    0    3    2    0    0    0    2    0    1  |  419  h = rec.autos
   0    1    0    1    0    1    1   10  399    0    0    0    2    1    1    0    0    1    0    1  |  419  i = rec.motorcycles
   0    0    0    0    1    0    1    1    1  388    3    1    2    3    0    0    0    0    0    0  |  401  j = rec.sport.baseball
   2    1    0    0    2    0    2    0    2    2  371    0    0    0    0    1    0    0    0    1  |  384  k = rec.sport.hockey
   1    7    1    2    0    3    1    0    0    0    0  391    2    2    0    0    0    3    0    2  |  415  l = sci.crypt
   0    6    0   13    8    1    3    3    0    1    0    3  334    2    0    0    0    0    1    0  |  375  m = sci.electronics
   1    1    1    2    1    0    3    0    2    0    0    1    2  386    3    0    1    1    1    2  |  408  n = sci.med
   1    2    0    0    2    0    1    0    0    0    0    0    0    1  398    0    3    0    0    1  |  409  o = sci.space
   5    1    0    0    0    0    0    0    0    0    0    0    0    3    1  403    1    0    6    2  |  422  p = soc.religion.christian
   0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    2  354    1    0    0  |  358  q = talk.politics.mideast
   0    0    0    0    0    1    1    0    1    0    0    3    0    0    0    0    1  328    1    4  |  340  r = talk.politics.guns
  15    0    0    0    0    1    0    1    0    0    0    0    0    0    1    7    2    3  201    4  |  235  s = talk.religion.misc
   1    2    0    0    0    0    0    0    0    1    0    3    0    1    3    1    5   10    2  269  |  298  t = talk.politics.misc

Mar 20, 2013 1:35:07 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Program took 9456 ms (Minutes: 0.1576)
+ echo 1
+ ./examples/bin/classify-20newsgroups.sh
Please select a number to choose the corresponding task to run
1. cnaivebayes
2. naivebayes
3. sgd
4. clean -- cleans up the work area in /tmp/mahout-work-jenkins
ok. You chose 1 and we'll use cnaivebayes
creating work directory at /tmp/mahout-work-jenkins
+ echo 'Preparing 20newsgroups data'
Preparing 20newsgroups data
+ rm -rf /tmp/mahout-work-jenkins/20news-all
+ mkdir /tmp/mahout-work-jenkins/20news-all
+ cp -R /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/alt.atheism /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/comp.graphics /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/comp.os.ms-windows.misc /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/comp.sys.ibm.pc.hardware /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/comp.sys.mac.hardware /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/comp.windows.x /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/misc.forsale /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/rec.autos /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/rec.motorcycles /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/rec.sport.baseball /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/rec.sport.hockey /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/sci.crypt /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/sci.electronics /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/sci.med
/tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/sci.space /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/soc.religion.christian /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/talk.politics.guns /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/talk.politics.mideast /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/talk.politics.misc /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-test/talk.religion.misc /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/alt.atheism /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/comp.graphics /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/comp.os.ms-windows.misc /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/comp.sys.ibm.pc.hardware /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/comp.sys.mac.hardware /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/comp.windows.x /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/misc.forsale /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/rec.autos /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/rec.motorcycles /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/rec.sport.baseball /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/rec.sport.hockey /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/sci.crypt /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/sci.electronics /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/sci.med /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/sci.space /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/soc.religion.christian /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/talk.politics.guns /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/talk.politics.mideast /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/talk.politics.misc /tmp/mahout-work-jenkins/20news-bydate/20news-bydate-train/talk.religion.misc 
/tmp/mahout-work-jenkins/20news-all
+ echo 'Creating sequence files from 20newsgroups data'
Creating sequence files from 20newsgroups data
+ ./bin/mahout seqdirectory -i /tmp/mahout-work-jenkins/20news-all -o /tmp/mahout-work-jenkins/20news-seq
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/dependency/slf4j-jcl-1.7.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
Mar 20, 2013 1:35:09 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Command line arguments: {--charset=[UTF-8], --chunkSize=[64], --endPhase=[2147483647], --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter], --input=[/tmp/mahout-work-jenkins/20news-all], --keyPrefix=[], --output=[/tmp/mahout-work-jenkins/20news-seq], --startPhase=[0], --tempDir=[temp]}
Mar 20, 2013 1:35:09 PM org.apache.hadoop.util.NativeCodeLoader <clinit>
WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Mar 20, 2013 1:35:13 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Program took 4850 ms (Minutes: 0.08083333333333333)
+ echo 'Converting sequence files to vectors'
Converting sequence files to vectors
+ ./bin/mahout seq2sparse -i /tmp/mahout-work-jenkins/20news-seq -o /tmp/mahout-work-jenkins/20news-vectors -lnorm -nv -wt tfidf
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: Found binding in [jar:<https://builds.apache.org/job/Mahout-Examples-Classify-20News/ws/trunk/examples/target/dependency/slf4j-jcl-1.7.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]>
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JCLLoggerFactory]
Mar 20, 2013 1:35:14 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Maximum n-gram size is: 1
Mar 20, 2013 1:35:14 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Minimum LLR value: 1.0
Mar 20, 2013 1:35:14 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Number of reduce tasks: 1
Mar 20, 2013 1:35:14 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Deleting /tmp/mahout-work-jenkins/20news-vectors/tokenized-documents
Mar 20, 2013 1:35:14 PM org.apache.hadoop.util.NativeCodeLoader <clinit>
WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Mar 20, 2013 1:35:15 PM org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 20, 2013 1:35:15 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Running job: job_local_0001
Mar 20, 2013 1:35:15 PM org.apache.hadoop.util.ProcessTree isSetsidSupported
INFO: setsid exited with exit code 0
Mar 20, 2013 1:35:15 PM org.apache.hadoop.mapred.Task initialize
INFO: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@21e554
Mar 20, 2013 1:35:16 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: map 0% reduce 0%
Mar 20, 2013 1:35:19 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
Mar 20, 2013 1:35:19 PM org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
INFO:
Mar 20, 2013 1:35:19 PM org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0001_m_000000_0 is allowed to commit now
Mar 20, 2013 1:35:19 PM org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter commitTask
INFO: Saved output of task 'attempt_local_0001_m_000000_0' to /tmp/mahout-work-jenkins/20news-vectors/tokenized-documents
Mar 20, 2013 1:35:19 PM org.apache.hadoop.mapred.LocalJobRunner$Job statusUpdate
INFO:
Mar 20, 2013 1:35:19 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0001_m_000000_0' done.
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: map 100% reduce 0%
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Job complete: job_local_0001
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 12
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: File Output Format Counters
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: Bytes Written=27717956
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: File Input Format Counters
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: Bytes Read=36979301
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: FileSystemCounters
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_READ=99792938
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_WRITTEN=91057970
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: Map-Reduce Framework
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: Map input records=18846
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: Physical memory (bytes) snapshot=0
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: Spilled Records=0
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: Total committed heap usage (bytes)=477626368
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: CPU time spent (ms)=0
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: Virtual memory (bytes) snapshot=0
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: SPLIT_RAW_BYTES=113
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Counters log
INFO: Map output records=18846
Mar 20, 2013 1:35:20 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Deleting /tmp/mahout-work-jenkins/20news-vectors/wordcount
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Running job: job_local_0002
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.Task initialize
INFO: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@9a9b65
Mar 20, 2013 1:35:20 PM org.apache.hadoop.mapred.LocalJobRunner$Job run
WARNING: job_local_0002
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:412)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Mar 20, 2013 1:35:21 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: map 0% reduce 0%
Mar 20, 2013 1:35:21 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Job complete: job_local_0002
Mar 20, 2013 1:35:21 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 0
Exception in thread "main" java.lang.IllegalStateException: Job failed!
    at org.apache.mahout.vectorizer.DictionaryVectorizer.startWordCounting(DictionaryVectorizer.java:360)
    at org.apache.mahout.vectorizer.DictionaryVectorizer.createTermFrequencyVectors(DictionaryVectorizer.java:171)
    at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.run(SparseVectorsFromSequenceFiles.java:273)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles.main(SparseVectorsFromSequenceFiles.java:56)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
Build step 'Execute shell' marked build as failure
