See <https://builds.apache.org/job/Mahout-Examples-Cluster-Reuters/222/changes>

Changes:

[srowen] Improve random sampling from an iterator by choosing a number of 
elements to skip from a negative binomial distribution instead of actually 
conducting a lot of trials. Much faster when sampling rate is near 0.
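
A minimal sketch of the skip-based sampling idea in the change note above, not the Mahout code itself. The class and method names (SkipSampler, nextSkip, sample) are hypothetical. It assumes each element is kept independently with probability `rate`, so the number of elements to skip before the next kept one follows a geometric distribution (the negative binomial with a single required success). The skip count is drawn by inverse-transform sampling, one random draw per kept element instead of one per element seen.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical illustration; names are not taken from Mahout.
final class SkipSampler {

  private SkipSampler() {
  }

  // Draws how many elements to skip before the next kept element.
  // Geometric draw via inverse transform: floor(ln(U) / ln(1 - rate)).
  static long nextSkip(double rate, Random random) {
    if (rate >= 1.0) {
      return 0L; // keep everything, nothing to skip
    }
    double u = 1.0 - random.nextDouble(); // u is in (0, 1], so log(u) is finite
    return (long) Math.floor(Math.log(u) / Math.log1p(-rate));
  }

  // Returns a sample of the iterator's elements, in their original order.
  static <T> List<T> sample(Iterator<T> source, double rate, Random random) {
    if (!(rate > 0.0 && rate <= 1.0)) {
      throw new IllegalArgumentException("rate must be in (0, 1]: " + rate);
    }
    List<T> kept = new ArrayList<>();
    while (true) {
      long skip = nextSkip(rate, random);
      // Advance past the skipped elements without any further random draws.
      for (long i = 0; i < skip && source.hasNext(); i++) {
        source.next();
      }
      if (!source.hasNext()) {
        return kept;
      }
      kept.add(source.next());
    }
  }
}
```

With a rate near 0, most of the work is plain iteration: the random number generator is consulted roughly once per kept element, which is where the speedup described in the change note would come from under these assumptions.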

------------------------------------------
[...truncated 5049 lines...]
12/11/27 20:43:01 INFO mapred.LocalJobRunner: 
12/11/27 20:43:01 INFO mapred.Task: Task attempt_local_0003_r_000000_0 is 
allowed to commit now
12/11/27 20:43:01 INFO output.FileOutputCommitter: Saved output of task 
'attempt_local_0003_r_000000_0' to 
/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/partial-vectors-0
12/11/27 20:43:03 INFO mapred.LocalJobRunner: reduce > reduce
12/11/27 20:43:03 INFO mapred.Task: Task 'attempt_local_0003_r_000000_0' done.
12/11/27 20:43:03 INFO mapred.JobClient:  map 100% reduce 100%
12/11/27 20:43:03 INFO mapred.JobClient: Job complete: job_local_0003
12/11/27 20:43:03 INFO mapred.JobClient: Counters: 17
12/11/27 20:43:03 INFO mapred.JobClient:   File Output Format Counters 
12/11/27 20:43:03 INFO mapred.JobClient:     Bytes Written=102
12/11/27 20:43:03 INFO mapred.JobClient:   FileSystemCounters
12/11/27 20:43:03 INFO mapred.JobClient:     FILE_BYTES_READ=361111959
12/11/27 20:43:03 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=364137466
12/11/27 20:43:03 INFO mapred.JobClient:   File Input Format Counters 
12/11/27 20:43:03 INFO mapred.JobClient:     Bytes Read=101
12/11/27 20:43:03 INFO mapred.JobClient:   Map-Reduce Framework
12/11/27 20:43:03 INFO mapred.JobClient:     Map output materialized bytes=6
12/11/27 20:43:03 INFO mapred.JobClient:     Map input records=0
12/11/27 20:43:03 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/11/27 20:43:03 INFO mapred.JobClient:     Spilled Records=0
12/11/27 20:43:03 INFO mapred.JobClient:     Map output bytes=0
12/11/27 20:43:03 INFO mapred.JobClient:     Total committed heap usage 
(bytes)=573046784
12/11/27 20:43:03 INFO mapred.JobClient:     SPLIT_RAW_BYTES=159
12/11/27 20:43:03 INFO mapred.JobClient:     Combine input records=0
12/11/27 20:43:03 INFO mapred.JobClient:     Reduce input records=0
12/11/27 20:43:03 INFO mapred.JobClient:     Reduce input groups=0
12/11/27 20:43:03 INFO mapred.JobClient:     Combine output records=0
12/11/27 20:43:03 INFO mapred.JobClient:     Reduce output records=0
12/11/27 20:43:03 INFO mapred.JobClient:     Map output records=0
12/11/27 20:43:03 INFO common.HadoopUtil: Deleting 
/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tf-vectors
12/11/27 20:43:04 INFO input.FileInputFormat: Total input paths to process : 1
12/11/27 20:43:04 INFO mapred.JobClient: Running job: job_local_0004
12/11/27 20:43:04 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
12/11/27 20:43:04 INFO mapred.MapTask: io.sort.mb = 100
12/11/27 20:43:04 INFO mapred.MapTask: data buffer = 79691776/99614720
12/11/27 20:43:04 INFO mapred.MapTask: record buffer = 262144/327680
12/11/27 20:43:04 INFO mapred.MapTask: Starting flush of map output
12/11/27 20:43:04 INFO mapred.Task: Task:attempt_local_0004_m_000000_0 is done. 
And is in the process of commiting
12/11/27 20:43:05 INFO mapred.JobClient:  map 0% reduce 0%
12/11/27 20:43:07 INFO mapred.LocalJobRunner: 
12/11/27 20:43:07 INFO mapred.Task: Task 'attempt_local_0004_m_000000_0' done.
12/11/27 20:43:07 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
12/11/27 20:43:07 INFO mapred.LocalJobRunner: 
12/11/27 20:43:07 INFO mapred.Merger: Merging 1 sorted segments
12/11/27 20:43:07 INFO mapred.Merger: Down to the last merge-pass, with 0 
segments left of total size: 0 bytes
12/11/27 20:43:07 INFO mapred.LocalJobRunner: 
12/11/27 20:43:07 INFO mapred.Task: Task:attempt_local_0004_r_000000_0 is done. 
And is in the process of commiting
12/11/27 20:43:07 INFO mapred.LocalJobRunner: 
12/11/27 20:43:07 INFO mapred.Task: Task attempt_local_0004_r_000000_0 is 
allowed to commit now
12/11/27 20:43:07 INFO output.FileOutputCommitter: Saved output of task 
'attempt_local_0004_r_000000_0' to 
/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tf-vectors
12/11/27 20:43:07 INFO mapred.JobClient:  map 100% reduce 0%
12/11/27 20:43:10 INFO mapred.LocalJobRunner: reduce > reduce
12/11/27 20:43:10 INFO mapred.Task: Task 'attempt_local_0004_r_000000_0' done.
12/11/27 20:43:10 INFO mapred.JobClient:  map 100% reduce 100%
12/11/27 20:43:10 INFO mapred.JobClient: Job complete: job_local_0004
12/11/27 20:43:10 INFO mapred.JobClient: Counters: 17
12/11/27 20:43:10 INFO mapred.JobClient:   File Output Format Counters 
12/11/27 20:43:10 INFO mapred.JobClient:     Bytes Written=102
12/11/27 20:43:10 INFO mapred.JobClient:   FileSystemCounters
12/11/27 20:43:10 INFO mapred.JobClient:     FILE_BYTES_READ=481482584
12/11/27 20:43:10 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=485516104
12/11/27 20:43:10 INFO mapred.JobClient:   File Input Format Counters 
12/11/27 20:43:10 INFO mapred.JobClient:     Bytes Read=102
12/11/27 20:43:10 INFO mapred.JobClient:   Map-Reduce Framework
12/11/27 20:43:10 INFO mapred.JobClient:     Map output materialized bytes=6
12/11/27 20:43:10 INFO mapred.JobClient:     Map input records=0
12/11/27 20:43:10 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/11/27 20:43:10 INFO mapred.JobClient:     Spilled Records=0
12/11/27 20:43:10 INFO mapred.JobClient:     Map output bytes=0
12/11/27 20:43:10 INFO mapred.JobClient:     Total committed heap usage 
(bytes)=774373376
12/11/27 20:43:10 INFO mapred.JobClient:     SPLIT_RAW_BYTES=157
12/11/27 20:43:10 INFO mapred.JobClient:     Combine input records=0
12/11/27 20:43:10 INFO mapred.JobClient:     Reduce input records=0
12/11/27 20:43:10 INFO mapred.JobClient:     Reduce input groups=0
12/11/27 20:43:10 INFO mapred.JobClient:     Combine output records=0
12/11/27 20:43:10 INFO mapred.JobClient:     Reduce output records=0
12/11/27 20:43:10 INFO mapred.JobClient:     Map output records=0
12/11/27 20:43:10 INFO common.HadoopUtil: Deleting 
/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/partial-vectors-0
12/11/27 20:43:10 INFO common.HadoopUtil: Deleting 
/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/df-count
12/11/27 20:43:11 INFO input.FileInputFormat: Total input paths to process : 1
12/11/27 20:43:11 INFO mapred.JobClient: Running job: job_local_0005
12/11/27 20:43:11 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
12/11/27 20:43:11 INFO mapred.MapTask: io.sort.mb = 100
12/11/27 20:43:11 INFO mapred.MapTask: data buffer = 79691776/99614720
12/11/27 20:43:11 INFO mapred.MapTask: record buffer = 262144/327680
12/11/27 20:43:11 INFO mapred.MapTask: Starting flush of map output
12/11/27 20:43:11 INFO mapred.Task: Task:attempt_local_0005_m_000000_0 is done. 
And is in the process of commiting
12/11/27 20:43:12 INFO mapred.JobClient:  map 0% reduce 0%
12/11/27 20:43:14 INFO mapred.LocalJobRunner: 
12/11/27 20:43:14 INFO mapred.Task: Task 'attempt_local_0005_m_000000_0' done.
12/11/27 20:43:14 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
12/11/27 20:43:14 INFO mapred.LocalJobRunner: 
12/11/27 20:43:14 INFO mapred.Merger: Merging 1 sorted segments
12/11/27 20:43:14 INFO mapred.Merger: Down to the last merge-pass, with 0 
segments left of total size: 0 bytes
12/11/27 20:43:14 INFO mapred.LocalJobRunner: 
12/11/27 20:43:14 INFO mapred.Task: Task:attempt_local_0005_r_000000_0 is done. 
And is in the process of commiting
12/11/27 20:43:14 INFO mapred.LocalJobRunner: 
12/11/27 20:43:14 INFO mapred.Task: Task attempt_local_0005_r_000000_0 is 
allowed to commit now
12/11/27 20:43:14 INFO output.FileOutputCommitter: Saved output of task 
'attempt_local_0005_r_000000_0' to 
/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/df-count
12/11/27 20:43:14 INFO mapred.JobClient:  map 100% reduce 0%
12/11/27 20:43:17 INFO mapred.LocalJobRunner: reduce > reduce
12/11/27 20:43:17 INFO mapred.Task: Task 'attempt_local_0005_r_000000_0' done.
12/11/27 20:43:17 INFO mapred.JobClient:  map 100% reduce 100%
12/11/27 20:43:17 INFO mapred.JobClient: Job complete: job_local_0005
12/11/27 20:43:17 INFO mapred.JobClient: Counters: 17
12/11/27 20:43:17 INFO mapred.JobClient:   File Output Format Counters 
12/11/27 20:43:17 INFO mapred.JobClient:     Bytes Written=105
12/11/27 20:43:17 INFO mapred.JobClient:   FileSystemCounters
12/11/27 20:43:17 INFO mapred.JobClient:     FILE_BYTES_READ=601853098
12/11/27 20:43:17 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=606894223
12/11/27 20:43:17 INFO mapred.JobClient:   File Input Format Counters 
12/11/27 20:43:17 INFO mapred.JobClient:     Bytes Read=102
12/11/27 20:43:17 INFO mapred.JobClient:   Map-Reduce Framework
12/11/27 20:43:17 INFO mapred.JobClient:     Map output materialized bytes=6
12/11/27 20:43:17 INFO mapred.JobClient:     Map input records=0
12/11/27 20:43:17 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/11/27 20:43:17 INFO mapred.JobClient:     Spilled Records=0
12/11/27 20:43:17 INFO mapred.JobClient:     Map output bytes=0
12/11/27 20:43:17 INFO mapred.JobClient:     Total committed heap usage 
(bytes)=1042808832
12/11/27 20:43:17 INFO mapred.JobClient:     SPLIT_RAW_BYTES=150
12/11/27 20:43:17 INFO mapred.JobClient:     Combine input records=0
12/11/27 20:43:17 INFO mapred.JobClient:     Reduce input records=0
12/11/27 20:43:17 INFO mapred.JobClient:     Reduce input groups=0
12/11/27 20:43:17 INFO mapred.JobClient:     Combine output records=0
12/11/27 20:43:17 INFO mapred.JobClient:     Reduce output records=0
12/11/27 20:43:17 INFO mapred.JobClient:     Map output records=0
12/11/27 20:43:18 INFO input.FileInputFormat: Total input paths to process : 1
12/11/27 20:43:18 INFO filecache.TrackerDistributedCacheManager: Creating 
frequency.file-0 in 
/tmp/hadoop-hudson/mapred/local/archive/770907475383763179_1334525619_1134299390/file/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans-work--2472476765066037464
 with rwxr-xr-x
12/11/27 20:43:18 INFO filecache.TrackerDistributedCacheManager: Cached 
/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/frequency.file-0 as 
/tmp/hadoop-hudson/mapred/local/archive/770907475383763179_1334525619_1134299390/file/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/frequency.file-0
12/11/27 20:43:18 INFO filecache.TrackerDistributedCacheManager: Cached 
/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/frequency.file-0 as 
/tmp/hadoop-hudson/mapred/local/archive/770907475383763179_1334525619_1134299390/file/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/frequency.file-0
12/11/27 20:43:18 INFO mapred.JobClient: Running job: job_local_0006
12/11/27 20:43:18 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
12/11/27 20:43:18 INFO mapred.MapTask: io.sort.mb = 100
12/11/27 20:43:18 INFO mapred.MapTask: data buffer = 79691776/99614720
12/11/27 20:43:18 INFO mapred.MapTask: record buffer = 262144/327680
12/11/27 20:43:18 INFO mapred.MapTask: Starting flush of map output
12/11/27 20:43:18 INFO mapred.Task: Task:attempt_local_0006_m_000000_0 is done. 
And is in the process of commiting
12/11/27 20:43:19 INFO mapred.JobClient:  map 0% reduce 0%
12/11/27 20:43:21 INFO mapred.LocalJobRunner: 
12/11/27 20:43:21 INFO mapred.Task: Task 'attempt_local_0006_m_000000_0' done.
12/11/27 20:43:21 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
12/11/27 20:43:21 INFO mapred.LocalJobRunner: 
12/11/27 20:43:21 INFO mapred.Merger: Merging 1 sorted segments
12/11/27 20:43:21 INFO mapred.Merger: Down to the last merge-pass, with 0 
segments left of total size: 0 bytes
12/11/27 20:43:21 INFO mapred.LocalJobRunner: 
12/11/27 20:43:21 INFO mapred.JobClient:  map 100% reduce 0%
12/11/27 20:43:21 INFO mapred.Task: Task:attempt_local_0006_r_000000_0 is done. 
And is in the process of commiting
12/11/27 20:43:21 INFO mapred.LocalJobRunner: 
12/11/27 20:43:21 INFO mapred.Task: Task attempt_local_0006_r_000000_0 is 
allowed to commit now
12/11/27 20:43:21 INFO output.FileOutputCommitter: Saved output of task 
'attempt_local_0006_r_000000_0' to 
/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/partial-vectors-0
12/11/27 20:43:24 INFO mapred.LocalJobRunner: reduce > reduce
12/11/27 20:43:24 INFO mapred.Task: Task 'attempt_local_0006_r_000000_0' done.
12/11/27 20:43:24 INFO mapred.JobClient:  map 100% reduce 100%
12/11/27 20:43:24 INFO mapred.JobClient: Job complete: job_local_0006
12/11/27 20:43:24 INFO mapred.JobClient: Counters: 17
12/11/27 20:43:24 INFO mapred.JobClient:   File Output Format Counters 
12/11/27 20:43:24 INFO mapred.JobClient:     Bytes Written=102
12/11/27 20:43:24 INFO mapred.JobClient:   FileSystemCounters
12/11/27 20:43:24 INFO mapred.JobClient:     FILE_BYTES_READ=722224137
12/11/27 20:43:24 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=728276184
12/11/27 20:43:24 INFO mapred.JobClient:   File Input Format Counters 
12/11/27 20:43:24 INFO mapred.JobClient:     Bytes Read=102
12/11/27 20:43:24 INFO mapred.JobClient:   Map-Reduce Framework
12/11/27 20:43:24 INFO mapred.JobClient:     Map output materialized bytes=6
12/11/27 20:43:24 INFO mapred.JobClient:     Map input records=0
12/11/27 20:43:24 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/11/27 20:43:24 INFO mapred.JobClient:     Spilled Records=0
12/11/27 20:43:24 INFO mapred.JobClient:     Map output bytes=0
12/11/27 20:43:24 INFO mapred.JobClient:     Total committed heap usage 
(bytes)=1244135424
12/11/27 20:43:24 INFO mapred.JobClient:     SPLIT_RAW_BYTES=150
12/11/27 20:43:24 INFO mapred.JobClient:     Combine input records=0
12/11/27 20:43:24 INFO mapred.JobClient:     Reduce input records=0
12/11/27 20:43:24 INFO mapred.JobClient:     Reduce input groups=0
12/11/27 20:43:24 INFO mapred.JobClient:     Combine output records=0
12/11/27 20:43:24 INFO mapred.JobClient:     Reduce output records=0
12/11/27 20:43:24 INFO mapred.JobClient:     Map output records=0
12/11/27 20:43:24 INFO common.HadoopUtil: Deleting 
/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors
12/11/27 20:43:24 INFO input.FileInputFormat: Total input paths to process : 1
12/11/27 20:43:24 INFO mapred.JobClient: Running job: job_local_0007
12/11/27 20:43:24 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
12/11/27 20:43:24 INFO mapred.MapTask: io.sort.mb = 100
12/11/27 20:43:25 INFO mapred.MapTask: data buffer = 79691776/99614720
12/11/27 20:43:25 INFO mapred.MapTask: record buffer = 262144/327680
12/11/27 20:43:25 INFO mapred.MapTask: Starting flush of map output
12/11/27 20:43:25 INFO mapred.Task: Task:attempt_local_0007_m_000000_0 is done. 
And is in the process of commiting
12/11/27 20:43:25 INFO mapred.JobClient:  map 0% reduce 0%
12/11/27 20:43:27 INFO mapred.LocalJobRunner: 
12/11/27 20:43:27 INFO mapred.Task: Task 'attempt_local_0007_m_000000_0' done.
12/11/27 20:43:27 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
12/11/27 20:43:27 INFO mapred.LocalJobRunner: 
12/11/27 20:43:27 INFO mapred.Merger: Merging 1 sorted segments
12/11/27 20:43:27 INFO mapred.Merger: Down to the last merge-pass, with 0 
segments left of total size: 0 bytes
12/11/27 20:43:27 INFO mapred.LocalJobRunner: 
12/11/27 20:43:27 INFO mapred.Task: Task:attempt_local_0007_r_000000_0 is done. 
And is in the process of commiting
12/11/27 20:43:27 INFO mapred.LocalJobRunner: 
12/11/27 20:43:27 INFO mapred.Task: Task attempt_local_0007_r_000000_0 is 
allowed to commit now
12/11/27 20:43:27 INFO output.FileOutputCommitter: Saved output of task 
'attempt_local_0007_r_000000_0' to 
/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors
12/11/27 20:43:27 INFO mapred.JobClient:  map 100% reduce 0%
12/11/27 20:43:30 INFO mapred.LocalJobRunner: reduce > reduce
12/11/27 20:43:30 INFO mapred.Task: Task 'attempt_local_0007_r_000000_0' done.
12/11/27 20:43:30 INFO mapred.JobClient:  map 100% reduce 100%
12/11/27 20:43:30 INFO mapred.JobClient: Job complete: job_local_0007
12/11/27 20:43:30 INFO mapred.JobClient: Counters: 17
12/11/27 20:43:30 INFO mapred.JobClient:   File Output Format Counters 
12/11/27 20:43:30 INFO mapred.JobClient:     Bytes Written=102
12/11/27 20:43:30 INFO mapred.JobClient:   FileSystemCounters
12/11/27 20:43:30 INFO mapred.JobClient:     FILE_BYTES_READ=842594770
12/11/27 20:43:30 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=849654834
12/11/27 20:43:30 INFO mapred.JobClient:   File Input Format Counters 
12/11/27 20:43:30 INFO mapred.JobClient:     Bytes Read=102
12/11/27 20:43:30 INFO mapred.JobClient:   Map-Reduce Framework
12/11/27 20:43:30 INFO mapred.JobClient:     Map output materialized bytes=6
12/11/27 20:43:30 INFO mapred.JobClient:     Map input records=0
12/11/27 20:43:30 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/11/27 20:43:30 INFO mapred.JobClient:     Spilled Records=0
12/11/27 20:43:30 INFO mapred.JobClient:     Map output bytes=0
12/11/27 20:43:30 INFO mapred.JobClient:     Total committed heap usage 
(bytes)=1445462016
12/11/27 20:43:30 INFO mapred.JobClient:     SPLIT_RAW_BYTES=157
12/11/27 20:43:30 INFO mapred.JobClient:     Combine input records=0
12/11/27 20:43:30 INFO mapred.JobClient:     Reduce input records=0
12/11/27 20:43:30 INFO mapred.JobClient:     Reduce input groups=0
12/11/27 20:43:30 INFO mapred.JobClient:     Combine output records=0
12/11/27 20:43:30 INFO mapred.JobClient:     Reduce output records=0
12/11/27 20:43:30 INFO mapred.JobClient:     Map output records=0
12/11/27 20:43:31 INFO common.HadoopUtil: Deleting 
/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/partial-vectors-0
12/11/27 20:43:31 INFO driver.MahoutDriver: Program took 46352 ms (Minutes: 
0.7725333333333333)
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
12/11/27 20:43:32 INFO common.AbstractJob: Command line arguments: {--clustering=null, --clusters=[/tmp/mahout-work-hudson/reuters-kmeans-clusters], --convergenceDelta=[0.5], --distanceMeasure=[org.apache.mahout.common.distance.CosineDistanceMeasure], --endPhase=[2147483647], --input=[/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors/], --maxIter=[10], --method=[mapreduce], --numClusters=[20], --output=[/tmp/mahout-work-hudson/reuters-kmeans], --overwrite=null, --startPhase=[0], --tempDir=[temp]}
12/11/27 20:43:32 INFO common.HadoopUtil: Deleting 
/tmp/mahout-work-hudson/reuters-kmeans-clusters
12/11/27 20:43:32 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
12/11/27 20:43:33 INFO compress.CodecPool: Got brand-new compressor
12/11/27 20:43:33 INFO kmeans.RandomSeedGenerator: Wrote 20 Klusters to 
/tmp/mahout-work-hudson/reuters-kmeans-clusters/part-randomSeed
12/11/27 20:43:33 INFO kmeans.KMeansDriver: Input: 
/tmp/mahout-work-hudson/reuters-out-seqdir-sparse-kmeans/tfidf-vectors Clusters 
In: /tmp/mahout-work-hudson/reuters-kmeans-clusters/part-randomSeed Out: 
/tmp/mahout-work-hudson/reuters-kmeans Distance: 
org.apache.mahout.common.distance.CosineDistanceMeasure
12/11/27 20:43:33 INFO kmeans.KMeansDriver: convergence: 0.5 max Iterations: 10 
num Reduce Tasks: org.apache.mahout.math.VectorWritable Input Vectors: {}
12/11/27 20:43:33 INFO compress.CodecPool: Got brand-new decompressor
Exception in thread "main" java.lang.IllegalStateException: No input clusters found in /tmp/mahout-work-hudson/reuters-kmeans-clusters/part-randomSeed. Check your -c argument.
        at org.apache.mahout.clustering.kmeans.KMeansDriver.buildClusters(KMeansDriver.java:217)
        at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:148)
        at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:107)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:48)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
Build step 'Execute shell' marked build as failure
