I'm not doing too well here.
I followed the instructions supplied here, with a current 0.3 dev tree
and the correct Hadoop, and got the following. I definitely used my
0.3-SNAPSHOT Driver from my local build. This is assuming, of course,
that I can just feed the vectors from
org.apache.mahout.utils.vectors.lucene.Driver straight into the sample
KMeans job. Maybe I need to feed them directly to the KMeans class,
instead?
Preparing Input
09/12/19 16:45:58 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
09/12/19 16:45:58 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
09/12/19 16:45:59 INFO mapred.FileInputFormat: Total input paths to process : 1
09/12/19 16:45:59 INFO mapred.JobClient: Running job: job_local_0001
09/12/19 16:45:59 INFO mapred.FileInputFormat: Total input paths to process : 1
09/12/19 16:45:59 INFO mapred.MapTask: numReduceTasks: 0
09/12/19 16:45:59 WARN mapred.LocalJobRunner: job_local_0001
java.lang.NumberFormatException: For input string: "SEQ!org.apache.hadoop.io.LongWritable#org.apache.mahout.math.SparseVector*org.apache.hadoop.io.compress.DefaultCodec?A?d??"
    at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1224)
    at java.lang.Double.valueOf(Double.java:475)
    at org.apache.mahout.clustering.syntheticcontrol.canopy.InputMapper.map(InputMapper.java:51)
    at org.apache.mahout.clustering.syntheticcontrol.canopy.InputMapper.map(InputMapper.java:36)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:176)
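
The input string in the exception starts with "SEQ" followed by Writable class names, which looks like the header of a Hadoop SequenceFile, and the trace shows InputMapper passing it to Double.valueOf, so my guess is that the sample job expects plain text numbers rather than a SequenceFile of vectors. As a rough sketch for checking which kind of file I am actually feeding in, the following looks only at the first three bytes. The class and method names (SeqHeaderCheck, startsWithSeqMagic, looksLikeSequenceFile) are made up for illustration and are not part of Mahout or Hadoop.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Mahout or Hadoop.
public class SeqHeaderCheck {
    // Hadoop SequenceFiles begin with the ASCII bytes 'S', 'E', 'Q',
    // followed by a version byte.
    private static final byte[] MAGIC = "SEQ".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the given leading bytes start with the "SEQ" magic.
    public static boolean startsWithSeqMagic(byte[] head) {
        if (head == null || head.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (head[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads the first three bytes of a local file and checks them.
    public static boolean looksLikeSequenceFile(Path file) throws IOException {
        byte[] head = new byte[MAGIC.length];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until
            // the buffer is full or the stream ends.
            while (filled < head.length) {
                int n = in.read(head, filled, head.length - filled);
                if (n < 0) {
                    break;
                }
                filled += n;
            }
        }
        return filled == head.length && startsWithSeqMagic(head);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SeqHeaderCheck <local-file>");
            return;
        }
        Path file = Paths.get(args[0]);
        System.out.println(file + (looksLikeSequenceFile(file)
                ? " starts with the SEQ magic (probably a SequenceFile)"
                : " does not start with the SEQ magic (probably text or something else)"));
    }
}
```

This only reads the local filesystem, which matches the LocalJobRunner setup in the log above; for a file on HDFS it would have to be copied locally first. It also says nothing about which key and value classes the file holds, only whether it is a SequenceFile at all.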