Hi everyone,
I am running Hadoop 1.0.4 with Mahout 0.7.
I am trying to write a Java program that uses the Mahout jar library and
connects it to the HDFS I have set up on the machine.
I use a JobConf as the configuration and write the data that I want Mahout
to process into HDFS.
Here is the code I run from Eclipse, with all the necessary jars on the build path:
// conf is the JobConf mentioned above
KMeansDriver km = new KMeansDriver();
km.setConf(conf);
Path output = new Path("output");
km.run(new String[] {
    "-i", "testdata/points/hiveDataPoints.txt",
    "-c", "testdata/clusters",
    "-dm", "org.apache.mahout.common.distance.EuclideanDistanceMeasure",
    "-o", "output", "-x", "10", "-cl", "-cd", "0.001", "-ow" });
I get the following errors:
13/05/16 14:34:04 INFO common.AbstractJob: Command line arguments:
{--clustering=null, --clusters=[testdata/clusters],
--convergenceDelta=[0.001],
--distanceMeasure=[org.apache.mahout.common.distance.EuclideanDistanceMeasure],
--endPhase=[2147483647], --input=[testdata/points/hiveDataPoints.txt],
--maxIter=[10], --method=[mapreduce], --output=[output], --overwrite=null,
--startPhase=[0], --tempDir=[temp]}
13/05/16 14:34:05 INFO common.HadoopUtil: Deleting output
13/05/16 14:34:05 INFO kmeans.KMeansDriver: Input:
testdata/points/hiveDataPoints.txt Clusters In: testdata/clusters Out:
output Distance: org.apache.mahout.common.distance.EuclideanDistanceMeasure
13/05/16 14:34:05 INFO kmeans.KMeansDriver: convergence: 0.001 max
Iterations: 10 num Reduce Tasks: org.apache.mahout.math.VectorWritable
Input Vectors: {}
13/05/16 14:34:05 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
Cluster Iterator running iteration 1 over priorPath: output/clusters-0
13/05/16 14:34:05 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
13/05/16 14:34:05 INFO input.FileInputFormat: Total input paths to process
: 1
13/05/16 14:34:05 INFO mapred.JobClient: Running job: job_201305161146_0046
13/05/16 14:34:06 INFO mapred.JobClient: map 0% reduce 0%
13/05/16 14:34:21 INFO mapred.JobClient: Task Id :
attempt_201305161146_0046_m_000000_0, Status : FAILED
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:571)
at java.util.ArrayList.get(ArrayList.java:349)
at
org.apache.mahout.clustering.classify.ClusterClassifier.readFromSeqFiles(ClusterClassifier.java:217)
at
org.apache.mahout.clustering.iterator.CIMapper.setup(CIMapper.java:36)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
13/05/16 14:34:27 INFO mapred.JobClient: Task Id :
attempt_201305161146_0046_m_000000_1, Status : FAILED
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:571)
at java.util.ArrayList.get(ArrayList.java:349)
at
org.apache.mahout.clustering.classify.ClusterClassifier.readFromSeqFiles(ClusterClassifier.java:217)
at
org.apache.mahout.clustering.iterator.CIMapper.setup(CIMapper.java:36)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
13/05/16 14:34:33 INFO mapred.JobClient: Task Id :
attempt_201305161146_0046_m_000000_2, Status : FAILED
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:571)
at java.util.ArrayList.get(ArrayList.java:349)
at
org.apache.mahout.clustering.classify.ClusterClassifier.readFromSeqFiles(ClusterClassifier.java:217)
at
org.apache.mahout.clustering.iterator.CIMapper.setup(CIMapper.java:36)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
13/05/16 14:34:45 INFO mapred.JobClient: Job complete: job_201305161146_0046
13/05/16 14:34:45 INFO mapred.JobClient: Counters: 7
13/05/16 14:34:45 INFO mapred.JobClient: Job Counters
13/05/16 14:34:45 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=25998
13/05/16 14:34:45 INFO mapred.JobClient: Total time spent by all
reduces waiting after reserving slots (ms)=0
13/05/16 14:34:45 INFO mapred.JobClient: Total time spent by all maps
waiting after reserving slots (ms)=0
13/05/16 14:34:45 INFO mapred.JobClient: Launched map tasks=4
13/05/16 14:34:45 INFO mapred.JobClient: Data-local map tasks=4
13/05/16 14:34:45 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
13/05/16 14:34:45 INFO mapred.JobClient: Failed map tasks=1
Exception in thread "main" java.lang.InterruptedException: Cluster
Iteration 1 failed processing output/clusters-1
at
org.apache.mahout.clustering.iterator.ClusterIterator.iterateMR(ClusterIterator.java:186)
at
org.apache.mahout.clustering.kmeans.KMeansDriver.buildClusters(KMeansDriver.java:230)
at
org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:150)
at
org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:109)
at com.cyril.mahout.KMeansClustering.main(KMeansClustering.java:93)
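For what it is worth, the exception in the failed map tasks is the one that ArrayList.get(0) raises on an empty list. Below is a minimal sketch that reproduces only that exception type. It is independent of Mahout and Hadoop, and the class name EmptyListDemo and the variable names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone class; it does not use Mahout or Hadoop.
final class EmptyListDemo {

    // Returns the first element, the same access pattern as the
    // ArrayList.get(0) call that appears in the stack trace.
    static String first(List<String> items) {
        return items.get(0);
    }

    public static void main(String[] args) {
        // Stands in for a list that was expected to hold clusters but is empty.
        List<String> clusters = new ArrayList<>();
        try {
            first(clusters);
        } catch (IndexOutOfBoundsException e) {
            // The exact message wording differs between Java versions.
            System.out.println("Caught: " + e);
        }
    }
}
```

This suggests that the list read inside ClusterClassifier.readFromSeqFiles was empty. That reading is an assumption; it has not been verified against the Mahout source.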
Best Regards and thank you in advance for an answer.