Hi,
What should the input to RowSimilarityJob be?
When I passed the tfidf-vectors files as the input parameter,
I got the following error:
Oct 29, 2010 2:21:35 PM org.apache.hadoop.mapred.LocalJobRunner$Job run
WARNING: job_local_0001
java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.IntWritable
    at org.apache.mahout.math.hadoop.similarity.RowSimilarityJob$RowWeightMapper.map(RowSimilarityJob.java:1)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Oct 29, 2010 2:21:36 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Job complete: job_local_0001
Oct 29, 2010 2:21:36 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 0
Oct 29, 2010 2:21:36 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
Oct 29, 2010 2:21:36 PM org.apache.hadoop.mapred.JobClient configureCommandLineOptions
WARNING: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
Oct 29, 2010 2:21:36 PM org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 0
Oct 29, 2010 2:21:36 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Running job: job_local_0002
Oct 29, 2010 2:21:36 PM org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 0
Oct 29, 2010 2:21:36 PM org.apache.hadoop.mapred.LocalJobRunner$Job run
WARNING: job_local_0002
java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.RangeCheck(ArrayList.java:547)
    at java.util.ArrayList.get(ArrayList.java:322)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:124)
Oct 29, 2010 2:21:37 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: map 0% reduce 0%
Oct 29, 2010 2:21:37 PM org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Job complete: job_local_0002
Oct 29, 2010 2:21:37 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 0
Oct 29, 2010 2:21:37 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
Oct 29, 2010 2:21:38 PM org.apache.hadoop.mapred.JobClient configureCommandLineOptions
WARNING: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: temp/pairwiseSimilarity
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:224)
    at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:55)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
    at org.apache.mahout.math.hadoop.similarity.RowSimilarityJob.run(RowSimilarityJob.java:174)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.mahout.math.hadoop.similarity.RowSimilarityJob.main(RowSimilarityJob.java:86)
It creates the temp/weights directory, but the directory is empty,
and it does not create pairwiseSimilarity at all,
so I can explain the later errors myself.
But why do I get java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be
cast to org.apache.hadoop.io.IntWritable?
I am unable to figure that out :(
I am wondering whether my input is correct.
Regards,
Divya