Sameer,
which version of Mahout are you using?
--sebastian
On 30.12.2013 19:39, Sameer Tilak wrote:
> Hi All, I am getting the following error while executing this job:
> -bash-4.1$ ./mahout itemsimilarity -i /scratch/SimilartyInput -o
> /scratch/SimilartyOutput -s SIMILARITY_COOCCURRENCE --maxSimilaritiesPerItem
> 10
> 13/12/30 10:30:29 INFO common.AbstractJob: Command line arguments:
> {--booleanData=[false], --endPhase=[2147483647],
> --input=[/scratch/SimilartyInput], --maxPrefs=[500],
> --maxSimilaritiesPerItem=[10], --minPrefsPerUser=[1],
> --output=[/scratch/SimilartyOutput],
> --similarityClassname=[SIMILARITY_COOCCURRENCE], --startPhase=[0],
> --tempDir=[temp]}
> 13/12/30 10:30:29 INFO common.AbstractJob: Command line arguments:
> {--booleanData=[false], --endPhase=[2147483647],
> --input=[/scratch/SimilartyInput], --minPrefsPerUser=[1],
> --output=[temp/prepareRatingMatrix], --ratingShift=[0.0], --startPhase=[0],
> --tempDir=[temp]}
> 13/12/30 10:30:30 INFO input.FileInputFormat: Total input paths to process : 5
> 13/12/30 10:30:30 INFO util.NativeCodeLoader: Loaded the native-hadoop library
> 13/12/30 10:30:30 WARN snappy.LoadSnappy: Snappy native library not loaded
> 13/12/30 10:30:30 INFO mapred.JobClient: Running job: job_201311111627_0468
> 13/12/30 10:30:31 INFO mapred.JobClient: map 0% reduce 0%
>
> 13/12/30 10:30:46 INFO mapred.JobClient: Task Id :
> attempt_201311111627_0468_m_000000_0, Status : FAILED
> java.lang.ArrayIndexOutOfBoundsException: 1
>     at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:50)
>     at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:31)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> The above error message shows up a number of times and then finally I get the
> following message.
> 13/12/30 10:31:19 INFO mapred.JobClient: Job complete: job_201311111627_0468
> 13/12/30 10:31:19 INFO mapred.JobClient: Counters: 8
> 13/12/30 10:31:19 INFO mapred.JobClient: Job Counters
> 13/12/30 10:31:19 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=115217
> 13/12/30 10:31:19 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
> 13/12/30 10:31:19 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
> 13/12/30 10:31:19 INFO mapred.JobClient: Rack-local map tasks=15
> 13/12/30 10:31:19 INFO mapred.JobClient: Launched map tasks=20
> 13/12/30 10:31:19 INFO mapred.JobClient: Data-local map tasks=5
> 13/12/30 10:31:19 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
> 13/12/30 10:31:19 INFO mapred.JobClient: Failed map tasks=1
> Clearly, the preprocessing step that generates this file is not getting
> executed correctly.
> Exception in thread "main" java.io.FileNotFoundException: File does not exist: /user/p529444/temp/prepareRatingMatrix/numUsers.bin
>     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
>     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
>     at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
>     at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
>     at org.apache.mahout.common.HadoopUtil.readInt(HadoopUtil.java:339)
>     at org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob.run(ItemSimilarityJob.java:147)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob.main(ItemSimilarityJob.java:93)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> -bash-4.1$ hadoop dfs -ls temp/
> Warning: $HADOOP_HOME is deprecated.
> Found 1 items
> drwxr-xr-x - p529444 supergroup 0 2013-12-30 10:30 /user/p529444/temp/prepareRatingMatrix
> -bash-4.1$ hadoop dfs -ls /user/p529444/temp/prepareRatingMatrix
> Warning: $HADOOP_HOME is deprecated.
> Found 1 items
> drwxr-xr-x - p529444 supergroup 0 2013-12-30 10:31 /user/p529444/temp/prepareRatingMatrix/itemIDIndex
>
> In my admin browser, I see the following status message:
> Job Setup: Successful
> Status: Failed
> Failure Info: # of failed Map Tasks exceeded allowed limit. FailedCount: 1.
> LastFailedTask: task_201311111627_0468_m_000000
>
> Any help with this would be great!