mahout itemsimilarity problem

Sameer Tilak Mon, 30 Dec 2013 10:40:49 -0800

Hi All,

I am getting the following error while executing this job:
-bash-4.1$ ./mahout itemsimilarity -i /scratch/SimilartyInput -o /scratch/SimilartyOutput -s SIMILARITY_COOCCURRENCE --maxSimilaritiesPerItem 10
13/12/30 10:30:29 INFO common.AbstractJob: Command line arguments: {--booleanData=[false], --endPhase=[2147483647], --input=[/scratch/SimilartyInput], --maxPrefs=[500], --maxSimilaritiesPerItem=[10], --minPrefsPerUser=[1], --output=[/scratch/SimilartyOutput], --similarityClassname=[SIMILARITY_COOCCURRENCE], --startPhase=[0], --tempDir=[temp]}
13/12/30 10:30:29 INFO common.AbstractJob: Command line arguments: {--booleanData=[false], --endPhase=[2147483647], --input=[/scratch/SimilartyInput], --minPrefsPerUser=[1], --output=[temp/prepareRatingMatrix], --ratingShift=[0.0], --startPhase=[0], --tempDir=[temp]}
13/12/30 10:30:30 INFO input.FileInputFormat: Total input paths to process : 5
13/12/30 10:30:30 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/12/30 10:30:30 WARN snappy.LoadSnappy: Snappy native library not loaded
13/12/30 10:30:30 INFO mapred.JobClient: Running job: job_201311111627_0468
13/12/30 10:30:31 INFO mapred.JobClient:  map 0% reduce 0%


13/12/30 10:30:46 INFO mapred.JobClient: Task Id : attempt_201311111627_0468_m_000000_0, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 1
        at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:50)
        at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:31)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)

The above error message shows up a number of times, and then I finally get the following output:

13/12/30 10:31:19 INFO mapred.JobClient: Job complete: job_201311111627_0468
13/12/30 10:31:19 INFO mapred.JobClient: Counters: 8
13/12/30 10:31:19 INFO mapred.JobClient:   Job Counters
13/12/30 10:31:19 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=115217
13/12/30 10:31:19 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/12/30 10:31:19 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/12/30 10:31:19 INFO mapred.JobClient:     Rack-local map tasks=15
13/12/30 10:31:19 INFO mapred.JobClient:     Launched map tasks=20
13/12/30 10:31:19 INFO mapred.JobClient:     Data-local map tasks=5
13/12/30 10:31:19 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/12/30 10:31:19 INFO mapred.JobClient:     Failed map tasks=1

Clearly, the preprocessing step that generates the file named in the exception below (numUsers.bin) is not being executed correctly.

Exception in thread "main" java.io.FileNotFoundException: File does not exist: /user/p529444/temp/prepareRatingMatrix/numUsers.bin
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
        at org.apache.mahout.common.HadoopUtil.readInt(HadoopUtil.java:339)
        at org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob.run(ItemSimilarityJob.java:147)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob.main(ItemSimilarityJob.java:93)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

-bash-4.1$ hadoop dfs -ls temp/
Warning: $HADOOP_HOME is deprecated.
Found 1 items
drwxr-xr-x   - p529444 supergroup          0 2013-12-30 10:30 /user/p529444/temp/prepareRatingMatrix
-bash-4.1$ hadoop dfs -ls /user/p529444/temp/prepareRatingMatrix
Warning: $HADOOP_HOME is deprecated.
Found 1 items
drwxr-xr-x   - p529444 supergroup          0 2013-12-30 10:31 /user/p529444/temp/prepareRatingMatrix/itemIDIndex

In my admin browser, I see the following stats message:
Job Setup: Successful
Status: Failed
Failure Info: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201311111627_0468_m_000000

Any help with this would be great!
