Hi all,

I am having another issue with item similarity: for some reason the 
numUsers.bin file does not get generated. Here is the command I ran:
./mahout itemsimilarity -i /scratch/SimilartyInput -o /scratch/SimilartyOutput \
  --tempDir /scratch/Similartytemp -s SIMILARITY_COOCCURRENCE \
  --maxSimilaritiesPerItem 10
The first MR job runs, and at the end of it I see the following error:
13/12/27 12:56:57 INFO mapred.JobClient:  map 84% reduce 22%
13/12/27 12:57:00 INFO mapred.JobClient:  map 86% reduce 22%
13/12/27 12:57:05 INFO mapred.JobClient: Job complete: job_201311111627_0438
13/12/27 12:57:05 INFO mapred.JobClient: Counters: 24
13/12/27 12:57:05 INFO mapred.JobClient:   Job Counters
13/12/27 12:57:05 INFO mapred.JobClient:     Launched reduce tasks=1
13/12/27 12:57:05 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=314781
13/12/27 12:57:05 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/12/27 12:57:05 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/12/27 12:57:05 INFO mapred.JobClient:     Rack-local map tasks=12
13/12/27 12:57:05 INFO mapred.JobClient:     Launched map tasks=61
13/12/27 12:57:05 INFO mapred.JobClient:     Data-local map tasks=49
13/12/27 12:57:05 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=27061
13/12/27 12:57:05 INFO mapred.JobClient:     Failed map tasks=1
13/12/27 12:57:05 INFO mapred.JobClient:   FileSystemCounters
13/12/27 12:57:05 INFO mapred.JobClient:     HDFS_BYTES_READ=19279584
13/12/27 12:57:05 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=1310480
13/12/27 12:57:05 INFO mapred.JobClient:   File Input Format Counters
13/12/27 12:57:05 INFO mapred.JobClient:     Bytes Read=19272534
13/12/27 12:57:05 INFO mapred.JobClient:   Map-Reduce Framework
13/12/27 12:57:05 INFO mapred.JobClient:     Map output materialized bytes=189690
13/12/27 12:57:05 INFO mapred.JobClient:     Combine output records=43081
13/12/27 12:57:05 INFO mapred.JobClient:     Map input records=1294478
13/12/27 12:57:05 INFO mapred.JobClient:     Physical memory (bytes) snapshot=17124999168
13/12/27 12:57:05 INFO mapred.JobClient:     Spilled Records=43081
13/12/27 12:57:05 INFO mapred.JobClient:     Map output bytes=7755258
13/12/27 12:57:05 INFO mapred.JobClient:     CPU time spent (ms)=37540
13/12/27 12:57:05 INFO mapred.JobClient:     Total committed heap usage (bytes)=17996169216
13/12/27 12:57:05 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=129811210240
13/12/27 12:57:05 INFO mapred.JobClient:     Combine input records=1294478
13/12/27 12:57:05 INFO mapred.JobClient:     Map output records=1294478
13/12/27 12:57:05 INFO mapred.JobClient:     SPLIT_RAW_BYTES=7050

Exception in thread "main" java.io.FileNotFoundException: File does not exist: /scratch/Similartytemp/prepareRatingMatrix/numUsers.bin
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
        at org.apache.mahout.common.HadoopUtil.readInt(HadoopUtil.java:339)
        at org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob.run(ItemSimilarityJob.java:147)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob.main(ItemSimilarityJob.java:93)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

I checked the temp directory and here are its contents. I am not sure why the 
numUsers.bin file is not generated.
-bash-4.1$ hadoop dfs -ls /scratch/Similartytemp/
Warning: $HADOOP_HOME is deprecated.
Found 1 items
drwxr-xr-x   - userid supergroup          0 2013-12-27 12:56 /scratch/Similartytemp/prepareRatingMatrix
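
In case it helps anyone reproduce this, here is a minimal sketch of the follow-up checks that could be run against the same cluster. It is only a sketch under some assumptions: it uses the Hadoop 1.x command-line tools (hadoop fs -lsr, hadoop fs -test -e, hadoop job -status), the job ID is the one read from the log above, and the JobTracker may no longer retain that job, in which case the last command will not return anything useful. The variable names are just for illustration.

```shell
#!/bin/sh
# Paths and job ID copied from the post above; variable names are illustrative.
TEMP_DIR=/scratch/Similartytemp
PREPARE_DIR="$TEMP_DIR/prepareRatingMatrix"
NUM_USERS_FILE="$PREPARE_DIR/numUsers.bin"
JOB_ID=job_201311111627_0438

# Run the checks only where a hadoop client is on the PATH.
if command -v hadoop >/dev/null 2>&1; then
    # Recursively list whatever the first job wrote under the temp directory.
    hadoop fs -lsr "$TEMP_DIR"

    # Test explicitly for the file that ItemSimilarityJob tried to read.
    if hadoop fs -test -e "$NUM_USERS_FILE"; then
        echo "found: $NUM_USERS_FILE"
    else
        echo "missing: $NUM_USERS_FILE"
    fi

    # Show the job's final state and counters
    # (the log above reports "Failed map tasks=1").
    hadoop job -status "$JOB_ID"
else
    echo "hadoop client not found; skipping checks"
fi
```

The recursive listing shows whether prepareRatingMatrix contains any output at all, and the job status should show whether the first job actually succeeded, given that the log reports one failed map task.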
