Hi Matthew,

I can't see anything obviously wrong. The only thing that makes me wonder is that your input and output directories are the same (both /user/mruno). Are you sure that's right?
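One way to rule this out, as a rough sketch only: as far as I know, FileInputFormat in this Hadoop version does not descend into subdirectories, so a temp or output directory sitting below the input directory might get picked up as an input path. That would fit the "File does not exist: /user/mruno/temp" message, but it is a guess and not verified on your cluster. The paths /user/mruno/input and /user/mruno/output and the helpers is_inside and check_dirs are made up for illustration. The script only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical layout (assumption): input, output and temp are siblings,
# so nothing the job writes ends up below the input directory.
# The input file would have to be moved first, for example:
#   hadoop fs -mkdir /user/mruno/input
#   hadoop fs -mv /user/mruno/input-data-file.csv /user/mruno/input/
INPUT_DIR=/user/mruno/input
OUTPUT_DIR=/user/mruno/output
TEMP_DIR=/user/mruno/temp

# is_inside CHILD PARENT: succeeds when CHILD equals PARENT or lies below it.
is_inside() {
  case "${1%/}/" in
    "${2%/}/"*) return 0 ;;
    *) return 1 ;;
  esac
}

# check_dirs INPUT OUTPUT TEMP: prints a message and fails when the output
# or temp directory overlaps the input directory; silent on success.
check_dirs() {
  if is_inside "$2" "$1"; then
    echo "output dir overlaps input dir"
    return 1
  fi
  if is_inside "$3" "$1"; then
    echo "temp dir overlaps input dir"
    return 1
  fi
  return 0
}

# Print (not run) the same command as in the original mail, with only the
# three paths changed.
if check_dirs "$INPUT_DIR" "$OUTPUT_DIR" "$TEMP_DIR"; then
  echo "hadoop jar mahout-core-0.5-SNAPSHOT-jar-with-dependencies.jar" \
       "org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob" \
       "-Dmapred.input.dir=$INPUT_DIR -Dmapred.output.dir=$OUTPUT_DIR" \
       "--similarityClassname SIMILARITY_LOGLIKELIHOOD --tempDir $TEMP_DIR"
fi
```

With the paths from the original command (input and output both /user/mruno, temp at /user/mruno/temp), check_dirs reports an overlap and nothing is printed.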
--sebastian

2011/2/24 Matthew Runo <[email protected]>:
> Hello folks -
>
> I made an attempt at running the ItemSimilarityJob on our hadoop
> cluster today, but I can't seem to get past this error:
>
> 11/02/24 09:18:17 INFO mapred.JobClient: Task Id :
> attempt_201102231433_0008_m_000070_0, Status : FAILED
> java.io.FileNotFoundException: File does not exist: /user/mruno/temp
>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1586)
>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1577)
>        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:428)
>        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187)
>        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
>        at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:67)
>        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:450)
>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>        at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at javax.security.auth.Subject.doAs(Subject.java:396)
>        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>        at org.apache.hadoop.mapred.Child.main(Child.java:234)
>
> I ran the job with this command:
>
> hadoop jar mahout-core-0.5-SNAPSHOT-jar-with-dependencies.jar
> org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob
> -Dmapred.input.dir=/user/mruno -Dmapred.output.dir=/user/mruno
> --similarityClassname SIMILARITY_LOGLIKELIHOOD --tempDir
> /user/mruno/temp
>
> The "first" job runs fine, but the second one it spawns after that always
> fails:
> ItemSimilarityJob-ItemIDIndexMapper-ItemIDIndexReducer (runs fine)
> ItemSimilarityJob-CountUsersMapper-CountUsersReducer (fails with above error)
> ItemSimilarityJob-ToItemPrefsMapper-ToUserVectorReducer (fails with above
> error)
>
> If I look in HDFS, I have the following directory structure:
> /user/mruno/input-data-file.csv
> /user/mruno/temp/countUsers/...
> /user/mruno/temp/itemIDIndex/...
> /user/mruno/temp//userVectors/...
>
> ...so obviously the path I gave for --tempDir exists and is writable,
> after all the job created all that stuff just fine except for the
> input file.
>
> Does anyone have an idea on this? I'm sort of lost as to where to
> start, the exception isn't all that helpful.
>
> If I look at the job's XML file, I see that it has mapred.output.dir
> set to /user/mruno/temp/userVectors, which does exist there.
>
> I'd appreciate any ideas, and I apologize if this would be better
> asked on the Hadoop message list but I thought I'd try here first
> since it was specific to the ItemSimilarityJob.
