Hi Matthew,

I can't see anything obviously wrong. The only thing that makes me
wonder is that your input and output dirs are the same. Are you sure
that's right?
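
A guess, not something I have verified: since your --tempDir
(/user/mruno/temp) sits inside the input dir (/user/mruno), the later
jobs may be picking up the temp directory as if it were an input file,
which would match the FileNotFoundException on /user/mruno/temp. Below
is a rough sketch of a layout that keeps input, output and temp apart.
The directory names input, similarity-output and similarity-temp are
made up for illustration, and the script only prints the commands
unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Hypothetical layout: these three paths are assumptions, not taken from your setup.
INPUT_DIR=/user/mruno/input               # should contain only the input CSV
OUTPUT_DIR=/user/mruno/similarity-output  # should not exist before the job starts
TEMP_DIR=/user/mruno/similarity-temp      # a sibling of the input dir, not inside it

# Print each command by default; set DRY_RUN=0 to execute it on the cluster.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Move the input file into its own directory so nothing else is read as input.
run hadoop fs -mkdir "$INPUT_DIR"
run hadoop fs -mv /user/mruno/input-data-file.csv "$INPUT_DIR/"

# Same job and options as in your mail, with three distinct directories.
run hadoop jar mahout-core-0.5-SNAPSHOT-jar-with-dependencies.jar \
  org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob \
  -Dmapred.input.dir="$INPUT_DIR" -Dmapred.output.dir="$OUTPUT_DIR" \
  --similarityClassname SIMILARITY_LOGLIKELIHOOD --tempDir "$TEMP_DIR"
```

If the job gets further with this layout, the shared directory was
probably the culprit. If it fails in the same way, the cause is
somewhere else.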

--sebastian

2011/2/24 Matthew Runo <[email protected]>:
> Hello folks -
>
> I made an attempt at running the ItemSimilarityJob on our hadoop
> cluster today, but I can't seem to get past this error:
>
> 11/02/24 09:18:17 INFO mapred.JobClient: Task Id : attempt_201102231433_0008_m_000070_0, Status : FAILED
> java.io.FileNotFoundException: File does not exist: /user/mruno/temp
>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1586)
>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1577)
>        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:428)
>        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187)
>        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
>        at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:67)
>        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:450)
>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>        at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at javax.security.auth.Subject.doAs(Subject.java:396)
>        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>        at org.apache.hadoop.mapred.Child.main(Child.java:234)
>
> I ran the job with this command:
>
> hadoop jar mahout-core-0.5-SNAPSHOT-jar-with-dependencies.jar
> org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob
> -Dmapred.input.dir=/user/mruno -Dmapred.output.dir=/user/mruno
> --similarityClassname SIMILARITY_LOGLIKELIHOOD --tempDir
> /user/mruno/temp
>
> The "first" job runs fine, but the second one it spawns after that
> always fails:
> ItemSimilarityJob-ItemIDIndexMapper-ItemIDIndexReducer (runs fine)
> ItemSimilarityJob-CountUsersMapper-CountUsersReducer (fails with above error)
> ItemSimilarityJob-ToItemPrefsMapper-ToUserVectorReducer (fails with above error)
>
> If I look in HDFS, I have the following directory structure:
> /user/mruno/input-data-file.csv
> /user/mruno/temp/countUsers/...
> /user/mruno/temp/itemIDIndex/...
> /user/mruno/temp//userVectors/...
>
> ...so obviously the path I gave for --tempDir exists and is writable.
> After all, the job created everything listed above just fine, except
> for the input file.
>
> Does anyone have an idea on this? I'm sort of lost as to where to
> start; the exception isn't all that helpful.
>
> If I look at the job's XML file, I see that it has mapred.output.dir
> set to /user/mruno/temp/userVectors, which does exist there.
>
> I'd appreciate any ideas, and I apologize if this would be better
> asked on the Hadoop message list but I thought I'd try here first
> since it was specific to the ItemSimilarityJob.
>
