That may have been it, though I'm not sure. This command seems to work:

hadoop jar mahout-core-0.5-SNAPSHOT-jar-with-dependencies.jar \
  org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob \
  --similarityClassname SIMILARITY_LOGLIKELIHOOD \
  --tempDir /user/mruno/temp \
  -o /user/mruno/output -i /user/mruno/input
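
In case it helps, here is a minimal sketch of a check that could run before
submitting the job. It assumes (I haven't confirmed this) that the failure
came from the output and temp dirs overlapping the input dir, as they did in
my first command, where both were at or under /user/mruno. The function name
check_job_dirs is made up for illustration. It is not part of Hadoop or
Mahout, and it only compares path strings without touching HDFS.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not a Hadoop or Mahout tool).
# Usage: check_job_dirs INPUT_DIR OUTPUT_DIR TEMP_DIR
# Returns 0 when the output and temp dirs are distinct from each other
# and neither one is the input dir or sits inside it; returns 1 otherwise.
check_job_dirs() {
    # Strip one trailing slash so "/a/b/" and "/a/b" compare equal.
    input=${1%/}; output=${2%/}; temp=${3%/}

    for dir in "$output" "$temp"; do
        case "$dir" in
            "$input"|"$input"/*)
                echo "error: $dir overlaps input dir $input" >&2
                return 1 ;;
        esac
    done

    if [ "$output" = "$temp" ]; then
        echo "error: output and temp dir are both $output" >&2
        return 1
    fi
    return 0
}
```

With the paths from the working command (/user/mruno/input,
/user/mruno/output, /user/mruno/temp) the check passes. With the paths from
my first attempt (/user/mruno, /user/mruno, /user/mruno/temp) it reports an
overlap and returns non-zero.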

I hope this helps anyone who's trying to run this stuff.

--Matthew

On Thu, Feb 24, 2011 at 11:46 AM, Sebastian Schelter <[email protected]> wrote:
> Hi Matthew,
>
> I can't really see what's wrong. The only thing that makes me wonder is
> that your input and output dirs are the same. Are you sure that's right?
>
> --sebastian
>
> 2011/2/24 Matthew Runo <[email protected]>:
>> Hello folks -
>>
>> I made an attempt at running the ItemSimilarityJob on our hadoop
>> cluster today, but I can't seem to get past this error:
>>
>> 11/02/24 09:18:17 INFO mapred.JobClient: Task Id : attempt_201102231433_0008_m_000070_0, Status : FAILED
>> java.io.FileNotFoundException: File does not exist: /user/mruno/temp
>>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1586)
>>        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1577)
>>        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:428)
>>        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187)
>>        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456)
>>        at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:67)
>>        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:450)
>>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>>        at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>>        at java.security.AccessController.doPrivileged(Native Method)
>>        at javax.security.auth.Subject.doAs(Subject.java:396)
>>        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>>        at org.apache.hadoop.mapred.Child.main(Child.java:234)
>>
>> I ran the job with this command:
>>
>> hadoop jar mahout-core-0.5-SNAPSHOT-jar-with-dependencies.jar \
>>   org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob \
>>   -Dmapred.input.dir=/user/mruno -Dmapred.output.dir=/user/mruno \
>>   --similarityClassname SIMILARITY_LOGLIKELIHOOD \
>>   --tempDir /user/mruno/temp
>>
>> The "first" job runs fine, but the ones it spawns after that always fail:
>> ItemSimilarityJob-ItemIDIndexMapper-ItemIDIndexReducer (runs fine)
>> ItemSimilarityJob-CountUsersMapper-CountUsersReducer (fails with above error)
>> ItemSimilarityJob-ToItemPrefsMapper-ToUserVectorReducer (fails with above error)
>>
>> If I look in HDFS, I have the following directory structure:
>> /user/mruno/input-data-file.csv
>> /user/mruno/temp/countUsers/...
>> /user/mruno/temp/itemIDIndex/...
>> /user/mruno/temp/userVectors/...
>>
>> ...so the path I gave for --tempDir obviously exists and is writable.
>> After all, the job itself created everything listed above except the
>> input file.
>>
>> Does anyone have an idea on this? I'm sort of lost as to where to
>> start, since the exception isn't all that helpful.
>>
>> If I look at the job's XML file, I see that it has mapred.output.dir
>> set to /user/mruno/temp/userVectors, which does exist there.
>>
>> I'd appreciate any ideas, and I apologize if this would be better
>> asked on the Hadoop mailing list, but I thought I'd try here first
>> since it is specific to the ItemSimilarityJob.
>>
>
