Hi all,

I am trying to run the example from https://cwiki.apache.org/confluence/display/MAHOUT/Itembased+Collaborative+Filtering with the following command:

bin/mahout org.apache.mahout.cf.taste.hadoop.item.RecommenderJob -Dmapred.input.dir=input -Dmapred.output.dir=output --itemsFile itemfile --tempDir tempDir

The algorithm estimates the preference of a user towards an item which he/she has not yet seen. Once an algorithm can predict preferences, it can also be used for Top-N recommendation, where the task is to find the N items a given user might like best. It is mentioned there that, given a DataModel, it can produce recommendations.

The job takes approx. 5 minutes to generate the top 5 recommendations for one user on a 10-node Hadoop cluster. The input is cut down to only 200 users from the "1 Million MovieLens Dataset" from Grouplens.org.

I have a few questions:

1) Is it possible to separate the data model building step from the step that generates recommendations?
2) Can the model, once built from the training data, be reused to generate recommendations for a range of users?
3) To be specific, if I want to provide an on-line service that generates recommendations for users, can I minimize the cost of the MapReduce interactions on each request?

I am not a data mining expert. Please help me understand this better.

Thanks and Regards,
Amit
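
P.S. To make questions 1 and 2 concrete, here is roughly the separation I have in mind, as a toy Python sketch. It is not Mahout code and does not reflect how RecommenderJob works internally. The data, the function names (build_item_similarities, recommend) and the choice of cosine similarity are my own assumptions, made up only for illustration.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy preference data: user -> {item: rating}.
PREFS = {
    "u1": {"A": 5.0, "B": 3.0, "C": 4.0},
    "u2": {"A": 4.0, "B": 2.0, "D": 5.0},
    "u3": {"B": 4.0, "C": 5.0, "D": 3.0},
}


def build_item_similarities(prefs):
    """Step 1 (done once, offline): cosine similarity between item rating vectors."""
    # Invert the data: item -> {user: rating}.
    item_vectors = defaultdict(dict)
    for user, ratings in prefs.items():
        for item, rating in ratings.items():
            item_vectors[item][user] = rating

    sims = defaultdict(dict)
    items = sorted(item_vectors)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            common = set(item_vectors[a]) & set(item_vectors[b])
            if not common:
                continue  # no co-rating users, so no similarity is stored
            dot = sum(item_vectors[a][u] * item_vectors[b][u] for u in common)
            norm_a = sqrt(sum(r * r for r in item_vectors[a].values()))
            norm_b = sqrt(sum(r * r for r in item_vectors[b].values()))
            sim = dot / (norm_a * norm_b)
            sims[a][b] = sim
            sims[b][a] = sim
    return sims


def recommend(user, prefs, sims, n=5):
    """Step 2 (per request): similarity-weighted average over the user's rated items."""
    rated = prefs[user]
    scores = []
    for item in set(sims) - set(rated):  # only items the user has not rated
        num = den = 0.0
        for other, rating in rated.items():
            s = sims[item].get(other, 0.0)
            num += s * rating
            den += abs(s)
        if den > 0:
            scores.append((item, num / den))
    # Highest estimated preference first; ties broken by item id.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:n]


# The similarities are built once and then reused for several users.
MODEL = build_item_similarities(PREFS)
for user in ("u1", "u2"):
    print(user, recommend(user, PREFS, MODEL, n=5))
```

In this sketch the expensive part (the item-item similarities) is computed once, and each later request only reads from it. What I would like to know is whether Mahout supports a comparable split.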