In RecommenderJob.java (org.apache.mahout.cf.taste.hadoop.item), what is the
primary purpose of the first MapReduce job? This is the one I am referring
to:
if (shouldRunNextPhase(parsedArgs, currentPhase)) {
  Job itemIDIndex = prepareJob(
      inputPath, itemIDIndexPath, TextInputFormat.class,
      ItemIDIndexMapper.class, VarIntWritable.class, VarLongWritable.class,
      ItemIDIndexReducer.class, VarIntWritable.class, VarLongWritable.class,
      SequenceFileOutputFormat.class);
  itemIDIndex.setCombinerClass(ItemIDIndexReducer.class);
  itemIDIndex.waitForCompletion(true);
}
It seems to me that the mapper just derives an int-based key from each long
item/user ID, and the reducer just keeps the smallest long ID that maps to
each int index. Is finding the lowest ID per index really all we want here,
and do we need to spin up an entire MapReduce job just for that?
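For the sake of discussion, here is a minimal plain-Java sketch of what I
understand the job to be doing (no Hadoop; the class and method names below
are my own, not Mahout's, and the hash function is an assumption): each long
ID is hashed down to a non-negative int index, and when two IDs collide on
the same index, the smaller ID wins, as a reducer taking the minimum would
ensure.

```java
import java.util.HashMap;
import java.util.Map;

public class ItemIdIndexSketch {

  // Hypothetical stand-in for the mapper's long-to-int conversion:
  // hash the long ID and mask off the sign bit so the index is non-negative.
  static int idToIndex(long id) {
    return 0x7FFFFFFF & Long.hashCode(id);
  }

  // Stand-in for the reducer (and combiner): per int index, keep the
  // minimum long ID seen, so collisions resolve deterministically.
  static Map<Integer, Long> buildIndex(long[] itemIds) {
    Map<Integer, Long> index = new HashMap<>();
    for (long id : itemIds) {
      index.merge(idToIndex(id), id, Math::min);
    }
    return index;
  }

  public static void main(String[] args) {
    // 1L and 4294967296L hash to the same index here, so only the
    // smaller ID (1) survives for that index.
    long[] ids = {1L, 4294967296L, 42L};
    for (Map.Entry<Integer, Long> e : buildIndex(ids).entrySet()) {
      System.out.println(e.getKey() + " -> " + e.getValue());
    }
  }
}
```

If that reading is right, the job is building an int-index-to-long-ID
mapping for the whole dataset, not merely finding a single global minimum.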
--
View this message in context:
http://lucene.472066.n3.nabble.com/1st-MapReduce-job-in-RecommenderJob-java-org-apache-mahout-cf-taste-hadoop-item-tp1342081p1342081.html
Sent from the Mahout User List mailing list archive at Nabble.com.