In RecommenderJob.java (org.apache.mahout.cf.taste.hadoop.item), what is the
primary purpose of the first MapReduce job? This is the one I am
talking about:

    if (shouldRunNextPhase(parsedArgs, currentPhase)) {
      Job itemIDIndex = prepareJob(
          inputPath, itemIDIndexPath, TextInputFormat.class,
          ItemIDIndexMapper.class, VarIntWritable.class, VarLongWritable.class,
          ItemIDIndexReducer.class, VarIntWritable.class, VarLongWritable.class,
          SequenceFileOutputFormat.class);
      itemIDIndex.setCombinerClass(ItemIDIndexReducer.class);
      itemIDIndex.waitForCompletion(true);
    }

It seems to me that the mapper just emits an int-based key for each long
item/user ID, and the reducer just keeps the smallest long ID within each
int index. Do we really need to find the lowest ID per index across the
complete dataset, and do we need to spin up a complete MapReduce job just
for that?
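To make sure I am reading it right, here is my mental model of what this job computes, sketched in plain Java without Hadoop. The idToIndex xor-fold hash below is my own stand-in for whatever ItemIDIndexMapper actually uses, and the class/method names are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;

public class ItemIndexSketch {

  // Stand-in for the mapper's long-to-int hash; the real hash lives in
  // Mahout, this xor-fold is just an assumption for illustration.
  static int idToIndex(long id) {
    return 0x7FFFFFFF & ((int) id ^ (int) (id >>> 32));
  }

  // What the reducer (and combiner) appear to do: for each int index,
  // keep the smallest long ID that hashed to it, so any hash collision
  // resolves deterministically.
  static Map<Integer, Long> buildIndex(long[] itemIDs) {
    Map<Integer, Long> indexToID = new HashMap<>();
    for (long id : itemIDs) {
      indexToID.merge(idToIndex(id), id, Math::min);
    }
    return indexToID;
  }

  public static void main(String[] args) {
    // 0L and 4294967297L (0x1_00000001) collide under this toy hash,
    // so index 0 ends up holding the smaller of the two IDs.
    long[] ids = {4294967297L, 42L, 0L};
    System.out.println(buildIndex(ids));
  }
}
```

If that reading is right, the "keep the smallest long per index" rule is, as far as I can tell, just a deterministic way to resolve collisions in the long-to-int mapping.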
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/1st-MapReduce-job-in-RecommenderJob-java-org-apache-mahout-cf-taste-hadoop-item-tp1342081p1342081.html
Sent from the Mahout User List mailing list archive at Nabble.com.