[ 
https://issues.apache.org/jira/browse/MAHOUT-359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852850#action_12852850
 ] 

Hui Wen Han commented on MAHOUT-359:
------------------------------------

if it can optimize to avoid the multiplication and avoid to  
findTopNPrefsCutoff() for Non-existence preferences,
it will improve the performance.

also find issue is :

UserVectorToCooccurrenceMapper can has 2 map task maximumly (I need test more 
here )

> org.apache.mahout.cf.taste.hadoop.item.RecommenderJob for Boolean 
> recommendation
> --------------------------------------------------------------------------------
>
>                 Key: MAHOUT-359
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-359
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>    Affects Versions: 0.4
>            Reporter: Hui Wen Han
>
> in some case there has no preference value in the input data ,the preference 
> value is set to zero,then 
> RecommenderMapper.class
>  @Override
>   public void map(LongWritable userID,
>                   VectorWritable vectorWritable,
>                   OutputCollector<LongWritable,RecommendedItemsWritable> 
> output,
>                   Reporter reporter) throws IOException {
>     
>     if ((usersToRecommendFor != null) && 
> !usersToRecommendFor.contains(userID.get())) {
>       return;
>     }
>     Vector userVector = vectorWritable.get();
>     Iterator<Vector.Element> userVectorIterator = userVector.iterateNonZero();
>     Vector recommendationVector = new 
> RandomAccessSparseVector(Integer.MAX_VALUE, 1000);
>     while (userVectorIterator.hasNext()) {
>       Vector.Element element = userVectorIterator.next();
>       int index = element.index();
>       double value = element.get();     //here will get 0.0 for Boolean 
> recommendation 
>       Vector columnVector;
>       try {
>         columnVector = cooccurrenceColumnCache.get(new IntWritable(index));
>       } catch (TasteException te) {
>         if (te.getCause() instanceof IOException) {
>           throw (IOException) te.getCause();
>         } else {
>           throw new IOException(te.getCause());
>         }
>       }
>       if (columnVector != null) {
>         columnVector.times(value).addTo(recommendationVector); //here will 
> set all score value to zero for Boolean recommendation
>       }
>     }
>     
>     Queue<RecommendedItem> topItems = new 
> PriorityQueue<RecommendedItem>(recommendationsPerUser + 1,
>         Collections.reverseOrder());
>     
>     Iterator<Vector.Element> recommendationVectorIterator = 
> recommendationVector.iterateNonZero();
>     LongWritable itemID = new LongWritable();
>     while (recommendationVectorIterator.hasNext()) {
>       Vector.Element element = recommendationVectorIterator.next();
>       int index = element.index();
>       if (userVector.get(index) == 0.0) {
>         if (topItems.size() < recommendationsPerUser) {
>           indexItemIDMap.get(new IntWritable(index), itemID);
>           topItems.add(new GenericRecommendedItem(itemID.get(), (float) 
> element.get()));
>         } else if (element.get() > topItems.peek().getValue()) {
>           indexItemIDMap.get(new IntWritable(index), itemID);
>           topItems.add(new GenericRecommendedItem(itemID.get(), (float) 
> element.get()));
>           topItems.poll();
>         }
>       }
>     }
>     
>     List<RecommendedItem> recommendations = new 
> ArrayList<RecommendedItem>(topItems.size());
>     recommendations.addAll(topItems);
>     Collections.sort(recommendations);
>     output.collect(userID, new RecommendedItemsWritable(recommendations));
>   }
> so maybe we need a option to distinguish boolean recommendation and slope one 
> recommendation.
> in ToUserVectorReducer.class
> here no need findTopNPrefsCutoff,maybe take all item.
> it's just my thinking ,maybe item is used for slope one only .
> :)
>  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to