[ 
https://issues.apache.org/jira/browse/MAHOUT-667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034705#comment-13034705
 ] 

Chris Newell commented on MAHOUT-667:
-------------------------------------

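I found a bug in AbstractFactorizer, which I introduced because I 
misunderstood how FastByIDMap.put() behaves: it returns the value previously 
associated with the key (null for a new ID), not the value just stored, so 
the first lookup of any ID returns null.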

These two methods:

{code} 
  protected Integer userIndex(long userID) {
    Integer userIndex = userIDMapping.get(userID);
    if (userIndex == null) {
      userIndex = userIDMapping.put(userID, userIDMapping.size());
    }
    return userIndex;
  }

  protected Integer itemIndex(long itemID) {
    Integer itemIndex = itemIDMapping.get(itemID);
    if (itemIndex == null) {
      itemIndex = itemIDMapping.put(itemID, itemIDMapping.size());
    }
    return itemIndex;
  }
{code} 

should be replaced by the following (note that they are also renamed to 
getUserIndex and getItemIndex):

{code}
  protected Integer getUserIndex(long userID) {
    Integer userIndex = userIDMapping.get(userID);
    if (userIndex == null) {
      userIndex = userIDMapping.size();
      userIDMapping.put(userID, userIndex);
    }
    return userIndex;
  }

  protected Integer getItemIndex(long itemID) {
    Integer itemIndex = itemIDMapping.get(itemID);
    if (itemIndex == null) {
      itemIndex = itemIDMapping.size();
      itemIDMapping.put(itemID, itemIndex);
    }
    return itemIndex;
  }
{code}
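
To illustrate the difference outside Mahout, here is a minimal sketch that uses java.util.HashMap as a stand-in for FastByIDMap. It assumes only that put() returns the previous value, as java.util.Map does. The class and method names (IndexMappingSketch, buggyIndex, fixedIndex) are hypothetical and not part of the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone illustration; HashMap stands in for FastByIDMap.
class IndexMappingSketch {

  // Buggy pattern: put() returns the value previously stored under the key,
  // which is null for a new key, so the first lookup of an ID returns null.
  static Integer buggyIndex(Map<Long, Integer> mapping, long id) {
    Integer index = mapping.get(id);
    if (index == null) {
      index = mapping.put(id, mapping.size());
    }
    return index;
  }

  // Fixed pattern: compute the new index first, store it, then return it.
  static Integer fixedIndex(Map<Long, Integer> mapping, long id) {
    Integer index = mapping.get(id);
    if (index == null) {
      index = mapping.size();
      mapping.put(id, index);
    }
    return index;
  }

  public static void main(String[] args) {
    Map<Long, Integer> buggy = new HashMap<>();
    System.out.println(buggyIndex(buggy, 42L)); // null on the first call
    System.out.println(buggyIndex(buggy, 42L)); // 0 once the entry exists

    Map<Long, Integer> fixed = new HashMap<>();
    System.out.println(fixedIndex(fixed, 42L)); // 0
    System.out.println(fixedIndex(fixed, 7L));  // 1
  }
}
```

With the buggy version, a caller that unboxes the result to an int would hit a NullPointerException the first time each ID is seen.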

> Persistent storage of factorizations in SVDRecommender
> ------------------------------------------------------
>
>                 Key: MAHOUT-667
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-667
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.5
>            Reporter: Chris Newell
>            Assignee: Sebastian Schelter
>            Priority: Minor
>             Fix For: 0.5
>
>         Attachments: persistent_svd.patch, persistent_svd_v2.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> As discussed previously (https://issues.apache.org/jira/browse/MAHOUT-640) it 
> would be beneficial to provide a persistent storage mechanism for 
> factorizations created by SVDRecommender (in package 
> org.apache.mahout.cf.taste.impl.recommender.svd) as these can be time 
> consuming to produce. It would also allow factorizations to be computed on 
> one machine then distributed to other machines providing predictions, 
> improving efficiency and scalability.
> Having a "persistence strategy" interface has been suggested that could be 
> implemented as required. I'll try to post an outline proposal for discussion 
> purposes in the next few days but any comments or suggestions would be very 
> welcome.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
