[ 
https://issues.apache.org/jira/browse/MAHOUT-576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977769#action_12977769
 ] 

Renaud Bruyeron commented on MAHOUT-576:
----------------------------------------


I am looking for a way to fix this, and it seems that something is missing in 
the DiffStorage API:
{code}
  /**
   * <p>
   * Updates internal data structures to reflect an update in a preference value for an item.
   * </p>
   * 
   * @param itemID
   *          item to update preference value for
   * @param prefDelta
   *          amount by which preference value changed (or its old value, if being removed)
   * @param remove
   *          if <code>true</code>, operation reflects a removal rather than change of preference
   */
  void updateItemPref(long itemID, float prefDelta, boolean remove) throws TasteException;
{code}

This works when we have a true update (i.e. neither a removal nor a *new* 
preference).
However, in the case of a removal or a new preference, this method is not 
enough: the implementations actually need the delta with the peer preferences, 
not just the value being removed.
For example, if user X removes preference Pa on item A, we need the preference 
Pb of every item B that is impacted by this removal (because in the end we need 
Pb-Pa in the calculation, and not just Pa).
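
To make the arithmetic concrete, here is a small standalone sketch. It is not Mahout code: the class and method names are made up for illustration, and it assumes the stored value is a plain average of Pb-Pa over the users who rated both items. It compares removing the old diff against removing the old preference value.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: plain arithmetic, not Mahout code. All names are hypothetical.
class DiffAverageExample {

  /** Average of (Pb - Pa) over users, each given as a {Pa, Pb} pair. */
  static double averageDiff(List<float[]> prefs) {
    if (prefs.isEmpty()) {
      return 0.0;
    }
    double sum = 0.0;
    for (float[] p : prefs) {
      sum += p[1] - p[0];
    }
    return sum / prefs.size();
  }

  /** Removes one value from an average that currently covers count values. */
  static double removeFromAverage(double average, int count, double value) {
    return count <= 1 ? 0.0 : (average * count - value) / (count - 1);
  }

  public static void main(String[] args) {
    List<float[]> prefs = Arrays.asList(
        new float[] {4.0f, 5.0f},  // user X: Pa = 4, Pb = 5, diff = 1
        new float[] {2.0f, 5.0f},  // user Y: diff = 3
        new float[] {3.0f, 5.0f}); // user Z: diff = 2
    double avg = averageDiff(prefs);

    // User X removes Pa. The value leaving the average is the diff Pb - Pa.
    double correct = removeFromAverage(avg, prefs.size(), 5.0 - 4.0);
    // Treating the old preference Pa itself as the delta gives a different result.
    double wrong = removeFromAverage(avg, prefs.size(), 4.0);

    System.out.println("before removal:    " + avg);
    System.out.println("removing Pb - Pa:  " + correct);
    System.out.println("removing Pa only:  " + wrong);
  }
}
```

With these numbers the average starts at 2.0. Removing the diff gives 2.5, which matches recomputing from users Y and Z alone, while removing Pa gives 1.0.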

I suspect an API change is needed here: split the method into two, like this:
{code}
// this must be used only for a true update; it will throw a TasteException
// if used on a removal or insertion
void updateItemPref(long itemID, float prefDelta) throws TasteException;

// this must be used for a removal or insertion
void removeItemPref(long itemID, long userID, float pref, boolean removal) throws TasteException;
{code}

userID is needed to efficiently get at the peer preferences and compute deltas, 
I reckon. What do you think?
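
As a rough sketch of why userID helps, here is a hypothetical in-memory version of the removal/insertion method. It is not the real DiffStorage: the class name, the method name changeItemPref, and the maps are assumptions, and the prefs map only stands in for whatever the real implementation would query to find the user's other preferences.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory sketch, not Mahout's DiffStorage. The prefs map stands
// in for whatever the real implementation would query (DataModel or SQL).
class InMemoryDiffSketch {
  // userID -> (itemID -> preference)
  private final Map<Long, Map<Long, Float>> prefs = new HashMap<>();
  // "a:b" with a < b -> {sum of (Pb - Pa), number of users}
  private final Map<String, double[]> diffs = new HashMap<>();

  /** Insertion (removal == false) or removal (removal == true) of one preference. */
  void changeItemPref(long itemID, long userID, float pref, boolean removal) {
    Map<Long, Float> userPrefs = prefs.computeIfAbsent(userID, k -> new HashMap<>());
    if (removal) {
      if (userPrefs.remove(itemID) == null) {
        throw new IllegalArgumentException("no preference to remove");
      }
    } else if (userPrefs.containsKey(itemID)) {
      throw new IllegalArgumentException("preference exists; this is an update");
    }
    // userID gives direct access to the peer preferences needed for the deltas.
    for (Map.Entry<Long, Float> peer : userPrefs.entrySet()) {
      adjust(itemID, pref, peer.getKey(), peer.getValue(), removal ? -1 : 1);
    }
    if (!removal) {
      userPrefs.put(itemID, pref);
    }
  }

  /** Adds (sign = 1) or removes (sign = -1) one user's diff for the pair (a, b). */
  private void adjust(long a, float pa, long b, float pb, int sign) {
    if (a > b) { // normalise so that each pair is stored once
      adjust(b, pb, a, pa, sign);
      return;
    }
    double[] stat = diffs.computeIfAbsent(a + ":" + b, k -> new double[2]);
    stat[0] += sign * (double) (pb - pa);
    stat[1] += sign;
  }

  /** Average of (Pb - Pa) over users who rated both items, or 0 if none. */
  double averageDiff(long a, long b) {
    if (a > b) {
      return -averageDiff(b, a);
    }
    double[] stat = diffs.get(a + ":" + b);
    return stat == null || stat[1] == 0 ? 0.0 : stat[0] / stat[1];
  }

  public static void main(String[] args) {
    InMemoryDiffSketch storage = new InMemoryDiffSketch();
    float[] prefsForItemA = {4.0f, 2.0f, 3.0f}; // users 1, 2, 3 on item 10
    for (int user = 1; user <= 3; user++) {
      storage.changeItemPref(10L, user, prefsForItemA[user - 1], false);
      storage.changeItemPref(20L, user, 5.0f, false); // everyone rates item 20 as 5
    }
    System.out.println("average diff before removal: " + storage.averageDiff(10L, 20L));
    storage.changeItemPref(10L, 1L, 4.0f, true); // user 1 removes the pref on item 10
    System.out.println("average diff after removal:  " + storage.averageDiff(10L, 20L));
  }
}
```

The point of the sketch is only that the removal path needs the user's other preferences to compute each Pb-Pa, which is what passing userID makes possible. A JDBC implementation would presumably do the same lookup in SQL.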

> AbstractJDBCDiffStorage.updateItemPref is updating the AVG incorrectly in 
> most cases
> ------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-576
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-576
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>    Affects Versions: 0.4
>            Reporter: Renaud Bruyeron
>             Fix For: 0.5
>
>
> The JDBC version of the DiffStorage is not using a RunningAverage in the 
> removePreference case, and ends up making incorrect calculations.
> In a scenario where users are setting and removing a lot of preferences, the 
> AVG stored in the diff table quickly diverges from the correct value because 
> of this.
> Right now, the input to updateItemPref comes from SlopeOneRecommender, and in 
> the case of removePreference, it is *the old preference* value, not a delta. 
> However, the code uses it as if it were a delta. Thus the calculation is off 
> by PEER(removedpreference,userid)/count every time a user removes a preference.
> At first glance, the code should compute the old delta instead of the old 
> preference, and use this in updateItemPref.

