[ https://issues.apache.org/jira/browse/MAHOUT-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated MAHOUT-576:
-----------------------------
Due Date: 14/Jan/11
Description:
The JDBC version of DiffStorage does not use a RunningAverage in the
removePreference case, and ends up making incorrect calculations.
In a scenario where users set and remove many preferences, the
AVG stored in the diff table quickly diverges from the correct value because of
this.
Right now, the input to updateItemPref comes from SlopeOneRecommender, and in
the case of removePreference, it is *the old preference* value, not a delta.
However, the code uses it as if it were a delta. Thus the calculation is off by
PEER(removedpreference,userid)/count every time a user removes a preference.
At first glance, the code should compute the old delta instead of passing the
old preference, and use that delta in updateItemPref.
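
To illustrate the difference, here is a minimal sketch of removing one datum
from a running average. It is not the Mahout code: the class
DiffAverageSketch, the method removeDatum and the sample numbers are
hypothetical, and it assumes the stored diff for an item pair (A, B) is
pref(B) - pref(A). The sign convention in the real diff table may differ.

```java
// Hypothetical, self-contained sketch; not the AbstractJDBCDiffStorage code.
final class DiffAverageSketch {

    /**
     * Returns the average after removing one datum from a running average
     * that currently covers `count` data points.
     */
    static double removeDatum(double average, int count, double datum) {
        if (count <= 1) {
            return Double.NaN; // nothing left, the average is undefined
        }
        return (average * count - datum) / (count - 1);
    }

    public static void main(String[] args) {
        // Assumed sample data: three users rated both A and B, with
        // diffs pref(B) - pref(A) of 1.0, 2.0 and 3.0, so the average is 2.0.
        double average = 2.0;
        int count = 3;

        // The third user removes the preference for A.
        double oldPrefA = 2.0;
        double prefB = 5.0;
        double oldDelta = prefB - oldPrefA; // 3.0, the datum actually stored

        // Removing the old delta leaves the average of the other two diffs.
        double correct = removeDatum(average, count, oldDelta);

        // Treating the old preference as if it were the delta removes the
        // wrong datum and leaves a skewed average behind.
        double skewed = removeDatum(average, count, oldPrefA);

        System.out.println("remove old delta:      " + correct);
        System.out.println("remove old preference: " + skewed);
    }
}
```

With these assumed numbers, removing the old delta yields 1.5, the average of
the two remaining diffs, while removing the old preference value yields 2.0.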
Assignee: Sean Owen
Yep I'll get on this. You are right about the issue and the shape of the change.
> AbstractJDBCDiffStorage.updateItemPref is updating the AVG incorrectly in
> most cases
> ------------------------------------------------------------------------------------
>
> Key: MAHOUT-576
> URL: https://issues.apache.org/jira/browse/MAHOUT-576
> Project: Mahout
> Issue Type: Bug
> Components: Collaborative Filtering
> Affects Versions: 0.4
> Reporter: Renaud Bruyeron
> Assignee: Sean Owen
> Fix For: 0.5