[ 
https://issues.apache.org/jira/browse/MAHOUT-423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881308#action_12881308
 ] 

Hudson commented on MAHOUT-423:
-------------------------------

Integrated in Mahout-Quality #93 (See 
[http://hudson.zones.apache.org/hudson/job/Mahout-Quality/93/])
    MAHOUT-423


> Optimize getNumUsersWithPreferenceFor(long... itemIDs)
> ------------------------------------------------------
>
>                 Key: MAHOUT-423
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-423
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.3
>            Reporter: Jonathan Young
>            Assignee: Sean Owen
>            Priority: Minor
>             Fix For: 0.4
>
>         Attachments: MAHOUT-423.patch
>
>
> I ran a simple collaborative filtering application using a 
> GenericBooleanPrefDataModel built from (a subset of) the Netflix data, 
> Tanimoto similarity, and the GenericItemBasedRecommender, and then called 
> recommender.mostSimilarItems() many times.
> Profiling indicated that the majority of the time was spent in 
> GenericBooleanPrefDataModel.getNumUsersWithPreferenceFor(long... itemIDs).  
> The version in GenericDataModel is optimized for the cases of one and two 
> itemIDs, but the version in GenericBooleanPrefDataModel always computes the 
> intersection set.
> I can create a patch which optimizes the two cases of itemIDs.length == 1 and 
> itemIDs.length == 2 (similar to the version in GenericDataModel), but perhaps 
> the code should be refactored if these are really the most common cases.
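
For illustration, the one-item and two-item fast paths described above could
look roughly like the sketch below. It is not the attached patch and not
Mahout's code: the class PreferenceCounts, the helper intersectionSize, and
the itemToUsers map layout are hypothetical, and plain java.util collections
stand in for Mahout's own data structures.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; names and data layout are assumptions, not Mahout's API. */
final class PreferenceCounts {

  private PreferenceCounts() {}

  /**
   * Counts the users who have a preference for every one of the given items.
   * itemToUsers (hypothetical layout) maps an item ID to the IDs of users who prefer it.
   */
  static int numUsersWithPreferenceFor(Map<Long, Set<Long>> itemToUsers, long... itemIDs) {
    if (itemIDs == null || itemIDs.length == 0) {
      throw new IllegalArgumentException("itemIDs is null or empty");
    }
    Set<Long> first = itemToUsers.get(itemIDs[0]);
    if (first == null) {
      return 0; // nobody prefers the first item, so the intersection is empty
    }
    if (itemIDs.length == 1) {
      return first.size(); // fast path: no intersection needed
    }
    if (itemIDs.length == 2) {
      Set<Long> second = itemToUsers.get(itemIDs[1]);
      // fast path: count common users without building an intersection set
      return second == null ? 0 : intersectionSize(first, second);
    }
    // general case: materialize the intersection, stopping early once it is empty
    Set<Long> intersection = new HashSet<>(first);
    for (int i = 1; i < itemIDs.length && !intersection.isEmpty(); i++) {
      Set<Long> users = itemToUsers.get(itemIDs[i]);
      if (users == null) {
        return 0;
      }
      intersection.retainAll(users);
    }
    return intersection.size();
  }

  /** Counts common elements by iterating the smaller set and probing the larger one. */
  static int intersectionSize(Set<Long> a, Set<Long> b) {
    Set<Long> smaller = a.size() <= b.size() ? a : b;
    Set<Long> larger = smaller == a ? b : a;
    int count = 0;
    for (Long id : smaller) {
      if (larger.contains(id)) {
        count++;
      }
    }
    return count;
  }
}
```

The two-item path avoids allocating a temporary set on every call, which is
the kind of per-call cost that adds up when the method is invoked repeatedly
from a similarity computation.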

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
