[ https://issues.apache.org/jira/browse/MAHOUT-423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated MAHOUT-423:
-----------------------------

         Assignee: Sean Owen
    Fix Version/s: 0.4
         Priority: Minor  (was: Major)

> Optimize getNumUsersWithPreferenceFor(long... itemIDs)
> ------------------------------------------------------
>
>                 Key: MAHOUT-423
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-423
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.3
>            Reporter: Jonathan Young
>            Assignee: Sean Owen
>            Priority: Minor
>             Fix For: 0.4
>
>         Attachments: MAHOUT-423.patch
>
>
> I ran a simple collaborative filtering application using a
> GenericBooleanPrefDataModel built from a subset of the Netflix data,
> Tanimoto similarity, and the GenericItemBasedRecommender, and then called
> recommender.mostSimilarItems() many times.
> Profiling indicated that the majority of the time was spent in
> GenericBooleanPrefDataModel.getNumUsersWithPreferenceFor(long... itemIDs).
> The version in GenericDataModel is optimized for the cases of one and two
> itemIDs, but the version in GenericBooleanPrefDataModel always computes the
> intersection set.
> I can create a patch that optimizes the two cases itemIDs.length == 1 and
> itemIDs.length == 2 (similar to the version in GenericDataModel), but perhaps
> the code should be refactored if these are really the most common cases.
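
The optimization described above can be illustrated with a minimal,
self-contained sketch. This is not the attached patch and not Mahout's actual
code: the class PreferenceCounts, the method name, and the use of a plain
Map<Long, Set<Long>> (item ID to the IDs of users who prefer it) are
assumptions made for illustration. The idea is to answer the one-item case
from the set size, count the two-item overlap without allocating a new set,
and fall back to a full intersection only for three or more items.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the data model; not a Mahout class.
final class PreferenceCounts {

    private PreferenceCounts() {}

    // usersByItem maps an item ID to the set of user IDs that prefer it (assumed layout).
    static int numUsersWithPreferenceFor(Map<Long, Set<Long>> usersByItem, long... itemIDs) {
        if (itemIDs == null || itemIDs.length == 0) {
            throw new IllegalArgumentException("itemIDs must be non-empty");
        }
        Set<Long> first = usersByItem.get(itemIDs[0]);
        if (first == null) {
            return 0; // unknown item: no user can prefer all of the items
        }
        if (itemIDs.length == 1) {
            // Fast path 1: the answer is simply the size of the set.
            return first.size();
        }
        if (itemIDs.length == 2) {
            // Fast path 2: count the overlap without building an intersection set.
            Set<Long> second = usersByItem.get(itemIDs[1]);
            if (second == null) {
                return 0;
            }
            // Iterate over the smaller set and probe the larger one.
            Set<Long> small = first.size() <= second.size() ? first : second;
            Set<Long> large = (small == first) ? second : first;
            int count = 0;
            for (Long userID : small) {
                if (large.contains(userID)) {
                    count++;
                }
            }
            return count;
        }
        // General case: materialize the intersection, stopping early once it is empty.
        Set<Long> intersection = new HashSet<>(first);
        for (int i = 1; i < itemIDs.length && !intersection.isEmpty(); i++) {
            Set<Long> users = usersByItem.get(itemIDs[i]);
            if (users == null) {
                return 0;
            }
            intersection.retainAll(users);
        }
        return intersection.size();
    }
}
```

The two fast paths avoid copying a set on every call, which should matter when
a recommender calls the method repeatedly for single items and item pairs, as
the profile in the report suggests.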

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
