This may or may not help much. My guess is that the improvement will be very modest.
The most serious problem is going to be recommendations for anybody who has rated one of these excessively popular items. Such an item brings in a huge number of other users, and thus a huge number of items to consider. If you down-sample the ratings of the prolific users and kill the super-common items, I think you will see much more improvement than from simply eliminating the singleton users.

The basic issue is that cooccurrence-based algorithms have run-time that is O(n_max^2), where n_max is the maximum number of items per user.

On Thu, Dec 1, 2011 at 2:35 PM, Daniel Zohar <[email protected]> wrote:

> This is why I'm looking now into improving GenericBooleanPrefDataModel to
> not take into account users which made one interaction under the
> 'preferenceForItems' Map. What do you think about this approach?
>
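
For what it's worth, here is a rough sketch of the kind of pruning I mean, written in Python only to keep it short. It is not Mahout code: the function name prune_interactions and the thresholds max_per_user and max_item_users are made up for illustration, and the right cutoffs will depend on your data.

```python
import random
from collections import Counter, defaultdict


def prune_interactions(pairs, max_per_user=500, max_item_users=10000, seed=0):
    """Prune boolean (user, item) interactions before computing cooccurrence.

    Hypothetical helper, not part of Mahout. The thresholds are assumptions.
    Returns a dict mapping each remaining user to a sorted list of items.
    """
    rng = random.Random(seed)
    pairs = set(pairs)  # boolean preferences: duplicates carry no information

    # Number of distinct users who touched each item.
    item_counts = Counter(item for _, item in pairs)

    # Kill the super-common items; they link nearly every user to every other.
    by_user = defaultdict(list)
    for user, item in pairs:
        if item_counts[item] <= max_item_users:
            by_user[user].append(item)

    # Down-sample prolific users so that n_max, and hence the n_max^2
    # cooccurrence cost, is bounded by max_per_user.
    pruned = {}
    for user, items in by_user.items():
        if len(items) > max_per_user:
            items = rng.sample(sorted(items), max_per_user)
        pruned[user] = sorted(items)
    return pruned
```

With max_per_user=500, no single user can contribute more than about 500^2 item pairs, whatever the size of the largest raw history. Users whose only interactions were with a removed item drop out as a side effect.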
