Great suggestion! Will do.

On Fri, Jan 25, 2013 at 1:10 AM, Sean Owen <[email protected]> wrote:

> Why not test both the original and pruned data set? The low-rating
> data may still help, even when the rating is forgotten.
> I would not base the decision just on whether you can make
> recommendations to N users but on the quality of recommendations overall.
>
> In this particular data set, which is rich and un-noisy, the ratings
> are probably valuable information and I imagine you will do better
> with any approach that doesn't drop them.
>
> On Fri, Jan 25, 2013 at 2:19 AM, Koobas <[email protected]> wrote:
> > They use a boolean recommender on the 10M MovieLens data
> > with negative ratings removed (including only 3 stars or more).
> > I wonder if this is a valid approach, as opposed to not removing
> > anything.
> >
> > I actually went through the exercise of removing negative ratings
> > from the 10M MovieLens set, and made the following observations:
> >
> > - It removes about 17% of all ratings,
> > - 15 users disappear (out of 70,000),
> > - 79 movies disappear (out of 10,000).
> >
> > So, it does not seem to hurt the overall exercise.
> > A reasonably small fraction of ratings is gone.
> > We will not recommend movies to the dozen or so users who did not
> > like anything.
> > We will not be recommending movies which nobody liked.
> >
> > I would definitely appreciate some comments about that approach.
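
For anyone wanting to repeat the pruning exercise quoted above, the counts can be obtained with something like the following. This is only a minimal sketch, not the code used in the thread: it assumes the ratings are already in memory as (user, movie, rating) tuples, and the names `prune_stats`, `PruneStats`, and `min_rating` are hypothetical. The threshold of 3.0 follows the "3 stars or more" rule mentioned in the quoted message.

```python
from typing import Iterable, NamedTuple, Tuple


class PruneStats(NamedTuple):
    fraction_removed: float  # share of ratings dropped by the threshold
    users_lost: int          # users left with no ratings after pruning
    movies_lost: int         # movies left with no ratings after pruning


def prune_stats(ratings: Iterable[Tuple[int, int, float]],
                min_rating: float = 3.0) -> PruneStats:
    """Report what is lost when ratings below min_rating are dropped.

    Assumption: each rating is a (user_id, movie_id, rating) tuple.
    """
    all_ratings = list(ratings)
    kept = [r for r in all_ratings if r[2] >= min_rating]

    users_before = {user for user, _, _ in all_ratings}
    movies_before = {movie for _, movie, _ in all_ratings}
    users_after = {user for user, _, _ in kept}
    movies_after = {movie for _, movie, _ in kept}

    removed = len(all_ratings) - len(kept)
    fraction = removed / len(all_ratings) if all_ratings else 0.0
    return PruneStats(fraction,
                      len(users_before - users_after),
                      len(movies_before - movies_after))


if __name__ == "__main__":
    # Toy data for illustration only, not taken from MovieLens.
    sample = [(1, 10, 4.0), (1, 11, 2.0), (2, 10, 1.5),
              (2, 12, 2.5), (3, 11, 5.0)]
    print(prune_stats(sample))
```

Note that these counts only describe coverage. As suggested in the reply, they say nothing about recommendation quality, so the original and pruned sets would still need to be evaluated side by side.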
