Hi, I applied the changes from MAHOUT-553 (thanks, Sebastian!) to mahout-0.4, and everything makes sense now. I tried it with several similarity measures (SIMILARITY_LOGLIKELIHOOD, SIMILARITY_TANIMOTO_COEFFICIENT, SIMILARITY_UNCENTERED_COSINE) and it works fine, i.e. I get good recommendations with differing scores. But when I tried SIMILARITY_PEARSON_CORRELATION, I got an empty part-00000 file. Is that expected?

On Fri, Nov 26, 2010 at 7:50 PM, Sean Owen <[email protected]> wrote:
> The behavior difference is fairly simple. Instead of a weighted
> average of preferences (which will always equal 1.0), compute some
> other function of those weights -- for example, the average of the
> weights.
>
> See GenericBooleanPrefItemBasedRecommender. It's actually just summing
> the weights. This is nearly the same thing since the number of items
> participating in the average is the same for all estimates. *Nearly*
> the same since some can be NaN.
>
> It's an open question whether there aren't better functions of the
> weights to use, but this is a fine start, IMHO.
>
>
> On Fri, Nov 26, 2010 at 6:45 PM, Sebastian Schelter <[email protected]> wrote:
> > Hi Sean,
> >
> > the prediction computation for boolean data is done in
> > AggregateAndRecommendReducer.reduceBooleanData()
> >
> > It computes *all* possible items to recommend for the current user and
> > writes out only the first n after that, with n being the number
> > specified in the parameter --numRecommendations given to RecommenderJob.
> >
> > Can you point me to the code where the non-distributed code handles the
> > problem of ranking them? We could certainly emulate that behaviour in
> > the distributed code too.
> >
> > --sebastian
> >
> >
> > On 26.11.2010 at 19:35, Sean Owen wrote:
> >> But is it then ranking the recommendations by the estimated pref? If
> >> it's always 1, then the ordering is not meaningful.
> >>
> >> Maybe it is, I just haven't looked at your changes in much detail
> >> since you made them, although it looked broadly correct and proper.
> >>
> >> On Fri, Nov 26, 2010 at 6:33 PM, Sebastian Schelter <[email protected]> wrote:
> >>
> >>> If all ratings have value 1 (because we use boolean data), the result of
> >>> the prediction can also only be 1.
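
To make Sean's point in the quoted thread concrete, here is a minimal sketch in plain Python (not Mahout code). It assumes every preference is 1.0, as with boolean data, and compares the weighted average with a plain sum of the similarity weights. The item names and similarity values are made up for illustration, and the helper functions are hypothetical, not part of any Mahout API.

```python
import math

def weighted_average(prefs, sims):
    """Similarity-weighted average of preferences, skipping NaN similarities."""
    numerator = 0.0
    denominator = 0.0
    for pref, sim in zip(prefs, sims):
        if math.isnan(sim):
            continue  # an undefined similarity contributes nothing
        numerator += pref * sim
        denominator += sim
    return numerator / denominator if denominator != 0.0 else float("nan")

def sum_of_weights(sims):
    """Alternative score for boolean data: the sum of the non-NaN similarities."""
    return sum(sim for sim in sims if not math.isnan(sim))

# Hypothetical similarities between each candidate item and the
# three items the user already has (illustrative values only).
candidates = {
    "item_a": [0.9, 0.8, float("nan")],
    "item_b": [0.2, 0.1, 0.3],
}

# With boolean data every preference is 1.0, so the weighted average
# collapses to 1.0 for every candidate and cannot be used for ranking.
averages = {
    item: weighted_average([1.0] * len(sims), sims)
    for item, sims in candidates.items()
}

# Summing the weights yields distinct scores and a meaningful order.
scores = {item: sum_of_weights(sims) for item, sims in candidates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

print(averages)
print(ranking)
```

In this sketch both candidates get a weighted average of exactly 1.0, while the summed weights put item_a ahead of item_b. Whether a sum, an average, or some other function of the weights ranks best is left open, as Sean says.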
