I would second all of what Pat said. I would add that offline evaluation of recommenders is pretty tricky because, in practice, recommenders generate their own training data. This means that offline evaluations, or even performance on the first day, do not tell the entire story.
On Sun, May 4, 2014 at 6:16 PM, Pat Ferrel <[email protected]> wrote:

> First, are you doing an offline precision test? With a training set and a probe or test set?
>
> You can remove some data from the dataset, i.e. withhold certain preferences, then train and obtain recommendations for the users who had data withheld. The test data has not been used to train or to get recs, so you then compare what users actually preferred to the predictions made by the recommender. If all of them match, you have 100% precision. Note that you are comparing recommendations to actual, but held-out, preferences.
>
> If you are using some special tools you may be doing this to compare algorithms, which is not an exact thing at all, no matter what the Netflix prize may have led us to believe. If you are using offline tests to tune a specific recommender, you may have better luck with the results.
>
> In one installation we had real data and split it into training and test sets by date. The older 90% of the data was used to train; the most recent 10% was used to test. This mimics the way data comes in. We compared the recommendations from the training data against the actual preferences in the held-out data and used MAP@some-number-of-recs as the score. This lets you measure ranking, where RMSE does not. The MAP score led us to several useful conclusions about tuning that were data dependent.
>
> http://en.wikipedia.org/wiki/Information_retrieval#Mean_average_precision
>
> On May 4, 2014, at 12:17 AM, Alessandro Suglia <[email protected]> wrote:
>
> Unfortunately that is not what I need, because I'm using a supplementary tool to compute the metrics, so I simply need to produce a list of recommendations according to an estimated preference that I have to compute for a specific user and for specific items (the items in the test set).
>
> How is it possible that Mahout doesn't offer this? Am I doing something wrong?
>
> On 05/04/14 01:20, Pat Ferrel wrote:
>
> > Are you doing this as an offline performance test? There is a test framework for the in-memory (non-Hadoop) recommenders that will hold out random preferences and then use the held-out ones to compute various quality metrics. Is this what you need?
> >
> > See this wiki page under Evaluation:
> > https://mahout.apache.org/users/recommender/userbased-5-minutes.html
> >
> > On May 3, 2014, at 3:46 PM, Alessandro Suglia <[email protected]> wrote:
> >
> > This is the procedure that I adopted at first (incorrectly). But what I need to do is estimate the preference for items that aren't in the training set. In particular, I'm working with the MovieLens 10k dataset, so for each split I should train my recommender on the training set and test it (using some classification metrics) on the test set. I'm not using the default Mahout evaluator, so I need to predict the preferences and then write all the results to a specific file. Can you give an example of how to do this properly?
> >
> > Thank you in advance.
> > Alessandro Suglia
> >
> > On 3 May 2014 at 23:06, Pat Ferrel <[email protected]> wrote:
> >> Actually the regular cooccurrence recommender should work too. Your example on Stack Overflow is calling the wrong method to get recs; call .recommend(userId) to get an ordered list of item ids with strengths.
> >>
> >> It looks to me like you are getting preference data from the user, which in this case is 1 or 0, not recommendations.
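For anyone who wants to reproduce the date-split, MAP@k test Pat describes above, here is a minimal sketch against the in-memory (Taste) API of Mahout 0.9. The file names train.csv / test.csv, the log-likelihood similarity, the neighborhood size, and k are all placeholder assumptions, not a prescribed setup:

import java.io.File;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.Set;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class MapAtKEvaluation {

  public static void main(String[] args) throws Exception {
    int k = 10; // number of recs to score, i.e. MAP@10

    // Train only on the older 90% of the data; train.csv holds "userID,itemID" lines.
    DataModel trainModel = new FileDataModel(new File("train.csv"));
    UserSimilarity similarity = new LogLikelihoodSimilarity(trainModel);
    UserNeighborhood neighborhood = new NearestNUserNeighborhood(25, similarity, trainModel);
    Recommender recommender =
        new GenericBooleanPrefUserBasedRecommender(trainModel, neighborhood, similarity);

    // Most recent 10%, held out of training: userID -> items the user actually preferred later.
    Map<Long, Set<Long>> heldOut = readHeldOut(new File("test.csv"));

    double sumAveragePrecision = 0.0;
    int usersScored = 0;

    for (Map.Entry<Long, Set<Long>> entry : heldOut.entrySet()) {
      List<RecommendedItem> recs;
      try {
        recs = recommender.recommend(entry.getKey(), k);
      } catch (Exception e) {
        continue; // user not present in the training data
      }
      Set<Long> relevant = entry.getValue();
      double hits = 0.0;
      double precisionSum = 0.0;
      for (int rank = 0; rank < recs.size(); rank++) {
        if (relevant.contains(recs.get(rank).getItemID())) {
          hits++;
          precisionSum += hits / (rank + 1); // precision at this cut-off
        }
      }
      if (!recs.isEmpty()) {
        // one common normalization: divide by min(k, number of held-out items)
        sumAveragePrecision += precisionSum / Math.min(k, relevant.size());
        usersScored++;
      }
    }

    System.out.println("MAP@" + k + " = " + sumAveragePrecision / usersScored);
  }

  // Reads "userID,itemID[,pref]" lines into a per-user set of held-out item ids.
  private static Map<Long, Set<Long>> readHeldOut(File file) throws Exception {
    Map<Long, Set<Long>> result = new HashMap<Long, Set<Long>>();
    Scanner in = new Scanner(file);
    while (in.hasNextLine()) {
      String[] parts = in.nextLine().split(",");
      Set<Long> items = result.get(Long.parseLong(parts[0]));
      if (items == null) {
        items = new HashSet<Long>();
        result.put(Long.parseLong(parts[0]), items);
      }
      items.add(Long.parseLong(parts[1]));
    }
    in.close();
    return result;
  }
}

MAP definitions differ slightly in the normalizing denominator; any consistent choice is fine for comparing tuning runs on the same data, which is how Pat uses the score above.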
> >> On May 3, 2014, at 7:42 AM, Sebastian Schelter <[email protected]> wrote:
> >>
> >> You should try the
> >>
> >> org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender
> >>
> >> which has been built to handle such data.
> >>
> >> Best,
> >> Sebastian
> >>
> >> On 05/03/2014 04:34 PM, Alessandro Suglia wrote:
> >>> I have described it in the SO post:
> >>> "When I execute this code, the result is a list of 0.0 or 1.0 values, which is not useful for top-n recommendation in an implicit feedback context, simply because I have to obtain, for each item, an estimated score in the range [0, 1] in order to rank the list in decreasing order and construct the top-n recommendation appropriately."
> >>>
> >>> On 05/03/14 16:25, Sebastian Schelter wrote:
> >>>> Hi Alessandro,
> >>>>
> >>>> what result do you expect and what do you get? Can you give a concrete example?
> >>>>
> >>>> --sebastian
> >>>>
> >>>> On 05/03/2014 12:11 PM, Alessandro Suglia wrote:
> >>>>> Good morning,
> >>>>> I've tried to create a recommender system using Mahout in an implicit feedback situation. What I'm trying to do is explained exactly in this post on Stack Overflow:
> >>>>> http://stackoverflow.com/questions/23077735/mahout-recommendation-in-implicit-feedback-situation
> >>>>>
> >>>>> As you can see, I'm having some problems with it, simply because I cannot get the result that I expect (a value between 0 and 1) when I try to predict a score for a specific item.
> >>>>>
> >>>>> Can someone here help me, please?
> >>>>>
> >>>>> Thank you in advance.
> >>>>>
> >>>>> Alessandro Suglia
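To make Sebastian's suggestion concrete, and following Pat's advice to call .recommend(userId) rather than estimating per-item preferences, here is a minimal sketch assuming a boolean "userID,itemID" input file (ratings.csv is a made-up name) and an arbitrary neighborhood size. The values it prints are relative strengths meant only for ordering the top-n list, not probabilities in [0, 1]:

import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericBooleanPrefUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class BooleanTopN {

  public static void main(String[] args) throws Exception {
    // ratings.csv: "userID,itemID" lines, one per implicit interaction (assumed layout).
    DataModel model = new FileDataModel(new File("ratings.csv"));

    // Log-likelihood similarity ignores preference values, so it suits boolean data.
    UserSimilarity similarity = new LogLikelihoodSimilarity(model);
    UserNeighborhood neighborhood = new NearestNUserNeighborhood(50, similarity, model);

    GenericBooleanPrefUserBasedRecommender recommender =
        new GenericBooleanPrefUserBasedRecommender(model, neighborhood, similarity);

    // Ranked top-10 for user 42: each RecommendedItem carries an item id and a
    // strength used for ordering, not a 0/1 preference or a probability.
    List<RecommendedItem> topN = recommender.recommend(42L, 10);
    for (RecommendedItem item : topN) {
      System.out.println(item.getItemID() + "\t" + item.getValue());
    }
  }
}

The item id / strength pairs can then be written out, one line per recommended item, in whatever format the external evaluation tool expects.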
