Setting it to the maximum number of preferences per user should be enough. It would be great if you could share your dataset and tests.
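For reference, the largest per-user preference count can be computed directly from the preference file, and the job output can be checked for items a user already has a preference for. Below is a minimal sketch in Python. It assumes the preference file holds one user,item or user,item,value entry per line, and that the recommendations are written one user per line as user<TAB>[item:score,...]. The function names are made up for illustration and are not part of Mahout.

```python
import re
from collections import defaultdict


def load_prefs(lines):
    """Map each user ID to the set of item IDs they have a preference for.

    Assumes one preference per line as user,item or user,item,value.
    """
    prefs = defaultdict(set)
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank or malformed lines
        prefs[fields[0]].add(fields[1])
    return prefs


def max_prefs_per_user(prefs):
    """Largest number of distinct items any single user has a preference for."""
    return max((len(items) for items in prefs.values()), default=0)


def already_preferred(prefs, rec_lines):
    """Yield (user, item) for each recommended item the user already prefers.

    Assumes output lines of the form: user<TAB>[item:score,item:score,...]
    """
    for line in rec_lines:
        user, _, rest = line.strip().partition("\t")
        # Each item ID is the token immediately before a ':' separator.
        for item in re.findall(r"([^\[\],:]+):", rest):
            if item in prefs.get(user, ()):
                yield user, item
```

Passing open file handles for the preference file and the job output as the `lines` and `rec_lines` arguments should work, since both functions only iterate over lines. The count is over distinct items per user, so duplicate lines in the preference file do not inflate it.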

2013/8/1 Rafal Lukawiecki <[email protected]>

> Should I have set that parameter to a value much much larger than the
> maximum number of actually expressed preferences by a user?
>
> I'm working on an anonymised data set. If it works as an error test case,
> I'd be happy to share it for your re-test. I am still hoping it is my
> error, not Mahout's.
>
> Rafal
> --
> Rafal Lukawiecki
> Pardon brevity, mobile device.
>
> On 1 Aug 2013, at 17:19, "Sebastian Schelter" <[email protected]> wrote:
>
> > Ok, please file a bug report detailing what you've tested and what
> > results you got.
> >
> > Just to clarify, setting maxPrefsPerUser to a high number still does not
> > help? That surprises me.
> >
> >
> > 2013/8/1 Rafal Lukawiecki <[email protected]>
> >
> >> Hi Sebastian,
> >>
> >> I've rechecked the results, and, I'm afraid that the issue has not gone
> >> away, contrary to my yesterday's enthusiastic response. Using 0.8 I have
> >> retested with and without --maxPrefsPerUser 9000 parameter (no user has
> >> more than 5000 prefs). I have also supplied the prefs file, without the
> >> preference value, that is as: user,item (one per line) as a --filterFile,
> >> with and without the --maxPrefsPerUser, and I am afraid we are also
> >> seeing recommendations for items the user has expressed a prior
> >> preference for.
> >>
> >> I suppose I need to file a bug report.
> >>
> >> Rafal
> >> --
> >> Rafal Lukawiecki
> >> Pardon my brevity, sent from a telephone.
> >>
> >> On 31 Jul 2013, at 22:35, "Rafal Lukawiecki" <[email protected]>
> >> wrote:
> >>
> >>> Dear Sebastian,
> >>>
> >>> It looks like setting --maxPrefsPerUser 10000 has resolved the issue
> >>> in our case; it seems that the most preferences a user had was just
> >>> about 5000, so I doubled it just-in-case, but when I operationalise
> >>> this model, I will make sure to calculate the actual max number of
> >>> preferences and set the parameter accordingly. I will double-check the
> >>> resultset to make sure the issue is really gone, as I have only
> >>> checked the few cases where we have spotted a recommendation of a
> >>> previously preferred item.
> >>>
> >>> Would you like me to file a bug, and would you like me to test it on
> >>> 0.8 or another version? I am using 0.7.
> >>>
> >>> Thanks for your kind support.
> >>> Rafal
> >>> --
> >>> Rafal Lukawiecki
> >>> Strategic Consultant and Director
> >>> Project Botticelli Ltd
> >>>
> >>> On 31 Jul 2013, at 06:22, Sebastian Schelter <[email protected]>
> >>> wrote:
> >>>
> >>> Hi Rafal,
> >>>
> >>> can you try to set the option --maxPrefsPerUser to the maximum number
> >>> of interactions per user and see if you still get the error?
> >>>
> >>> Best,
> >>> Sebastian
> >>>
> >>> On 30.07.2013 19:29, Rafal Lukawiecki wrote:
> >>>> Thank you Sebastian. The data set is not that large, as we are
> >>>> running tests on a subset. It is about 24k users, 40k items, the
> >>>> preference file has 65k preferences as triples. This was using
> >>>> Similarity Cooccurrence.
> >>>>
> >>>> I can see if I could anonymise the data set to share if that would be
> >>>> helpful.
> >>>>
> >>>> Thanks for your kind help.
> >>>>
> >>>> Rafal
> >>>> --
> >>>> Rafal Lukawiecki
> >>>> Pardon my brevity, sent from a telephone.
> >>>>
> >>>> On 30 Jul 2013, at 18:18, "Sebastian Schelter" <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> Hi Rafal,
> >>>>>
> >>>>> can you issue a ticket for this problem at
> >>>>> https://issues.apache.org/jira/browse/MAHOUT ? We have unit-tests
> >>>>> that check whether this happens and currently they work fine. I can
> >>>>> only imagine that the problem occurs in larger datasets where we
> >>>>> sample the data in some places. Can you describe a scenario/dataset
> >>>>> where this happens?
> >>>>>
> >>>>> Best,
> >>>>> Sebastian
> >>>>>
> >>>>> 2013/7/30 Rafal Lukawiecki <[email protected]>
> >>>>>
> >>>>>> I'm new here, just registered. Many thanks to everyone for working
> >>>>>> on an amazing piece of software, thank you for building Mahout and
> >>>>>> for your support. My apologies if this is not the right place to
> >>>>>> ask the question; I have searched for the issue, and I can see this
> >>>>>> problem has been reported here:
> >>>>>> http://stackoverflow.com/questions/13822455/apache-mahout-distributed-recommender-recommends-already-rated-items
> >>>>>>
> >>>>>> Unfortunately, the trail leads to the newsgroups, and I have not
> >>>>>> found a way, yet, to get an answer from them, without asking you.
> >>>>>>
> >>>>>> Essentially, I am running a Hadoop RecommenderJob from Mahout 0.7,
> >>>>>> and I am finding that it is recommending items that the user has
> >>>>>> already expressed a preference for in their input file. I
> >>>>>> understand that this should not be happening, and I am not sure if
> >>>>>> there is a known fix or if I should be looking for a workaround
> >>>>>> (such as using the entire input as the filterFile).
> >>>>>>
> >>>>>> I will double-check that there is no error on my side, but so far
> >>>>>> it does not seem that way.
> >>>>>>
> >>>>>> Many thanks and my regards from Ireland,
> >>>>>> Rafal Lukawiecki
> >>>>>>
> >>>>>> --
> >>>>>> Rafal Lukawiecki
> >>>>>> Strategic Consultant and Director
> >>>>>> Project Botticelli Ltd
