Ok, please file a bug report detailing what you've tested and what results you got.
Just to clarify, setting maxPrefsPerUser to a high number still does not help? That surprises me.

2013/8/1 Rafal Lukawiecki <[email protected]>

> Hi Sebastian,
>
> I've rechecked the results and, I'm afraid, the issue has not gone away,
> contrary to my enthusiastic response yesterday. Using 0.8, I have retested
> with and without the --maxPrefsPerUser 9000 parameter (no user has more
> than 5000 prefs). I have also supplied the prefs file without the
> preference value, that is, as user,item (one per line), as a --filterFile,
> with and without --maxPrefsPerUser, and I am afraid we are also seeing
> recommendations for items the user has expressed a prior preference for.
>
> I suppose I need to file a bug report.
>
> Rafal
> --
> Rafal Lukawiecki
> Pardon my brevity, sent from a telephone.
>
> On 31 Jul 2013, at 22:35, "Rafal Lukawiecki" <[email protected]> wrote:
>
> > Dear Sebastian,
> >
> > It looks like setting --maxPrefsPerUser 10000 has resolved the issue in
> > our case. It seems that the most preferences a user had was just about
> > 5000, so I doubled it just in case, but when I operationalise this model,
> > I will make sure to calculate the actual max number of preferences and
> > set the parameter accordingly. I will double-check the resultset to make
> > sure the issue is really gone, as I have only checked the few cases
> > where we have spotted a recommendation of a previously preferred item.
> >
> > Would you like me to file a bug, and would you like me to test it on 0.8
> > or another version? I am using 0.7.
> >
> > Thanks for your kind support.
> > Rafal
> > --
> > Rafal Lukawiecki
> > Strategic Consultant and Director
> > Project Botticelli Ltd
> >
> > On 31 Jul 2013, at 06:22, Sebastian Schelter <[email protected]> wrote:
> >
> > Hi Rafal,
> >
> > can you try to set the option --maxPrefsPerUser to the maximum number of
> > interactions per user and see if you still get the error?
> >
> > Best,
> > Sebastian
> >
> > On 30.07.2013 19:29, Rafal Lukawiecki wrote:
> >> Thank you Sebastian. The data set is not that large, as we are running
> >> tests on a subset. It is about 24k users, 40k items, the preference file
> >> has 65k preferences as triples. This was using Similarity Cooccurrence.
> >>
> >> I can see if I could anonymise the data set to share if that would be
> >> helpful.
> >>
> >> Thanks for your kind help.
> >>
> >> Rafal
> >> --
> >> Rafal Lukawiecki
> >> Pardon my brevity, sent from a telephone.
> >>
> >> On 30 Jul 2013, at 18:18, "Sebastian Schelter" <[email protected]> wrote:
> >>
> >>> Hi Rafal,
> >>>
> >>> can you issue a ticket for this problem at
> >>> https://issues.apache.org/jira/browse/MAHOUT ? We have unit-tests that
> >>> check whether this happens and currently they work fine. I can only
> >>> imagine that the problem occurs in larger datasets where we sample the
> >>> data in some places. Can you describe a scenario/dataset where this
> >>> happens?
> >>>
> >>> Best,
> >>> Sebastian
> >>>
> >>> 2013/7/30 Rafal Lukawiecki <[email protected]>
> >>>
> >>>> I'm new here, just registered. Many thanks to everyone for working on
> >>>> an amazing piece of software, thank you for building Mahout and for
> >>>> your support. My apologies if this is not the right place to ask the
> >>>> question. I have searched for the issue, and I can see this problem
> >>>> has been reported here:
> >>>> http://stackoverflow.com/questions/13822455/apache-mahout-distributed-recommender-recommends-already-rated-items
> >>>>
> >>>> Unfortunately, the trail leads to the newsgroups, and I have not found
> >>>> a way, yet, to get an answer from them, without asking you.
> >>>>
> >>>> Essentially, I am running a Hadoop RecommenderJob from Mahout 0.7, and
> >>>> I am finding that it is recommending items that the user has already
> >>>> expressed a preference for in their input file. I understand that this
> >>>> should not be happening, and I am not sure if there is a known fix or
> >>>> if I should be looking for a workaround (such as using the entire
> >>>> input as the filterFile).
> >>>>
> >>>> I will double-check that there is no error on my side, but so far it
> >>>> does not seem that way.
> >>>>
> >>>> Many thanks and my regards from Ireland,
> >>>> Rafal Lukawiecki
> >>>>
> >>>> --
> >>>> Rafal Lukawiecki
> >>>> Strategic Consultant and Director
> >>>> Project Botticelli Ltd
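
For the bug report discussed in this thread, it may help to attach two numbers: the largest number of preferences any single user has (the value one would pass to --maxPrefsPerUser), and the list of recommendations that repeat an item the user already has a preference for. The following is only a rough sketch in Python and is not part of Mahout. All function names are made up for illustration. It assumes the input is user,item[,value] lines as described above, and it assumes the job output consists of lines of the form userID<TAB>[itemID:score,itemID:score,...]; that output layout is an assumption and should be checked against the actual files.

```python
from collections import defaultdict


def load_prefs(lines):
    """Parse 'user,item[,value]' lines into {user: set(items)}.

    The preference value, if present, is ignored.
    """
    prefs = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(",")
        prefs[fields[0]].add(fields[1])
    return prefs


def max_prefs_per_user(prefs):
    """Largest number of distinct items any one user has a preference for."""
    return max((len(items) for items in prefs.values()), default=0)


def parse_recommendations(lines):
    """Parse output lines into {user: [items]}.

    ASSUMPTION: each line looks like 'user<TAB>[item:score,item:score]'.
    """
    recs = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        user, _, payload = line.partition("\t")
        payload = payload.strip().strip("[]")
        recs[user] = [pair.split(":")[0] for pair in payload.split(",") if pair]
    return recs


def already_preferred(prefs, recs):
    """Return {user: [items]} for recommended items the user already has."""
    hits = {}
    for user, items in recs.items():
        seen = prefs.get(user, set())
        overlap = [item for item in items if item in seen]
        if overlap:
            hits[user] = overlap
    return hits
```

Passing the opened preference file to load_prefs and the opened job output to parse_recommendations, then calling already_preferred on the two results, would give a concrete list of offending user/item pairs to include in the ticket. An empty result would suggest the problem is elsewhere, for example in how the output is being read.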
