Hi Rafal, can you try to set the option --maxPrefsPerUser to the maximum number of interactions per user and see if you still get the error?

Best,
Sebastian

On 30.07.2013 19:29, Rafal Lukawiecki wrote:
> Thank you Sebastian. The data set is not that large, as we are running tests
> on a subset. It is about 24k users, 40k items, the preference file has 65k
> preferences as triples. This was using Similarity Cooccurrence.
>
> I can see if I could anonymise the data set to share if that would be helpful.
>
> Thanks for your kind help.
>
> Rafal
> --
> Rafal Lukawiecki
> Pardon my brevity, sent from a telephone.
>
> On 30 Jul 2013, at 18:18, "Sebastian Schelter" <[email protected]> wrote:
>
>> Hi Rafal,
>>
>> can you issue a ticket for this problem at
>> https://issues.apache.org/jira/browse/MAHOUT ? We have unit-tests that
>> check whether this happens and currently they work fine. I can only imagine
>> that the problem occurs in larger datasets where we sample the data in some
>> places. Can you describe a scenario/dataset where this happens?
>>
>> Best,
>> Sebastian
>>
>> 2013/7/30 Rafal Lukawiecki <[email protected]>
>>
>>> I'm new here, just registered. Many thanks to everyone for working on an
>>> amazing piece of software, thank you for building Mahout and for your
>>> support. My apologies if this is not the right place to ask the question. I
>>> have searched for the issue, and I can see this problem has been reported
>>> here:
>>> http://stackoverflow.com/questions/13822455/apache-mahout-distributed-recommender-recommends-already-rated-items
>>>
>>> Unfortunately, the trail leads to the newsgroups, and I have not yet found
>>> a way to get an answer from them without asking you.
>>>
>>> Essentially, I am running a Hadoop RecommenderJob from Mahout 0.7, and I
>>> am finding that it is recommending items that the user has already
>>> expressed a preference for in their input file. I understand that this
>>> should not be happening, and I am not sure if there is a known fix or if I
>>> should be looking for a workaround (such as using the entire input as the
>>> filterFile).
>>>
>>> I will double-check that there is no error on my side, but so far it does
>>> not seem that way.
>>>
>>> Many thanks and my regards from Ireland,
>>> Rafal Lukawiecki
>>>
>>> --
>>> Rafal Lukawiecki
>>> Strategic Consultant and Director
>>> Project Botticelli Ltd
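
For anyone following this thread: the value suggested for --maxPrefsPerUser (the largest number of preferences any single user has) can be counted directly from the preference file, and the same pass can produce the userID,itemID pairs for the filterFile workaround mentioned above. The sketch below is only an illustration, not part of Mahout. It assumes the preference file holds comma-separated userID,itemID,value triples, one per line, and the function names are hypothetical.

```python
from collections import Counter


def max_prefs_per_user(lines):
    """Return the largest number of preferences recorded for any one user.

    Assumes each non-empty line is a comma-separated triple:
    userID,itemID,value
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user_id = line.split(",")[0]
        counts[user_id] += 1
    return max(counts.values(), default=0)


def filter_pairs(lines):
    """Yield 'userID,itemID' pairs, dropping the preference value.

    Assumption: these pairs are what a filter file would list in order to
    exclude items a user has already expressed a preference for.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        user_id, item_id = line.split(",")[:2]
        yield f"{user_id},{item_id}"


# Tiny made-up sample, for illustration only.
sample = ["1,101,5.0", "1,102,3.0", "2,101,2.0"]
print(max_prefs_per_user(sample))
print(list(filter_pairs(sample)))
```

On the real data, the first number is the candidate value for --maxPrefsPerUser, and the pairs could be written out one per line to a file passed as the filterFile.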
