Hi Pat,

Thank you for your response.

My original input row (userID,productId,NoOfVisits) looks like this:

3601420184132,028V003838264000P,1

I transformed the product ids into long values since the original input is a string. The transformed input looks something like this:

3601420184132,23423984,1

Since I need to use long ids, I set the 'usesLongIDs' flag to true while running the 'parallelALS' job. While running the 'recommendfactorized' job, I passed paths for 'userIDIndex' and 'itemIdIndex' and again set the 'usesLongIDs' flag to true. Every product id in the resulting recommendations is '0'.

To check whether using 'long' ids is the problem, I passed the same parameters to the “factorize-movielens-1M” example. That also returns '0' for every product id in the recommendations.

Regards,
Sneha

On Tue, Jul 8, 2014 at 10:35 AM, Pat Ferrel <[email protected]> wrote:

> I replied on stackoverflow.
>
> Did you translate your ids into mahout ids? Mahout ids must be ordinal
> integers for users and items. You will need to translate into mahout ids
> before the data is prepared correctly and translate into your application
> specific ids when reading the output. I updated the page you referenced to
> note this but it’s just a guess.
>
> Can you share a few lines of your input?
>
> On Jul 7, 2014, at 11:29 AM, Sneha Venkatesh <[email protected]> wrote:
>
> Hi,
>
> I am new to mahout and I am building an implicit feedback recommender using
> the parallelALS job given here
> <https://mahout.apache.org/users/recommender/intro-als-hadoop.html>. Each
> row of my dataset consists of user_id, product_id, preference_score (which
> is the number of visits made by the user for the product). The user and
> product ids are of type long. I have a million data points of this kind
> after filtering out single or double visits.
>
> I have basically written a bash script that runs the two jobs “parallelALS”
> and “recommendfactorized” just as shown in the example
> “factorize-movielens-1M”. After running the script, the resulting
> recommendations seem to have a bug. The format of each row of the results
> (as explained in several blog posts) seems to be:
> user_id [product_id:score,…]
>
> However, all the product_ids in every row are 0. I am not sure what is going
> wrong here. Is this a problem with the dataset or a matter of tuning
> parameters (alpha, lambda, etc.) or something else?
>
> Regards,
>
> Sneha
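
For reference, the ID translation Pat describes (application ids to ordinal integers before the job, and back again when reading the output) could be sketched roughly as below. This is only an illustration in plain Python, not Mahout code: the function names are hypothetical, starting the ordinals at 0 is an assumption, and two of the three sample rows are made up.

```python
from typing import Dict, List, Tuple

Row = Tuple[str, str, int]  # (userID, productId, NoOfVisits)


def build_index(raw_ids: List[str]) -> Dict[str, int]:
    """Give each distinct raw id a dense integer, in first-seen order.

    Assumption: ordinals start at 0.
    """
    index: Dict[str, int] = {}
    for raw in raw_ids:
        if raw not in index:
            index[raw] = len(index)
    return index


def encode(rows: List[Row]):
    """Replace raw user and product ids with their ordinal integers."""
    user_index = build_index([user for user, _, _ in rows])
    item_index = build_index([item for _, item, _ in rows])
    encoded = [(user_index[u], item_index[p], n) for u, p, n in rows]
    return encoded, user_index, item_index


def invert(index: Dict[str, int]) -> Dict[int, str]:
    """Reverse lookup, used to map ordinals in the output back to raw ids."""
    return {ordinal: raw for raw, ordinal in index.items()}


rows = [
    ("3601420184132", "028V003838264000P", 1),        # row from the message above
    ("3601420184132", "HYPOTHETICAL-ITEM-B", 4),      # made-up row
    ("HYPOTHETICAL-USER-2", "028V003838264000P", 2),  # made-up row
]

encoded, user_index, item_index = encode(rows)
item_lookup = invert(item_index)

for user, item, visits in encoded:
    print(f"{user},{item},{visits}")
```

The encoded rows would be what gets fed to the job, and `item_lookup` would be kept around to turn the ordinals in the recommendations back into product ids.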
