Hi Pat,

Thank you for your response.

My original input row (userID,productId,NoOfVisits) looks like this:
3601420184132,028V003838264000P,1

I transformed the product ids into long values since the original input is
a string. The transformed input looks something like this:
3601420184132,23423984,1
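
For illustration, here is a minimal Python sketch of one way such a
translation could be done and later reversed. It is not Mahout code and
not necessarily the transformation used above: the helper names
(build_index, translate_rows, invert) and the second and third sample rows
are hypothetical. It assigns each distinct raw id an ordinal integer in
first-seen order and keeps the mapping so output ids can be translated
back.

```python
def build_index(raw_ids):
    """Assign each distinct raw id an ordinal integer, in first-seen order."""
    index = {}
    for raw in raw_ids:
        if raw not in index:
            index[raw] = len(index)
    return index


def translate_rows(rows):
    """Map (user, item, value) rows with arbitrary ids to ordinal-int ids.

    Returns the translated rows plus both id dictionaries, which are
    needed later to map results back to the original ids.
    """
    users = build_index(r[0] for r in rows)
    items = build_index(r[1] for r in rows)
    translated = [(users[u], items[i], v) for u, i, v in rows]
    return translated, users, items


def invert(index):
    """Build the reverse lookup: ordinal integer -> original raw id."""
    return {ordinal: raw for raw, ordinal in index.items()}


if __name__ == "__main__":
    # First row is the sample from this mail; the others are made up.
    rows = [
        ("3601420184132", "028V003838264000P", 1),
        ("3601420184132", "HYPOTHETICAL_ITEM_B", 2),
        ("HYPOTHETICAL_USER_2", "028V003838264000P", 3),
    ]
    translated, users, items = translate_rows(rows)
    for user, item, visits in translated:
        print(f"{user},{item},{visits}")
```

Keeping the two dictionaries around is the important part: without the
inverse mapping, ids in the recommender output cannot be turned back into
the original product ids.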

Since I need to use long ids, I set the 'usesLongIDs' flag to true while
running the 'parallelALS' job. While running the 'recommendfactorized' job,
I passed a path to the 'userIDIndex' and 'itemIdIndex' and set the
'usesLongIDs' flag to true. The resulting recommendations are all product
id '0'.

To validate that using 'long' ids is the problem, I passed the same
parameters as mentioned above to the “factorize-movielens-1M” example. Even
that is returning the value '0' for every product id in the
recommendations.

Regards,
Sneha


On Tue, Jul 8, 2014 at 10:35 AM, Pat Ferrel <[email protected]> wrote:

> I replied on stackoverflow.
>
> Did you translate your ids into mahout ids? Mahout ids must be ordinal
> integers for users and items. You will need to translate into mahout ids
> before the data is prepared correctly and translate into your application
> specific ids when reading the output. I updated the page you referenced to
> note this but it’s just a guess.
>
> Can you share a few lines of your input?
>
> On Jul 7, 2014, at 11:29 AM, Sneha Venkatesh <[email protected]> wrote:
>
> Hi,
>
> I am new to Mahout and I am building an implicit feedback recommender using
> the parallelALS job given here
> <https://mahout.apache.org/users/recommender/intro-als-hadoop.html>. Each
> row of my dataset consists of user_id, product_id, preference_score (which
> is the number of visits made by the user to the product). The user and
> product ids are of type long. I have a million data points of this kind
> after filtering out single or double visits.
>
> I have basically written a bash script that runs the two jobs “parallelALS”
> and “recommendfactorized” just as shown in the example
> “factorize-movielens-1M”. After running the script, the resulting
> recommendations seem to have a bug. The format of each row of the results
> (as explained in several blog posts) seems to be:
> user_id [product_id:score,…]
>
> However, all the product_ids in every row are 0. I am not sure what is going
> wrong here. Is this a problem with the dataset or a matter of tuning
> parameters (alpha, lambda, etc.) or something else?
>
> Regards,
>
> Sneha
>
>
