Found the issue. I had forgotten to include the --usesLongIDs flag in the second job (recommendfactorized). :P
The correct command is:

mahout recommendfactorized \
  --input /mahout/output_data/alsrecommender_longs/userRatings \
  --userFeatures /mahout/output_data/alsrecommender_longs/U/ \
  --itemFeatures /mahout/output_data/alsrecommender_longs/M/ \
  --usesLongIDs true \
  --itemIDIndex /mahout/output_data/alsrecommender_longs/itemIDIndex/ \
  --userIDIndex /mahout/output_data/alsrecommender_longs/userIDIndex/ \
  --numRecommendations 100 \
  --output /mahout/output_data/alsrecommender_longs/user_recommendations_longs \
  --maxRating 1 --numThreads 4 \
  --tempDir /mahout/tmp -Dmapred.reduce.tasks=256

-Nilesh (:& "Tongue tied, speechless")

On Wednesday, September 3, 2014 11:27 AM, Nil Kulkarni <[email protected]> wrote:

Hi folks,

I was trying to get the ALS recommender working for our data. I have user_ids and item_ids that are longs. To handle this, I first set --usesLongIDs true in the parallelALS job:

mahout parallelALS \
  --input /mahout/input_data/user_item_interest/ \
  --output /mahout/output_data/alsrecommender_longs/ \
  --lambda 0.1 --implicitFeedback true --alpha 0.1 \
  --numFeatures 20 --numIterations 10 --numThreadsPerSolver 4 \
  --usesLongIDs true \
  --tempDir /mahout/tmp -Dmapred.reduce.tasks=256

This job created the output folders "/userIDIndex" and "/itemIDIndex" along with the "/U" and the "/M" latent factor folders.

Then I passed the paths of the generated userIDIndex and itemIDIndex to the recommendfactorized job:

mahout recommendfactorized \
  --input /mahout/output_data/alsrecommender_longs/userRatings \
  --userFeatures /mahout/output_data/alsrecommender_longs/U/ \
  --itemFeatures /mahout/output_data/alsrecommender_longs/M/ \
  --itemIDIndex /mahout/output_data/alsrecommender_longs/itemIDIndex/ \
  --userIDIndex /mahout/output_data/alsrecommender_longs/userIDIndex/ \
  --numRecommendations 100 \
  --output /mahout/output_data/alsrecommender_longs/user_recommendations_longs \
  --maxRating 1 --numThreads 4 \
  --tempDir /mahout/tmp -Dmapred.reduce.tasks=256

Both jobs completed successfully.
I had assumed that, since my input user and item IDs were longs, the computed recommendations would carry the original long user_ids and item_ids. However, the output still seems to hold the ints that the first job created internally. Is there another parameter I need to set to get the recommendations back in their original long IDs? It seems absurd that the recommendfactorized job would not have a post-processing step that maps the ints back to the longs.

Thanks,
Nilesh
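
Since the problem in this thread was a flag passed to the first job but not the second, one way to guard against it is to keep the options both jobs share in a single variable. The following is only a sketch under stated assumptions: the variable names, the function names, and the DRY_RUN switch are hypothetical, the paths and option values are copied from the commands in this thread, and it assumes that the order of the named options does not matter to mahout. By default it only prints the two commands.

```shell
#!/bin/sh
# Sketch only: variable names, function names, and the DRY_RUN switch are
# hypothetical. Paths and option values are copied from the commands in
# this thread.

OUT=/mahout/output_data/alsrecommender_longs

# Options that both jobs must receive. --usesLongIDs is the flag that was
# missing from the second job. The values contain no spaces, so plain word
# splitting of $SHARED_OPTS is safe here.
SHARED_OPTS="--usesLongIDs true --tempDir /mahout/tmp -Dmapred.reduce.tasks=256"

# Print the command by default; set DRY_RUN=0 to execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# First job: ALS factorization.
als_job() {
  run mahout parallelALS \
    --input /mahout/input_data/user_item_interest/ \
    --output "$OUT/" \
    --lambda 0.1 --implicitFeedback true --alpha 0.1 \
    --numFeatures 20 --numIterations 10 --numThreadsPerSolver 4 \
    $SHARED_OPTS
}

# Second job: recommendations from the factorization output.
recommend_job() {
  run mahout recommendfactorized \
    --input "$OUT/userRatings" \
    --userFeatures "$OUT/U/" --itemFeatures "$OUT/M/" \
    --itemIDIndex "$OUT/itemIDIndex/" --userIDIndex "$OUT/userIDIndex/" \
    --numRecommendations 100 \
    --output "$OUT/user_recommendations_longs" \
    --maxRating 1 --numThreads 4 \
    $SHARED_OPTS
}

# Run the factorization first, then the recommender, stopping on failure.
als_job && recommend_job
```

Running the script as-is prints both commands so they can be checked for the --usesLongIDs flag before anything is submitted; running it with DRY_RUN=0 executes them.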
