I was finally able to get things running. I had completely overlooked the similarity algorithm parameter, and once I set that, it worked great! It'd be nice if Mahout complained immediately when a required parameter is missing, rather than only at the point where the parameter is first needed, especially in multi-phase MapReduce jobs.
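For anyone who hits the same thing, here is a minimal sketch of the kind of up-front check I mean, done on the client side before the step is submitted. It is only an illustration: the helper name build_step_args and the REQUIRED list are hypothetical, and the --similarityClassname flag with the value SIMILARITY_COOCCURRENCE is my assumption about how the similarity parameter is spelled, so check it against the job's own help output.

```python
# Sketch only. REQUIRED and build_step_args are hypothetical names, and
# "--similarityClassname" is an assumed spelling of the similarity option.
REQUIRED = ("--input", "--output", "--similarityClassname")


def build_step_args(options):
    """Return the flat ['--arg', name, '--arg', value, ...] list that the
    elastic-mapreduce client expects, failing fast when a required option
    is missing or empty."""
    missing = [name for name in REQUIRED if not options.get(name)]
    if missing:
        raise ValueError("missing required option(s): " + ", ".join(missing))

    args = []
    for name, value in options.items():
        args += ["--arg", name, "--arg", value]
    return args


if __name__ == "__main__":
    opts = {
        "--input": "s3n://mm.input-data/data.csv",
        "--output": "s3n://mm.output-data/",
        "--tempDir": "tempDir4",
        # Assumed value; the thread does not say which measure was used.
        "--similarityClassname": "SIMILARITY_COOCCURRENCE",
    }
    print(" ".join(build_step_args(opts)))
```

Leaving out the similarity entry makes the helper raise a ValueError right away, instead of the job dying some minutes into the run.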
- Matt

On Sat, Aug 4, 2012 at 5:11 PM, Matt Mitchell <[email protected]> wrote:
> I'm attempting to run the
> org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob on
> AWS EMR.
>
> I can see that things are working by looking at the task logs.
> However, after it runs for about 10 minutes, it dies. The only log
> file is stdout, and it's empty.
>
> Does this look right -- using the ruby client:
>
> ./elastic-mapreduce -j JOB_ID --jar
> s3n://mm.lib/mahout-core-0.6-job.jar --main-class
> org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob
> --arg --input --arg s3n://mm.input-data/data.csv --arg --output --arg
> s3n://mm.output-data/ --arg --tempDir --arg tempDir4 --access-id
> ACCESS_KEY --private-key PRIVATE_KEY
>
> One question... should the S3 output directory already exist?
>
> - Matt
>
> On Sat, Aug 4, 2012 at 3:18 PM, Matt Mitchell <[email protected]> wrote:
>> Thanks :) Of course, I found this as soon as I posted!
>>
>> https://cwiki.apache.org/MAHOUT/mahout-on-elastic-mapreduce.html
>>
>> - Matt
>>
>> On Sat, Aug 4, 2012 at 2:34 PM, Sebastian Schelter <[email protected]> wrote:
>>> It's pretty simple: upload the Mahout jar and your data to S3 and click
>>> together a custom MapReduce step pointing to the ItemSimilarityJob class.
>>>
>>> On 04.08.2012 at 20:29, "Matt Mitchell" <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm digging around trying to find info on running Mahout on AWS's
>>>> Elastic MapReduce. Anyone know of a step-by-step article/tutorial?
>>>> I'm interested in running "itemsimilarity", "recommenditembased" and
>>>> "recommendfactorized".
>>>>
>>>> Thanks!
>>>>
>>>> - Matt
