Hello,

I'm trying to run the Hadoop-based recommender job (org.apache.mahout.cf.taste.hadoop.item.RecommenderJob) from Mahout 0.8 on EMR. I'm using the "Amazon Distribution" of Hadoop, which is version 1.0.3. Running the job locally with that Hadoop version works fine, and I get the expected output.
On EMR, however, the job fails with the following exception: java.lang.NoSuchMethodError: org.apache.lucene.util.PriorityQueue.<init>(I)V (full stack trace: https://gist.github.com/adamw/6824585).

According to the EMR documentation (http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-ami.html), the AMI contains Lucene 2.9.4, while Mahout uses 4.3.0. And indeed, in Lucene 2.x there's no PriorityQueue(int) constructor, while in Lucene 4.x there is.

Is there a known way to solve this problem and run Mahout on EMR? I thought about using a bootstrap action, but replacing Lucene will probably trigger a long chain of dependencies which would have to be updated as well.

Adam

--
Adam Warski
http://twitter.com/#!/adamwarski
http://www.softwaremill.com
http://www.warski.org
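
P.S. For anyone who wants to confirm the version mismatch on their own cluster, here is a minimal sketch of a probe that prints which jar a class is loaded from and whether it declares a constructor taking a single int. The class name ClasspathProbe and its helper methods are hypothetical, made up for illustration; it uses only standard JDK reflection and assumes it is run with the same classpath as the failing job.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Mahout, Lucene, or EMR.
public class ClasspathProbe {

    // Returns the jar or directory a class was loaded from, if the JVM exposes it.
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return src.getLocation().toString();
    }

    // True if the class declares a constructor with exactly these parameter types
    // (any visibility).
    static boolean hasConstructor(Class<?> cls, Class<?>... paramTypes) {
        try {
            cls.getDeclaredConstructor(paramTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the NoSuchMethodError.
        String name = args.length > 0 ? args[0] : "org.apache.lucene.util.PriorityQueue";
        try {
            Class<?> cls = Class.forName(name);
            System.out.println(name + " loaded from: " + locationOf(cls));
            System.out.println("has (int) constructor: " + hasConstructor(cls, int.class));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

If the reported location is a Lucene 2.9.4 jar and the constructor check prints false, that would be consistent with the NoSuchMethodError above.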
