Ah, nice! That's new -- er, very recently updated. Cool. Thanks.

On Tue, May 18, 2010 at 6:50 PM, Jeff Eastman <[email protected]> wrote:

> I'm running on a Cloudera Ubuntu-based AMI that I subsequently configured
> as in https://cwiki.apache.org/confluence/display/MAHOUT/MahoutEC2
>
> Jeff
>
> On 5/18/10 6:37 PM, Mike Roberts wrote:
>
>> Nuts, and I was just about to finish my
>>
>> *"A Complete Newb’s Guide to (Installing on EC2) and Actually Running Mahout
>> from the Command Line"* wiki post.
>>
>> Now, I'll have to see where I went wrong. Which distro are you running? I
>> started with an Alestic Ubuntu 10.4 AMI (ami-cb97c68e).
>>
>> On Tue, May 18, 2010 at 5:34 PM, Jeff Eastman <[email protected]> wrote:
>>
>>> I also brought up a single instance at
>>> http://ec2-184-73-30-93.compute-1.amazonaws.com:50030/jobtracker.jsp and
>>> that ran fine too. It looks to me like the problem, whatever it is, is in
>>> your AMI or its configuration.
>>>
>>> Jeff
>>>
>>> On 5/18/10 5:15 PM, Jeff Eastman wrote:
>>>
>>>> Well, I just brought up a 2-node cluster at
>>>> http://ec2-174-129-148-227.compute-1.amazonaws.com:50030/jobtracker.jsp
>>>> and it ran fine.
>>>>
>>>> On 5/18/10 4:56 PM, Mike Roberts wrote:
>>>>
>>>>> Single instance. Thx.
>>>>>
>>>>> On Tue, May 18, 2010 at 4:49 PM, Jeff Eastman <[email protected]> wrote:
>>>>>
>>>>>> Hi Mike,
>>>>>>
>>>>>> Shouldn't happen. You running this on a single instance or on a Hadoop
>>>>>> cluster? I will see if I can duplicate.
>>>>>>
>>>>>> Jeff
>>>>>>
>>>>>> On 5/18/10 4:27 PM, Mike Roberts wrote:
>>>>>>
>>>>>>> Hey Guys,
>>>>>>>
>>>>>>> Just trying to get the example mentioned here working:
>>>>>>> https://cwiki.apache.org/MAHOUT/parallelfrequentpatternmining.html.
>>>>>>>
>>>>>>> I downloaded the accidents.dat file and placed it in
>>>>>>> /home/ubuntu/mahout-in/fpm-input.
>>>>>>> I created a directory for the output as
>>>>>>> /home/ubuntu/mahout-in/fpm-out.
>>>>>>> Then, I ran the following command:
>>>>>>> ./bin/mahout fpg --input /home/ubuntu/mahout-in/fpm-input --output /home/ubuntu/mahout-in/fpm-out --method mapreduce
>>>>>>>
>>>>>>> It runs for a bit and after the first step I get the following error:
>>>>>>>
>>>>>>> java.io.IOException: java.lang.ClassNotFoundException: org.apache.mahout.common.Pair
>>>>>>>     at org.apache.hadoop.io.serializer.JavaSerialization$JavaSerializationDeserializer.deserialize(JavaSerialization.java:55)
>>>>>>>     at org.apache.hadoop.io.serializer.JavaSerialization$JavaSerializationDeserializer.deserialize(JavaSerialization.java:36)
>>>>>>>     at org.apache.hadoop.io.DefaultStringifier.fromString(DefaultStringifier.java:75)
>>>>>>>     at org.apache.mahout.fpm.pfpgrowth.PFPGrowth.deserializeList(PFPGrowth.java:84)
>>>>>>>     at org.apache.mahout.fpm.pfpgrowth.TransactionSortingMapper.setup(TransactionSortingMapper.java:77)
>>>>>>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>>>>>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>>>>>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>>>>>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>>>>>>>
>>>>>>> The step that it was running:
>>>>>>> 10/05/18 23:10:18 INFO pfpgrowth.PFPGrowth: No of Features: 30
>>>>>>> 10/05/18 23:10:18 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
>>>>>>> 10/05/18 23:10:18 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>>>>>> 10/05/18 23:10:19 INFO input.FileInputFormat: Total input paths to process : 1
>>>>>>> 10/05/18 23:10:19 INFO mapred.JobClient: Running job: job_local_0002
>>>>>>> 10/05/18 23:10:19 INFO input.FileInputFormat: Total input paths to process : 1
>>>>>>> 10/05/18 23:10:19 INFO mapred.MapTask: io.sort.mb = 100
>>>>>>> 10/05/18 23:10:19 INFO mapred.MapTask: data buffer = 79691776/99614720
>>>>>>> 10/05/18 23:10:19 INFO mapred.MapTask: record buffer = 262144/327680
>>>>>>> 10/05/18 23:10:19 WARN mapred.LocalJobRunner: job_local_0002
>>>>>>>
>>>>>>> Anyone know what's going on here, or have a solution? I verified that
>>>>>>> the source file (Pair.java) exists in
>>>>>>> /trunk/core/src/main/java/org/apache/mahout/common. I did an mvn install
>>>>>>> in core just to be sure. I'm running Hadoop 20.2 on Ubuntu 10.4 on EC2.
>>>>>>> BTW, if it's not obvious, I'm a total Mahout n00b.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Mike
