If you have a chance to build Mahout from source, could you try whether it works with the patch from MAHOUT-1329 applied? Packaging Mahout with mvn and "-DskipTests=true" is pretty fast.
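A concrete build sequence for this suggestion might look like the following. This is a hedged sketch: the checkout directory and the patch filename are assumptions, so substitute the actual attachment name from the MAHOUT-1329 JIRA issue.

```shell
# Hedged sketch: apply the MAHOUT-1329 patch to a Mahout source checkout
# and rebuild, skipping tests for speed. The directory name and patch
# filename are assumptions -- use the actual attachment from the JIRA issue.
cd mahout                          # your Mahout source checkout
patch -p0 < MAHOUT-1329.patch      # or: git apply MAHOUT-1329.patch
mvn package -DskipTests=true       # builds the job jars under */target/
```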
Gokhan

On Thu, Feb 20, 2014 at 11:34 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:

> On Thursday, February 20, 2014 4:26 PM, "Zhang, Pengchu" <pzh...@sandia.gov> wrote:
>
> Thanks, it has been executed successfully. Two more questions related to this:
>
> 1. Does this mean that I have to run Mahout for further analysis in the
> non-MR mode?
>
> No, that's not the case. There have been API changes between Hadoop 2.x and
> Hadoop 1.x, and Mahout is not certified for Hadoop 2.x. Most of Mahout's jobs
> do work on Hadoop 2.x; seqdirectory could be one job that fails due to the
> API incompatibility between Hadoop 1.x and Hadoop 2.x.
>
> 2. It is too bad that Hadoop 2.2 does not support the newer versions of
> Mahout. Are you aware of Hadoop 1.x working with Mahout 0.8 or 0.9 in MR
> mode? I do have a large dataset to be clustered.
>
> As mentioned earlier, Mahout 0.8/0.9 are certified for Hadoop 1.x, so you
> shouldn't be seeing any issues with that.
>
> Could you open a JIRA for this issue so that it's trackable? As we are now
> working towards Mahout 1.0 and Hadoop 2.x compatibility, it's good that you
> have reported this issue. Thanks.
>
> Thanks.
>
> Pengchu
>
> -----Original Message-----
> From: Suneel Marthi [mailto:suneel_mar...@yahoo.com]
> Sent: Thursday, February 20, 2014 1:17 PM
> To: user@mahout.apache.org
> Subject: [EXTERNAL] Re: Mapreduce job failed
>
> ... and the reason this is failing is that 'TaskAttemptContext', which was a
> class in Hadoop 1.x, has become an interface in Hadoop 2.2.
>
> Suggest that you execute this job in non-MR mode with '-xm sequential'.
>
> On Thursday, February 20, 2014 2:26 PM, Suneel Marthi <suneel_mar...@yahoo.com> wrote:
>
> Seems like you are running this on Hadoop 2.2 (officially not supported for
> Mahout 0.8 or 0.9); the workaround is to run this in sequential mode with
> "-xm sequential".
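Spelled out against the command from the original report, the quoted "-xm sequential" workaround would look like this (the input/output paths are the reporter's own, so adjust them to your data):

```shell
# Run seqdirectory in sequential (local, non-MapReduce) mode so the
# Hadoop 1.x/2.x TaskAttemptContext incompatibility is never triggered.
# Paths are taken from the original report; adjust as needed.
mahout seqdirectory \
  --input /shakespeare_text \
  --output /shakespeare-seqdir \
  --charset utf-8 \
  -xm sequential
```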
> On Thursday, February 20, 2014 1:36 PM, "Zhang, Pengchu" <pzh...@sandia.gov> wrote:
>
> Hello, I am trying to run "seqdirectory" with Mahout (0.8 and 0.9) on my
> Linux box with Hadoop (2.2.0), but it keeps failing consistently.
>
> I tested Hadoop with the Hadoop examples pi and wordcount; both worked well.
>
> With a simple text file or a directory of multiple text files, e.g.
> shakespeare_text, I get the same failure message:
>
> $ mahout seqdirectory --input /shakespeare_text --output /shakespeare-seqdir --charset utf-8
> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> Running on hadoop, using /home/pzhang/hadoop-2.2.0/bin/hadoop and HADOOP_CONF_DIR=/home/pzhang/hadoop-2.2.0/etc/hadoop
> MAHOUT-JOB: /home/pzhang/MAHOUT_HOME/mahout-examples-0.8-job.jar
> 14/02/20 11:29:42 INFO common.AbstractJob: Command line arguments: {--charset=[utf-8], --chunkSize=[64], --endPhase=[2147483647], --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter], --input=[/shakespeare_text], --keyPrefix=[], --method=[mapreduce], --output=[/shakespeare-seqdir], --startPhase=[0], --tempDir=[temp]}
> 14/02/20 11:29:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 14/02/20 11:29:42 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
> 14/02/20 11:29:42 INFO Configuration.deprecation: mapred.compress.map.output is deprecated. Instead, use mapreduce.map.output.compress
> 14/02/20 11:29:42 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
> 14/02/20 11:29:43 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
> 14/02/20 11:29:44 INFO input.FileInputFormat: Total input paths to process : 43
> 14/02/20 11:29:44 INFO input.CombineFileInputFormat: DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 5284832
> 14/02/20 11:29:44 INFO mapreduce.JobSubmitter: number of splits:1
> 14/02/20 11:29:44 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
> 14/02/20 11:29:44 INFO Configuration.deprecation: mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
> 14/02/20 11:29:44 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
> 14/02/20 11:29:44 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
> 14/02/20 11:29:44 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
> 14/02/20 11:29:44 INFO Configuration.deprecation: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
> 14/02/20 11:29:44 INFO Configuration.deprecation: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
> 14/02/20 11:29:44 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
> 14/02/20 11:29:44 INFO Configuration.deprecation: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
> 14/02/20 11:29:44 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
> 14/02/20 11:29:44 INFO Configuration.deprecation: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
> 14/02/20 11:29:44 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
> 14/02/20 11:29:44 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
> 14/02/20 11:29:44 INFO Configuration.deprecation: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
> 14/02/20 11:29:44 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
> 14/02/20 11:29:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1392919123773_0004
> 14/02/20 11:29:45 INFO impl.YarnClientImpl: Submitted application application_1392919123773_0004 to ResourceManager at /0.0.0.0:8032
> 14/02/20 11:29:45 INFO mapreduce.Job: The url to track the job: http://savm0072lx.sandia.gov:8088/proxy/application_1392919123773_0004/
> 14/02/20 11:29:45 INFO mapreduce.Job: Running job: job_1392919123773_0004
> 14/02/20 11:29:53 INFO mapreduce.Job: Job job_1392919123773_0004 running in uber mode : false
> 14/02/20 11:29:53 INFO mapreduce.Job: map 0% reduce 0%
> 14/02/20 11:29:58 INFO mapreduce.Job: Task Id : attempt_1392919123773_0004_m_000000_0, Status : FAILED
> Error: java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
>     at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:164)
>     at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.<init>(CombineFileRecordReader.java:126)
>     at org.apache.mahout.text.MultipleTextFileInputFormat.createRecordReader(MultipleTextFileInputFormat.java:43)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:491)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:734)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:416)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:534)
>     at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:155)
>     ... 10 more
> Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
>     at org.apache.mahout.text.WholeFileRecordReader.<init>(WholeFileRecordReader.java:52)
>     ... 15 more
>
> Any suggestion is helpful.
>
> Thanks.
>
> Pengchu
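The root cause in the stack trace (a type that was a class in Hadoop 1.x but is an interface in Hadoop 2.x) can be double-checked against the jars on the cluster's classpath. A hedged sketch, assuming a default Hadoop 2.2.0 install layout (the jar path is an assumption; adjust to wherever your distribution keeps the MapReduce client jars):

```shell
# Prints the declaration of TaskAttemptContext so you can see whether it is
# a class (Hadoop 1.x) or an interface (Hadoop 2.x). The jar location is an
# assumption based on a default Hadoop 2.2.0 layout.
javap -classpath "$HADOOP_HOME"/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar \
  org.apache.hadoop.mapreduce.TaskAttemptContext | head -n 2
```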