Hi Suneel,

Thanks for the information. I converted the .sgm files to text files and ran the seqdirectory job. However, I still get the exception "java.lang.ClassNotFoundException: org.apache.mahout.common.AbstractJob".

hadoop jar /Users/rohitp/Desktop/rohitp/Downloads/mahout-distribution-0.9/mahout-examples-0.9-job.jar org.apache.mahout.text.SequenceFilesFromDirectory -i LDA/reuter_example/reuters-out/ -o LDA/reuter_example/reuters-out_seq/

Thanks,
Rohit

On Mon, Jun 23, 2014 at 5:52 PM, Suneel Marthi <[email protected]> wrote:

> You need to first convert the *.sgm files from the Reuters download to text
> files (this should happen before running seqdirectory).
>
> To convert .sgm to text, run:
> "$MAHOUT org.apache.lucene.benchmark.utils.ExtractReuters ${WORK_DIR}/reuters-sgm ${WORK_DIR}/reuters-out"
>
> Then run seqdirectory on the output of the previous step.
>
>
> On Mon, Jun 23, 2014 at 6:43 PM, Parimi Rohit <[email protected]>
> wrote:
>
> > Hi All,
> >
> > I am trying to run LDA from Mahout, and as a first step I wanted to run the
> > "SequenceFilesFromDirectory" job to convert the text files into sequence
> > files. The following is the command I am using:
> >
> > hadoop jar /Users/rohitp/Desktop/rohitp/Downloads/mahout-distribution-0.9/mahout-examples-0.9-job.jar org.apache.mahout.text.SequenceFilesFromDirectory -i LDA/reuter_example/reuters-sgm/ -o LDA/reuter_example/reuters-sgm_seq/
> >
> > However, I get the following class not found exception. I also tried to use
> > the mahout driver program but got the same exception (mahout seqdirectory
> > -i LDA/reuter_example/reuters-sgm/ -o LDA/reuter_example/reuters-sgm_seq/).
> >
> > Hadoop version: Hadoop 1.2.1
> >
> > Mahout version: 0.9
> >
> > Any help is much appreciated.
> >
> > Rohit
> >
> > java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
> >     at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:164)
> >     at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.<init>(CombineFileRecordReader.java:126)
> >     at org.apache.mahout.text.MultipleTextFileInputFormat.createRecordReader(MultipleTextFileInputFormat.java:43)
> >     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:488)
> >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
> >     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:394)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
> >     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> > Caused by: java.lang.reflect.InvocationTargetException
> >     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> >     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> >     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> >     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> >     at org.apache.hadoop.mapreduce.lib.input.CombineFileRecordReader.initNextRecordReader(CombineFileRecordReader.java:155)
> >     ... 10 more
> > Caused by: java.lang.NoClassDefFoundError: org/apache/mahout/common/AbstractJob
> >     at java.lang.ClassLoader.defineClass1(Native Method)
> >     at java.lang.ClassLoader.defineClassCond(ClassLoader.java:637)
> >     at java.lang.ClassLoader.defineClass(ClassLoader.java:621)
> >     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
> >     at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
> >     at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
> >     at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> >     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
> >     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> >     at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
> >     at org.apache.mahout.text.WholeFileRecordReader.<init>(WholeFileRecordReader.java:61)
> >     ... 15 more
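
The innermost cause in the trace names org/apache/mahout/common/AbstractJob, so one quick check is whether the job jar passed to "hadoop jar" lists that class at all. The following is only a minimal sketch of that check, assuming `unzip` is installed; the helper names `class_to_entry` and `jar_has_class` and the `JOB_JAR` variable are hypothetical and not part of Mahout or Hadoop.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Mahout or Hadoop.

# Map a fully qualified class name to its entry path inside a jar.
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Succeed only if the jar's listing contains the class entry.
# Assumes `unzip` is on the PATH; an unreadable jar counts as "not listed".
jar_has_class() {
  unzip -l "$1" 2>/dev/null | grep -q -F "$(class_to_entry "$2")"
}

# Runs only when JOB_JAR is set, e.g. to the mahout-examples-0.9-job.jar path.
if [ -n "${JOB_JAR:-}" ]; then
  if jar_has_class "$JOB_JAR" org.apache.mahout.common.AbstractJob; then
    echo "AbstractJob is listed in $JOB_JAR"
  else
    echo "AbstractJob is NOT listed in $JOB_JAR"
  fi
fi
```

If the class is listed, the jar itself is probably intact and the problem more likely lies in how the map task's classpath is assembled; this check alone cannot confirm that.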
