I managed to work around the issue with the well-known Hadoop trick: stopping Hadoop, formatting the namenode, and starting it again. I still have no idea what went wrong the first time, but the problem was evidently on the Hadoop side.
$HADOOP_HOME/bin/stop-all.sh
$HADOOP_HOME/bin/hadoop namenode -format
$HADOOP_HOME/bin/start-all.sh

Regards,

On Fri, Jan 31, 2014 at 3:28 PM, Tharindu Rusira <[email protected]> wrote:

> Hi all,
> I'm running Mahout examples from the latest Mahout 0.9 release candidate.
> I got this error while running ./cluster-reuters.sh with option 3 lda
> clustering. As to the error log, this does not seem to be a Mahout issue
> but Hadoop(1.2.1) fails to write to /tmp/mahout-work-tkumara/reuters-lda.
> This is however strange because /tmp/mahout-work-tkumara/ does not have
> a reuters-lda directory and the exception stack trace complains that
> the said directory already exists.
>
> 14/01/31 15:20:39 ERROR security.UserGroupInformation: PriviledgedActionException as:tkumara cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory /tmp/mahout-work-tkumara/reuters-lda already exists
> Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory /tmp/mahout-work-tkumara/reuters-lda already exists
>     at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:137)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:973)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:394)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
>     at org.apache.mahout.clustering.lda.cvb.CVB0Driver.writeTopicModel(CVB0Driver.java:441)
>     at org.apache.mahout.clustering.lda.cvb.CVB0Driver.run(CVB0Driver.java:336)
>     at org.apache.mahout.clustering.lda.cvb.CVB0Driver.run(CVB0Driver.java:198)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.mahout.clustering.lda.cvb.CVB0Driver.main(CVB0Driver.java:534)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
>
> I also checked relevant section in ./cluster-reuters.sh but could not find
> anything there.
>
> elif [ "x$clustertype" == "xlda" ]; then
>   $MAHOUT seq2sparse \
>     -i ${WORK_DIR}/reuters-out-seqdir/ \
>     -o ${WORK_DIR}/reuters-out-seqdir-sparse-lda -ow --maxDFPercent 85 --namedVector \
>   && \
>   $MAHOUT rowid \
>     -i ${WORK_DIR}/reuters-out-seqdir-sparse-lda/tfidf-vectors \
>     -o ${WORK_DIR}/reuters-out-matrix \
>   && \
>   rm -rf ${WORK_DIR}/reuters-lda ${WORK_DIR}/reuters-lda-topics ${WORK_DIR}/reuters-lda-model \
>   && \
>   $MAHOUT cvb \
>     -i ${WORK_DIR}/reuters-out-matrix/matrix \
>     -o ${WORK_DIR}/reuters-lda -k 20 -ow -x 20 \
>     -dict ${WORK_DIR}/reuters-out-seqdir-sparse-lda/dictionary.file-* \
>     -dt ${WORK_DIR}/reuters-lda-topics \
>     -mt ${WORK_DIR}/reuters-lda-model \
>   && \
>   $MAHOUT vectordump \
>     -i ${WORK_DIR}/reuters-lda-topics/part-m-00000 \
>     -o ${WORK_DIR}/reuters-lda/vectordump \
>     -vs 10 -p true \
>     -d ${WORK_DIR}/reuters-out-seqdir-sparse-lda/dictionary.file-* \
>     -dt sequencefile -sort ${WORK_DIR}/reuters-lda-topics/part-m-00000 \
>   && \
>   cat ${WORK_DIR}/reuters-lda/vectordump
>
> So what would possibly be the reason for this exception?
> Thanks,
> --
> M.P. Tharindu Rusira Kumara
>
> Department of Computer Science and Engineering,
> University of Moratuwa,
> Sri Lanka.
> +94757033733
> www.tharindu-rusira.blogspot.com

--
M.P. Tharindu Rusira Kumara

Department of Computer Science and Engineering,
University of Moratuwa,
Sri Lanka.
+94757033733
www.tharindu-rusira.blogspot.com
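
A possible explanation, not confirmed anywhere in this thread: when Mahout runs against a Hadoop cluster, the path in the exception refers to HDFS, whereas the rm -rf step in cluster-reuters.sh only removes directories on the local filesystem. That would account for the directory being absent locally yet reported as existing, and for the namenode format (which erases all HDFS metadata) making the error go away. Under that assumption, a less drastic workaround would be to delete only the stale output on HDFS. The sketch below is illustrative: the function name clean_hdfs_output and the HADOOP variable are hypothetical, and WORK_DIR is assumed to match the value used by the script. The fs -test -e and fs -rmr commands are from the Hadoop 1.x file system shell.

```shell
# Sketch only: assumes the job ran against HDFS and that WORK_DIR matches
# the value used by cluster-reuters.sh. HADOOP is a hypothetical variable
# naming the hadoop launcher.
WORK_DIR="${WORK_DIR:-/tmp/mahout-work-${USER}}"
HADOOP="${HADOOP:-${HADOOP_HOME:+$HADOOP_HOME/bin/}hadoop}"

# Remove each given path from HDFS, but only if it exists there.
clean_hdfs_output() {
  for dir in "$@"; do
    if "$HADOOP" fs -test -e "$dir"; then
      # Hadoop 1.x syntax; later releases use: fs -rm -r
      "$HADOOP" fs -rmr "$dir"
    fi
  done
}
```

It could be called before re-running the example, for instance as clean_hdfs_output "$WORK_DIR/reuters-lda" "$WORK_DIR/reuters-lda-topics" "$WORK_DIR/reuters-lda-model". Unlike formatting the namenode, this leaves the rest of HDFS untouched.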
