Joe, I don't think disk space is the problem, because I had enough free space (not 81GB, but around 40GB). I will check whether the suggestions in the thread you mentioned make any difference. Will keep you posted.
Thank you

On Fri, Sep 17, 2010 at 11:33 PM, Joe Kumar <[email protected]> wrote:
> Gangadhar,
>
> I couldn't find any concrete reason behind this error. Some people have
> reported that it happens only sporadically. As per some suggestions in this
> thread (http://www.mail-archive.com/[email protected]/msg09250.html), I
> have changed the location of the hadoop tmp dir. Also I have cleaned up some
> space on my laptop (now having 81GB of free space) and have started the job
> again. I'm trying to see if freeing up space helps. I'll post any progress.
>
> Has anyone else faced similar issues? I would appreciate feedback / thoughts.
>
> reg
> Joe.
>
>
> On Fri, Sep 17, 2010 at 8:36 PM, Gangadhar Nittala
> <[email protected]> wrote:
>
>> Thank you Joe for the confirmation. I am also checking the code to see
>> what is causing this issue. Maybe others on the list will know what
>> can cause this issue. I am guessing the root cause is not Mahout but
>> something in Hadoop.
>>
>> On Thu, Sep 16, 2010 at 11:34 PM, Joe Kumar <[email protected]> wrote:
>> > Gangadhar,
>> >
>> > After some system issues, I finally ran the TrainClassifier. Almost
>> > 65% into the map job, I got the same error that you mentioned.
>> > INFO mapred.JobClient: Task Id : attempt_201009160819_0002_m_000000_0, Status : FAILED
>> > org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_201009160819_0002/attempt_201009160819_0002_m_000000_0/output/file.out
>> >     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
>> > ...
>> > I haven't yet analyzed the root cause / solution but just wanted to confirm
>> > that I am facing the same issue as you are.
>> > I'll try to search / analyze and post more details.
>> >
>> > reg,
>> > Joe.
>> >
>> > On Wed, Sep 15, 2010 at 10:20 PM, Joe Kumar <[email protected]> wrote:
>> >
>> >> Hi Gangadhar,
>> >>
>> >> Right.
>> >> I did the same to execute the TrainClassifier, but since the
>> >> default datasource is hdfs, we should not be required to provide this
>> >> parameter.
>> >> I haven't finished executing the TrainClassifier yet. I'll do it tonight and
>> >> let you know if I run into trouble.
>> >>
>> >> reg,
>> >> Joe.
>> >>
>> >>
>> >> On Wed, Sep 15, 2010 at 9:41 PM, Gangadhar Nittala <[email protected]> wrote:
>> >>
>> >>> I ran into the issue that Joe mentioned about the command line
>> >>> parameters. I just added the datasource to the command line to execute
>> >>> thus:
>> >>> $HADOOP_HOME/bin/hadoop jar
>> >>> $MAHOUT_HOME/examples/target/mahout-examples-0.4-SNAPSHOT.job
>> >>> org.apache.mahout.classifier.bayes.TrainClassifier --gramSize 3
>> >>> --input wikipediainput10 --output wikipediamodel10 --classifierType
>> >>> bayes --dataSource hdfs
>> >>>
>> >>> On a related note, Joe, were you able to run the TrainClassifier
>> >>> without any errors? When I tried this, the map-reduce job would always
>> >>> abort at 99%. I tried the example that was given in the wiki with
>> >>> both subjects and countries. I even reduced the list of countries in
>> >>> the country.txt assuming that was what was causing the issue. No
>> >>> matter what, the classifier task fails.
>> >>> And the exception in the task log:
>> >>>
>> >>> 2010-09-14 08:25:27,026 INFO org.apache.hadoop.mapred.MapTask: bufstart = 41271492; bufend = 58259002; bufvoid = 99614720
>> >>> 2010-09-14 08:25:27,026 INFO org.apache.hadoop.mapred.MapTask: kvstart = 196379; kvend = 130842; length = 327680
>> >>> 2010-09-14 08:25:48,136 INFO org.apache.hadoop.mapred.MapTask: Finished spill 287
>> >>> 2010-09-14 08:25:48,417 INFO org.apache.hadoop.mapred.MapTask: Starting flush of map output
>> >>> 2010-09-14 08:26:00,386 INFO org.apache.hadoop.mapred.MapTask: Finished spill 288
>> >>> 2010-09-14 08:26:08,765 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
>> >>> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_201009132133_0002/attempt_201009132133_0002_m_000001_3/output/file.out
>> >>>     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
>> >>>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
>> >>>     at org.apache.hadoop.mapred.MapOutputFile.getOutputFileForWrite(MapOutputFile.java:61)
>> >>>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1469)
>> >>>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1154)
>> >>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:359)
>> >>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>> >>>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> >>>
>> >>> I checked the hadoop JIRA and this seems to be fixed already
>> >>> (https://issues.apache.org/jira/browse/HADOOP-4963). I am not sure what
>> >>> I am doing wrong. Any suggestions on what I need to change to get this
>> >>> fixed would be very helpful. I have been struggling with this for a
>> >>> while now.
>> >>>
>> >>> Thank you
>> >>>
>> >>> On Wed, Sep 15, 2010 at 1:16 AM, Joe Kumar <[email protected]> wrote:
>> >>> > Robin,
>> >>> >
>> >>> > Sure. I'll submit a patch.
>> >>> >
>> >>> > The command line flag already has the default behavior specified.
>> >>> >   --classifierType (-type) classifierType    Type of classifier: bayes|cbayes.
>> >>> >                                              Default: bayes
>> >>> >   --dataSource (-source) dataSource          Location of model: hdfs|hbase.
>> >>> >                                              Default Value: hdfs
>> >>> > So there is no change in the flag description.
>> >>> >
>> >>> > reg,
>> >>> > Joe.
>> >>> >
>> >>> >
>> >>> > On Wed, Sep 15, 2010 at 1:10 AM, Robin Anil <[email protected]> wrote:
>> >>> >
>> >>> >> On Wed, Sep 15, 2010 at 10:26 AM, Joe Kumar <[email protected]> wrote:
>> >>> >>
>> >>> >> > Hi all,
>> >>> >> >
>> >>> >> > As I was going through the wikipedia example, I encountered a situation with
>> >>> >> > TrainClassifier wherein some of the options with default values are
>> >>> >> > actually mandatory.
>> >>> >> > The documentation / command line help says that
>> >>> >> >
>> >>> >> > 1. default source (--datasource) is hdfs, but TrainClassifier
>> >>> >> > has withRequired(true) while building the --datasource option. We are
>> >>> >> > checking if the dataSourceType is hbase, else set it to hdfs. So
>> >>> >> > ideally withRequired should be set to false.
>> >>> >> > 2. default --classifierType is bayes, but withRequired is set to true and
>> >>> >> > we have code like
>> >>> >> >
>> >>> >> > if ("bayes".equalsIgnoreCase(classifierType)) {
>> >>> >> >   log.info("Training Bayes Classifier");
>> >>> >> >   trainNaiveBayes(inputPath, outputPath, params);
>> >>> >> >
>> >>> >> > } else if ("cbayes".equalsIgnoreCase(classifierType)) {
>> >>> >> >   log.info("Training Complementary Bayes Classifier");
>> >>> >> >   // setup the HDFS and copy the files there, then run the trainer
>> >>> >> >   trainCNaiveBayes(inputPath, outputPath, params);
>> >>> >> > }
>> >>> >> >
>> >>> >> > which should be changed to
>> >>> >> >
>> >>> >> > *if ("cbayes".equalsIgnoreCase(classifierType)) {*
>> >>> >> >   log.info("Training Complementary Bayes Classifier");
>> >>> >> >   trainCNaiveBayes(inputPath, outputPath, params);
>> >>> >> >
>> >>> >> > } *else {*
>> >>> >> >   log.info("Training Bayes Classifier");
>> >>> >> >   // setup the HDFS and copy the files there, then run the trainer
>> >>> >> >   trainNaiveBayes(inputPath, outputPath, params);
>> >>> >> > }
>> >>> >> >
>> >>> >> > Please let me know if this looks valid and I'll submit a patch for a JIRA
>> >>> >> > issue.
>> >>> >> >
>> >>> >> +1 all valid. Go ahead and fix it, and in the cmdline flags write the
>> >>> >> default behavior in the flag description.
>> >>> >>
>> >>> >>
>> >>> >> > reg
>> >>> >> > Joe.
>> >>> >> >
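
For reference, here is a minimal standalone sketch of the fallback behavior proposed in the quoted message: treat bayes and hdfs as the defaults unless cbayes or hbase is explicitly requested. The class name ClassifierDefaults and both helper methods are hypothetical and are not part of Mahout's TrainClassifier; they only illustrate why the two options would no longer need withRequired(true).

```java
public class ClassifierDefaults {

    // Hypothetical helper: picks "cbayes" only when explicitly requested;
    // a missing (null) or unrecognized value falls back to "bayes".
    static String resolveClassifierType(String classifierType) {
        return "cbayes".equalsIgnoreCase(classifierType) ? "cbayes" : "bayes";
    }

    // Hypothetical helper: picks "hbase" only when explicitly requested;
    // a missing (null) or unrecognized value falls back to "hdfs".
    static String resolveDataSource(String dataSource) {
        return "hbase".equalsIgnoreCase(dataSource) ? "hbase" : "hdfs";
    }

    public static void main(String[] args) {
        // Simulates the flag being omitted on the command line.
        System.out.println(resolveClassifierType(null));
        System.out.println(resolveDataSource(null));
        // Simulates the non-default values being passed explicitly.
        System.out.println(resolveClassifierType("CBayes"));
        System.out.println(resolveDataSource("hbase"));
    }
}
```

Because String.equalsIgnoreCase returns false for a null argument, an omitted flag needs no separate null check in this sketch.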
