Hi Lance,
 
I can't thank you enough.
Lance, thank you very much!
 
enyun
 
> -----Original Message-----
> From: "Lance Norskog" <[email protected]>
> Sent: Sunday, January 1, 2012
> To: [email protected]
> Cc: 
> Subject: Re: how to deal with out-of-memory issue of bayes/testclassifier?
> 
> There are two answers:
> 
> First answer: you are using the "old" Bayes classifier.
> TrainNaiveBayes and TestNaiveBayes are newer and apparently work
> better (I cannot tell you how). TrainNaiveBayes reads the entire model
> in one program at the end of the training pass, so this surprise will
> not happen. The commands are 'mahout trainnb' and 'mahout testnb'.
> 
> Second answer: you are training with too much data.  Try a smaller
> corpus, or use the minSupport and minDf parameters to limit the terms
> you train against.
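
A minimal sketch of the idea behind the second answer: dropping rare terms before training shrinks the feature set the model has to hold in memory. This is not Mahout's implementation. The function name `prune_vocabulary` and the exact meanings given to `min_support` (total occurrences across the corpus) and `min_df` (number of documents containing the term) are assumptions made for illustration.

```python
from collections import Counter


def prune_vocabulary(docs, min_support=2, min_df=1):
    """Return the set of terms that survive frequency-based pruning.

    docs        -- list of token lists, one per document
    min_support -- assumed meaning: minimum total occurrences in the corpus
    min_df      -- assumed meaning: minimum number of documents with the term
    """
    total_counts = Counter()  # occurrences of each term across all documents
    doc_counts = Counter()    # number of documents each term appears in
    for tokens in docs:
        total_counts.update(tokens)
        doc_counts.update(set(tokens))  # count each term once per document
    return {
        term
        for term in total_counts
        if total_counts[term] >= min_support and doc_counts[term] >= min_df
    }


# Hypothetical toy corpus, not data from this thread.
docs = [
    ["spam", "offer", "free", "free"],
    ["meeting", "agenda", "free"],
    ["spam", "offer", "xqzt"],
]
kept = prune_vocabulary(docs, min_support=2, min_df=2)
print(sorted(kept))
```

Terms that occur only once, such as a misspelling or a one-off token, are discarded, so the number of features the classifier stores grows more slowly than the raw vocabulary.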
> 
> 
> 2011/12/30 enyun <[email protected]>:
> > Hi all,
> >
> > I'm using a Mahout Bayes model to classify some new data.
> > After training the model with 'trainclassifier', I found that it causes an
> > out-of-memory error when I run 'testclassifier'.
> > I tried increasing my Java heap size to 4g, but it still did not work.
> > I find it strange that 'trainclassifier' works fine while
> > 'testclassifier' does not.
> > Do you know how to deal with this issue?
> >
> > 'java.lang.OutOfMemoryError: Java heap space
> >    at org.apache.mahout.math.map.OpenObjectIntHashMap.rehash(OpenObjectIntHashMap.java:435)
> >    at org.apache.mahout.math.map.OpenObjectIntHashMap.put(OpenObjectIntHashMap.java:387)
> >    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.getFeatureID(InMemoryBayesDatastore.java:131)
> >    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.setSumFeatureWeight(InMemoryBayesDatastore.java:153)
> >    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadFeatureWeights(SequenceFileModelReader.java:82)
> >    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:46)
> >    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
> >    at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
> >    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:130)
> >    ... 22 more'
> >
> > thanks,
> >
> >
> > mahout testclassifier -m /user/mahoutTest//bayes-model -d /user/enyun/mahoutTest//bayes-test-input -type bayes -ng 1 -source hdfs -method mapreduce
> > MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> > Running on hadoop, using HADOOP_HOME=/home/work/Programs/hadoop/hadoop-0.20.203.0/
> > HADOOP_CONF_DIR=/home/work/Programs/hadoop/hadoop-0.20.203.0//conf
> > MAHOUT-JOB: /home/work/code/hadoop/mahout/mahout-trunk/trunk/examples/target/mahout-examples-0.6-SNAPSHOT-job.jar
> > 11/12/30 16:22:59 WARN driver.MahoutDriver: No testclassifier.props found on classpath, will use command-line arguments only
> > 11/12/30 16:23:00 INFO common.HadoopUtil: Deleting /user/mahoutTest/bayes-test-input-output
> > 11/12/30 16:23:01 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> > 11/12/30 16:23:01 INFO mapred.FileInputFormat: Total input paths to process : 2
> > 11/12/30 16:23:02 INFO mapred.JobClient: Running job: job_201112231028_0058
> > 11/12/30 16:23:03 INFO mapred.JobClient:  map 0% reduce 0%
> > 11/12/30 16:23:28 INFO mapred.JobClient: Task Id : attempt_201112231028_0058_m_000000_0, Status : FAILED
> > java.lang.RuntimeException: Error in configuring object
> >    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> >    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> >    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> >    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:431)
> >    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
> >    at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
> >    at java.security.AccessController.doPrivileged(Native Method)
> >    at javax.security.auth.Subject.doAs(Subject.java:396)
> >    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
> >    at org.apache.hadoop.mapred.Child.main(Child.java:253)
> > Caused by: java.lang.reflect.InvocationTargetException
> >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >    at java.lang.reflect.Method.invoke(Method.java:597)
> >    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> >    ... 9 more
> > Caused by: java.lang.RuntimeException: Error in configuring object
> >    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> >    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> >    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> >    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
> >    ... 14 more
> > Caused by: java.lang.reflect.InvocationTargetException
> >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >    at java.lang.reflect.Method.invoke(Method.java:597)
> >    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> >    ... 17 more
> > Caused by: java.lang.OutOfMemoryError: Java heap space
> >    at org.apache.mahout.math.map.OpenObjectIntHashMap.rehash(OpenObjectIntHashMap.java:435)
> >    at org.apache.mahout.math.map.OpenObjectIntHashMap.put(OpenObjectIntHashMap.java:387)
> >    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.getFeatureID(InMemoryBayesDatastore.java:131)
> >    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.setSumFeatureWeight(InMemoryBayesDatastore.java:153)
> >    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadFeatureWeights(SequenceFileModelReader.java:82)
> >    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:46)
> >    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
> >    at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
> >    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:130)
> >    ... 22 more
> >
> > 11/12/30 16:23:28 INFO mapred.JobClient: Task Id : attempt_201112231028_0058_m_000001_0, Status : FAILED
> > java.lang.RuntimeException: Error in configuring object
> >    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> >    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> >    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> >    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:431)
> >    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
> >    at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
> >    at java.security.AccessController.doPrivileged(Native Method)
> >    at javax.security.auth.Subject.doAs(Subject.java:396)
> >    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
> >
> 
> 
> 
> -- 
> Lance Norskog
> [email protected]
