Huned Lokhandwala created MAHOUT-1129:
-----------------------------------------

             Summary: Mahout Java Heap Out of Memory in TrainNewsGroup class
                 Key: MAHOUT-1129
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1129
             Project: Mahout
          Issue Type: Bug
          Components: Classification, Examples
    Affects Versions: 0.7
         Environment: Linux - RHEL 5.6
            Reporter: Huned Lokhandwala
            Priority: Critical
             Fix For: 0.7



Running Mahout on Linux RHEL 5.6 (single node, no cluster), a Java heap space 
OutOfMemoryError occurs when running the TrainNewsGroups class from the 
mahout-examples-0.7.0.17-job.jar file. I downloaded 20news-bydate.tar.gz 
(from http://people.csail.mit.edu/jrennie/20Newsgroups/20news-bydate.tar.gz ) 
to use as classification data and placed it in the folder shown below. I then 
ran org.apache.mahout.classifier.sgd.TrainNewsGroups on that folder, and the 
run failed with the Java heap space OutOfMemoryError shown below.

Output from run:

> /usr/lib/mahout/bin/mahout org.apache.mahout.classifier.sgd.TrainNewsGroups /artifacts/mahout_sgd_classifier/20news-bydate-train
Running on hadoop, using /usr/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.7.0.17-job.jar
12/12/17 20:35:06 WARN driver.MahoutDriver: No org.apache.mahout.classifier.sgd.TrainNewsGroups.props found on classpath, will use command-line arguments only
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.apache.mahout.math.DenseMatrix.<init>(DenseMatrix.java:50)
        at org.apache.mahout.classifier.sgd.OnlineLogisticRegression.<init>(OnlineLogisticRegression.java:60)
        at org.apache.mahout.classifier.sgd.CrossFoldLearner.<init>(CrossFoldLearner.java:67)
        at org.apache.mahout.classifier.sgd.CrossFoldLearner.copy(CrossFoldLearner.java:217)
        at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression$Wrapper.copy(AdaptiveLogisticRegression.java:399)
        at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression$Wrapper.copy(AdaptiveLogisticRegression.java:386)
        at org.apache.mahout.ep.State.copy(State.java:94)
        at org.apache.mahout.ep.State.mutate(State.java:114)
        at org.apache.mahout.ep.EvolutionaryProcess.initializePopulation(EvolutionaryProcess.java:103)
        at org.apache.mahout.ep.EvolutionaryProcess.<init>(EvolutionaryProcess.java:97)
        at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.setupOptimizer(AdaptiveLogisticRegression.java:279)
        at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.setPoolSize(AdaptiveLogisticRegression.java:265)
        at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.<init>(AdaptiveLogisticRegression.java:121)
        at org.apache.mahout.classifier.sgd.AdaptiveLogisticRegression.<init>(AdaptiveLogisticRegression.java:100)
        at org.apache.mahout.classifier.sgd.TrainNewsGroups.main(TrainNewsGroups.java:100)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
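
The trace ends in the DenseMatrix constructor, reached through copy() calls 
made while AdaptiveLogisticRegression builds its pool of learners. As a rough 
aid for triage, the sketch below estimates how much heap the coefficient 
matrices alone might need and prints it next to the JVM's maximum heap. All 
numbers in it (20 categories, 10,000 features, a pool of 20 learners, 5 folds 
per learner, one (categories - 1) x features matrix per model) are assumptions 
chosen for illustration, not values taken from this report, and the class 
DenseHeapEstimate is hypothetical and not part of Mahout.

```java
public class DenseHeapEstimate {

    // Bytes needed for the double values of a rows x cols dense matrix.
    // Object headers and per-row array overhead are ignored.
    static long denseMatrixBytes(long rows, long cols) {
        return rows * cols * 8L; // a Java double occupies 8 bytes
    }

    // Rough total for a pool of learners, each holding `folds` models,
    // assuming one (categories - 1) x features matrix per model.
    static long poolBytes(int categories, int features, int poolSize, int folds) {
        long perModel = denseMatrixBytes(categories - 1, features);
        return perModel * poolSize * folds;
    }

    public static void main(String[] args) {
        int categories = 20;   // assumed: one per newsgroup
        int features = 10_000; // assumed feature vector size
        int poolSize = 20;     // assumed number of learners in the pool
        int folds = 5;         // assumed cross-validation folds per learner

        long estimate = poolBytes(categories, features, poolSize, folds);
        long maxHeap = Runtime.getRuntime().maxMemory();

        System.out.printf("estimated coefficient storage: %d MB%n",
                estimate / (1024 * 1024));
        System.out.printf("JVM max heap: %d MB%n", maxHeap / (1024 * 1024));
    }
}
```

Comparing the two printed figures gives only a lower bound on what training 
needs, since copies made during initialization and other working data are not 
counted.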


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
