[ https://issues.apache.org/jira/browse/SPARK-6864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491437#comment-14491437 ]
Sean Owen commented on SPARK-6864:
----------------------------------
I believe this is the *driver* process running out of memory. You have massive
executors, but the driver is probably still on 512MB of RAM. Try increasing
that. I think everything else, like your executor size and data size, is then
irrelevant; the executors are orders of magnitude larger than this data set
needs.
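
A rough back-of-the-envelope check of why the driver, rather than the
executors, would be the bottleneck. This is only a sketch under stated
assumptions: that the multinomial model is held as one dense vector of
(numClasses - 1) * numFeatures doubles, and that L-BFGS keeps 10 correction
pairs on the driver. The object name DriverMemoryEstimate and the helper toMiB
are made up for illustration; the feature and class counts are taken from the
code in the report.

```scala
// Hypothetical helper, not part of Spark: estimates driver-side vector sizes.
object DriverMemoryEstimate {
  val numFeatures: Long = 276544L   // from the loadLibSVMFile calls in the report
  val numClasses: Long = 100L       // from setNumClasses(100)
  val bytesPerDouble: Long = 8L

  // Assumption: weights are one dense vector of (K - 1) * n doubles.
  val weightsPerVector: Long = (numClasses - 1) * numFeatures
  val bytesPerVector: Long = weightsPerVector * bytesPerDouble

  // Assumption: 10 L-BFGS correction pairs, two dense vectors per pair.
  val historyPairs: Long = 10L
  val historyBytes: Long = 2 * historyPairs * bytesPerVector

  def toMiB(bytes: Long): Double = bytes.toDouble / (1024 * 1024)

  def main(args: Array[String]): Unit = {
    // Max heap of the JVM running this code; in spark-shell that is the driver.
    val driverHeap = Runtime.getRuntime.maxMemory
    println(f"one dense weight vector: ${toMiB(bytesPerVector)}%.1f MiB")
    println(f"L-BFGS history (assumed): ${toMiB(historyBytes)}%.1f MiB")
    println(f"this JVM's max heap:      ${toMiB(driverHeap)}%.1f MiB")
  }
}
```

Under these assumptions a single weight vector is already about 209 MiB, and
the optimizer history alone would be several GB, so a 512MB driver would not
be enough regardless of executor size. If that is the cause, starting the
shell with something like `spark-shell --driver-memory 8g` (the value is only
an example) should be enough to test the theory.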
> Spark's Multilabel Classifier runs out of memory on small datasets
> ------------------------------------------------------------------
>
> Key: SPARK-6864
> URL: https://issues.apache.org/jira/browse/SPARK-6864
> Project: Spark
> Issue Type: Test
> Components: MLlib
> Affects Versions: 1.2.1
> Environment: EC2 with 8-96 instances up to r3.4xlarge
> The test fails on every configuration
> Reporter: John Canny
> Fix For: 1.2.1
>
>
> When trying to run Spark's MultiLabel classifier
> (LogisticRegressionWithLBFGS) on the RCV1 V2 dataset (about 0.5GB, 100
> labels), the classifier runs out of memory. The number of tasks per executor
> doesn't seem to matter. It happens even with a single task per 120 GB
> executor. The dataset is the concatenation of the test files from the "rcv1v2
> (topics; full sets)" group here:
> http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multilabel.html
> Here's the code:
> import org.apache.spark.SparkContext
> import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
> import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
> import org.apache.spark.mllib.optimization.L1Updater
> import org.apache.spark.mllib.regression.LabeledPoint
> import org.apache.spark.mllib.linalg.Vectors
> import org.apache.spark.mllib.util.MLUtils
> import scala.compat.Platform._
> val nnodes = 8
> val t0 = currentTime
> // Load training data in LIBSVM format.
> val train = MLUtils.loadLibSVMFile(sc, "s3n://bidmach/RCV1train.libsvm",
>   true, 276544, nnodes)
> val test = MLUtils.loadLibSVMFile(sc, "s3n://bidmach/RCV1test.libsvm", true,
>   276544, nnodes)
> val t1 = currentTime
> val lrAlg = new LogisticRegressionWithLBFGS()
> lrAlg.setNumClasses(100).optimizer.
>   setNumIterations(10).
>   setRegParam(1e-10).
>   setUpdater(new L1Updater)
> // Run training algorithm to build the model
> val model = lrAlg.run(train)
> val t2 = currentTime