[
https://issues.apache.org/jira/browse/SPARK-15317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-15317:
------------------------------------
Assignee: Apache Spark
> JobProgressListener takes a huge amount of memory with iterative DataFrame
> program in local, standalone
> -------------------------------------------------------------------------------------------------------
>
> Key: SPARK-15317
> URL: https://issues.apache.org/jira/browse/SPARK-15317
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.0.0
> Environment: Spark 2.0, local mode + standalone mode on MacBook Pro
> OSX 10.9
> Reporter: Joseph K. Bradley
> Assignee: Apache Spark
> Attachments: cc_traces.txt, compare-1.6-10Kpartitions.png,
> compare-2.0-10Kpartitions.png, compare-2.0-16partitions.png,
> dump-standalone-2.0-1of4.png, dump-standalone-2.0-2of4.png,
> dump-standalone-2.0-3of4.png, dump-standalone-2.0-4of4.png
>
>
> h2. TL;DR
> Running a small test locally, I found {{JobProgressListener}} consuming a
> huge amount of memory. Many tasks are being run, but the memory use is
> still surprising. Summary, with details below:
> * Spark app: series of DataFrame joins
> * Issue: GC
> * Heap dumps show {{JobProgressListener}} taking 150-400 MB, depending on
> the Spark mode and version
> h2. Reproducing this issue
> h3. With more complex code
> The code which fails:
> * Here is a branch with the code snippet which fails:
> [https://github.com/jkbradley/spark/tree/18836174ab190d94800cc247f5519f3148822dce]
> ** This is based on Spark commit hash:
> bb1362eb3b36b553dca246b95f59ba7fd8adcc8a
> * Look at {{CC.scala}}, which implements connected components using
> DataFrames:
> [https://github.com/jkbradley/spark/blob/18836174ab190d94800cc247f5519f3148822dce/mllib/src/main/scala/org/apache/spark/ml/CC.scala]
> In the spark shell, run:
> {code}
> import org.apache.spark.ml.CC
> import org.apache.spark.sql.SQLContext
> val sqlContext = SQLContext.getOrCreate(sc)
> CC.runTest(sqlContext)
> {code}
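{{CC.scala}} itself is linked above rather than inlined; as a rough illustration of the algorithm it implements, here is a minimal pure-Scala sketch (my own, not from the branch) of connected components via iterative min-label propagation. The DataFrame version expresses each round as a self-join, which is what produces the long series of jobs the listener has to track:

```scala
// Sketch only: connected components by min-label propagation.
// Each pass, both endpoints of every edge take the smaller of their two
// labels; iterating to a fixed point leaves one label per component.
def connectedComponents(edges: Seq[(Int, Int)]): Map[Int, Int] = {
  val vertices = edges.flatMap { case (a, b) => Seq(a, b) }.distinct
  var label = vertices.map(v => v -> v).toMap
  var changed = true
  while (changed) {
    changed = false
    for ((a, b) <- edges) {
      val m = math.min(label(a), label(b))
      if (label(a) != m || label(b) != m) {
        label = label.updated(a, m).updated(b, m)
        changed = true
      }
    }
  }
  label
}
```

Each `while` iteration here corresponds to one round of DataFrame joins in the real code, so a graph needing many rounds yields many Spark jobs.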
> I have attached a file {{cc_traces.txt}} with the stack traces from running
> {{runTest}}. Note that I sometimes had to run {{runTest}} twice to cause the
> fatal exception. This includes a trace for 1.6, which should run without
> modifications to {{CC.scala}}. These traces are from running in local mode.
> I used {{jmap}} to dump the heap:
> * local mode with 2.0: JobProgressListener took about 397 MB
> * standalone mode with 2.0: JobProgressListener took about 171 MB (See
> attached screenshots from MemoryAnalyzer)
> Both 1.6 and 2.0 exhibit this issue. 2.0 ran faster, and the issue
> (JobProgressListener allocation) seems more severe with 2.0, though it could
> just be that 2.0 makes more progress and runs more jobs.
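One possible mitigation while investigating (this limits the symptom, not the root cause): the UI listener's retention limits can be lowered so that less per-job and per-stage state is kept in memory. The default for both settings is 1000.

```shell
# Lower the number of completed jobs/stages JobProgressListener retains
# for the web UI before old entries are garbage collected.
./bin/spark-shell \
  --conf spark.ui.retainedJobs=100 \
  --conf spark.ui.retainedStages=100
```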
> h3. With simpler code
> I ran this with master (~Spark 2.0):
> {code}
> val data = spark.range(0, 10000, 1, 10000)
> data.cache().count()
> {code}
> The resulting heap dump:
> * 78MB for {{scala.tools.nsc.interpreter.ILoop$ILoopInterpreter}}
> * 58MB for {{org.apache.spark.ui.jobs.JobProgressListener}}
> * 80MB for {{io.netty.buffer.PoolChunk}}
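A back-of-envelope check (assumed numbers, not measured) suggests the 58 MB figure is plausible from task count alone: with 10,000 tasks and on the order of a few KB of retained UI data per completed task, the listener accumulates tens of MB.

```scala
// Rough estimate with an assumed average retained size per task;
// the 5 KB figure is a guess, not a measurement.
val tasks = 10000
val bytesPerTask = 5 * 1024
val totalMB = tasks.toLong * bytesPerTask / (1024 * 1024)
println(s"~$totalMB MB retained")  // ~48 MB under these assumptions
```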
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)