[
https://issues.apache.org/jira/browse/SPARK-13198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15134028#comment-15134028
]
Sean Owen commented on SPARK-13198:
-----------------------------------
I don't see evidence of a problem here yet. Objects stay on the heap until they
are GCed. Are you sure you actually triggered a GC, e.g. with a profiler, and
then measured? You also generally would never stop and start a context within
one application.
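For reference, a minimal sketch of the single-context pattern alluded to above: create one SparkContext, reuse it for every iteration, and stop it once at the end. The master URL and HDFS path are placeholders carried over from the report, not real endpoints. This targets the Spark 1.6 API (`SQLContext`) used in the issue; it requires a live Spark/Mesos cluster to run.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object SingleContextApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("MASTER_URL").setAppName("")
    conf.set("spark.mesos.coarse", "true")
    conf.set("spark.cores.max", "10")

    // Create the context once; a Spark application is expected to use a
    // single SparkContext for its whole lifetime.
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    for (i <- 1 until 100) {
      // Re-read the same data each iteration, as in the original reproducer.
      val events = sqlContext.read.parquet("hdfs://localhost/tmp/something")
      println(s"Iteration ($i), number of events: ${events.count}")
    }

    // Stop exactly once, when the application is done.
    sc.stop()
  }
}
```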
> sc.stop() does not clean up on driver, causes Java heap OOM.
> ------------------------------------------------------------
>
> Key: SPARK-13198
> URL: https://issues.apache.org/jira/browse/SPARK-13198
> Project: Spark
> Issue Type: Bug
> Components: Mesos
> Affects Versions: 1.6.0
> Reporter: Herman Schistad
> Attachments: Screen Shot 2016-02-04 at 16.31.28.png, Screen Shot
> 2016-02-04 at 16.31.40.png, Screen Shot 2016-02-04 at 16.31.51.png
>
>
> When starting and stopping multiple SparkContext's linearly eventually the
> driver stops working with a "io.netty.handler.codec.EncoderException:
> java.lang.OutOfMemoryError: Java heap space" error.
> Reproduce by running the following code, loading ~7MB of parquet data each
> iteration. The driver heap size is left at its default of 1GB:
> {code:java}
> import org.apache.spark.{SparkConf, SparkContext}
> import org.apache.spark.sql.SQLContext
>
> def main(args: Array[String]): Unit = {
>   val conf = new SparkConf().setMaster("MASTER_URL").setAppName("")
>   conf.set("spark.mesos.coarse", "true")
>   conf.set("spark.cores.max", "10")
>   for (i <- 1 until 100) {
>     val sc = new SparkContext(conf)
>     val sqlContext = new SQLContext(sc)
>     val events = sqlContext.read.parquet("hdfs://localhost/tmp/something")
>     println(s"Context ($i), number of events: ${events.count}")
>     sc.stop()
>   }
> }
> {code}
> The heap space fills up within 20 iterations on my cluster. Increasing the
> number of cores to 50 in the above example exhausts the heap after 12
> contexts.
> Dumping the heap reveals many equally sized `CoarseMesosSchedulerBackend`
> objects (see attachments). Digging into the inner objects shows that the
> `executorDataMap` holds 99% of the data in each of them. I believe that is
> beside the point, though, as I'd expect the whole object to be garbage
> collected (or freed) on sc.stop().
> Additionally, I can see in the Spark web UI that each time a new context is
> created, the number appended to the "SQL" tab increments by one (i.e. the last
> iteration shows "SQL99"). After stopping and creating a completely new
> context, I expected this counter to reset (back to "SQL").
> I'm submitting the jar file with `spark-submit` and no special flags. The
> cluster is running Mesos 0.23. I'm running Spark 1.6.0.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]