[ https://issues.apache.org/jira/browse/SPARK-6844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-6844:
-----------------------------------
Assignee: (was: Apache Spark)
> Memory leak occurs when registering a temp table with caching enabled
> ---------------------------------------------------------------------
>
> Key: SPARK-6844
> URL: https://issues.apache.org/jira/browse/SPARK-6844
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.3.0
> Reporter: Jack Hu
> Labels: Memory, SQL
>
> There is a memory leak when registering a temp table with caching enabled.
> Here is a simple program that reproduces the issue:
> {code}
> import org.apache.spark.{SparkConf, SparkContext}
> import org.apache.spark.sql.SQLContext
>
> val sparkConf = new SparkConf().setAppName("LeakTest")
> val sparkContext = new SparkContext(sparkConf)
> val sqlContext = new SQLContext(sparkContext)
> val tableName = "tmp"
> val jsonrdd = sparkContext.textFile("sample.json")
> var loopCount = 1L
> while (true) {
>   // Register and cache the table, query it, then uncache it.
>   // Despite the uncacheTable call, driver memory grows on every iteration.
>   sqlContext.jsonRDD(jsonrdd).registerTempTable(tableName)
>   sqlContext.cacheTable(tableName)
>   println("L: " + loopCount + " R: " +
>     sqlContext.sql("select count(*) from tmp").count())
>   sqlContext.uncacheTable(tableName)
>   loopCount += 1
> }
> {code}
> The cause is that {{InMemoryRelation}} and {{InMemoryColumnarTableScan}}
> use accumulators ({{InMemoryRelation.batchStats}},
> {{InMemoryColumnarTableScan.readPartitions}},
> {{InMemoryColumnarTableScan.readBatches}}) to collect information from
> the partitions, or for tests. Each of these accumulators registers itself
> in a static map, {{Accumulators.originals}}, and is never removed from it,
> so the accumulators can never be garbage collected.
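> A minimal, self-contained sketch of this leak pattern (this is not
> Spark's actual code; {{Registry}}, {{register}} and {{Stats}} are
> hypothetical stand-ins for {{Accumulators.originals}} and the
> accumulators named above):
> {code}
> import scala.collection.mutable
>
> // Stand-in for Accumulators.originals: a static map that keeps a strong
> // reference to every instance ever registered and never drops one.
> object Registry {
>   val originals = mutable.Map[Long, AnyRef]()
>   private var nextId = 0L
>   def register(a: AnyRef): Long = {
>     nextId += 1
>     originals(nextId) = a
>     nextId
>   }
> }
>
> // Stand-in for an accumulator: it registers itself on construction,
> // and nothing ever unregisters it.
> class Stats {
>   val id: Long = Registry.register(this)
> }
>
> object LeakDemo extends App {
>   while (true) {
>     new Stats() // one pinned entry per iteration, like each cached query
>     println(Registry.originals.size) // grows without bound
>   }
> }
> {code}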