yangping wu created SPARK-7353:
----------------------------------
Summary: Driver memory leak?
Key: SPARK-7353
URL: https://issues.apache.org/jira/browse/SPARK-7353
Project: Spark
Issue Type: Bug
Components: Streaming
Affects Versions: 1.3.1
Reporter: yangping wu
Hi all, I am using Spark Streaming to read data from kafka, My spark version is
1.3.1,the code as follow:
{code}
object Test {
def main(args: Array[String]) {
val brokerAddress =
"192.168.246.66:9092,192.168.246.67:9092,192.168.246.68:9092"
val kafkaParams = Map[String, String](
"metadata.broker.list" -> brokerAddress,
"group.id" -> args(0))
val sparkConf = new SparkConf().setAppName("Test")
sparkConf.set("spark.kryo.registrator", "utils.MyKryoSerializer")
val sc = new SparkContext(sparkConf)
val ssc = new StreamingContext(sc, Seconds(2))
val topicsSet = Set("sparktopic")
val messages = KafkaUtils.createDirectStream[String, String, StringDecoder,
StringDecoder](ssc, kafkaParams, topicsSet)
messages.foreachRDD(rdd =>{
if(!rdd.isEmpty()){
rdd.count()
}
})
ssc.start()
ssc.awaitTermination()
ssc.stop()
}
}
{code}
The program already run about 120 hours, below is *jmap -histo:live* result for
the program:
{code}
num #instances #bytes class name
----------------------------------------------
1: 30148 139357920 [B
2: 2102205 67270560 java.util.HashMap$Entry
3: 2143056 51433344 java.lang.Long
4: 520430 26570456 [C
5: 119224 15271104 <methodKlass>
6: 119224 14747984 <constMethodKlass>
7: 3449 13476384 [Ljava.util.HashMap$Entry;
8: 519132 12459168 java.lang.String
9: 9680 10855744 <constantPoolKlass>
10: 9680 9358856 <instanceKlassKlass>
11: 282624 6782976
io.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry
12: 8137 5778112 <constantPoolCacheKlass>
13: 120 3934080 [Lscala.concurrent.forkjoin.ForkJoinTask;
14: 71166 2846640 java.util.TreeMap$Entry
15: 6425 2545712 <methodDataKlass>
16: 10308 1224792 java.lang.Class
17: 640 1140736
[Lio.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry;
18: 22087 1060176 java.util.TreeMap
19: 19337 1014288 [[I
20: 16327 916376 [S
21: 17481 559392
java.util.concurrent.ConcurrentHashMap$HashEntry
22: 2235 548480 [I
23: 22000 528000
javax.management.openmbean.CompositeDataSupport
{code}
([The jmap result
screenshot|https://cloud.githubusercontent.com/assets/5170878/7465993/c9fc5b24-f30d-11e4-9276-ae635f850833.jpg])Note
the *java.util.HashMap$Entry* and *java.lang.Long* object, There are already
using about 120MB! and I found, as time goes by, the *java.util.HashMap$Entry*
and *java.lang.Long* object will occupied more and more memory, and this will
cause OOM on driver side. But I don't know what component cause this problem.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]