Nikita Gorbachevski created SPARK-28709:
-------------------------------------------
Summary: Memory leaks after stopping of StreamingContext
Key: SPARK-28709
URL: https://issues.apache.org/jira/browse/SPARK-28709
Project: Spark
Issue Type: Bug
Components: DStreams
Affects Versions: 2.4.3
Reporter: Nikita Gorbachevski
In my application spark streaming is restarted programmatically by stopping
StreamingContext without stopping of SparkContext and creating/starting a new
one . I use it for automatic detection of Kafka topic/partition changes and
automatic failover in case of non fatal exceptions.
However i notice that after multiple restarts driver fails with OOM. During
investigation of heap dump i figured out that StreamingContext object isn't
cleared by GC after stopping.
There are several places which holds reference to it :
# StreamingTab registers StreamingJobProgressListener which holds reference to
Streaming Context directly to LiveListenerBus shared queue via
ssc.sc.addSparkListener(listener) method invocation. However this listener
isn't unregistered at stop method, moreover the same listener is registered via
ssc.addStreamingListener(listener) one line above so i assume this line could
just be removed.
# json handlers (/streaming/json and /streaming/batch/json) aren't
unregistered in SparkUI, while they hold reference to
StreamingJobProgressListener. Basically the same issue affects all the pages, i
assume that renderJsonHandler should be added to pageToHandlers cache on
attachPage method invocation in order to unregistered it as well on detachPage.
# SparkUi holds reference to StreamingJobProgressListener in the corresponding
local variable which isn't cleared after stopping of StreamingContext.
After i applied these changes via reflection in my app OOM on driver side gone.
I will submit a pull request to fix the mentioned issues.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]