[ https://issues.apache.org/jira/browse/SPARK-11700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15006895#comment-15006895 ]

Kostas papageorgopoulos edited comment on SPARK-11700 at 11/16/15 4:42 PM:
---------------------------------------------------------------------------

One workaround that minimizes the effect is to keep the JavaSparkContext alive for the whole lifetime of the long-running JVM process (never stop it) and to set the following options (defaults shown) to very small values, so that the relevant {code}JobProgressListener{code} maps get trimmed:
{code}
OPTION                              DEFAULT  DESCRIPTION
spark.ui.retainedJobs               1000     How many jobs the Spark UI and status APIs remember before garbage collecting.
spark.ui.retainedStages             1000     How many stages the Spark UI and status APIs remember before garbage collecting.
spark.worker.ui.retainedExecutors   1000     How many finished executors the Spark UI and status APIs remember before garbage collecting.
spark.worker.ui.retainedDrivers     1000     How many finished drivers the Spark UI and status APIs remember before garbage collecting.
spark.sql.ui.retainedExecutions     1000     How many finished executions the Spark UI and status APIs remember before garbage collecting.
spark.streaming.ui.retainedBatches  1000     How many finished batches the Spark UI and status APIs remember before garbage collecting.
{code}
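A minimal sketch of how these options might be set when the long-lived context is created; the holder class, the value 5, and the local master are illustrative only and are not taken from the attached {code}AbstractSparkJobRunner{code}:
{code}
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkContextHolder {

    // One JavaSparkContext for the lifetime of the JVM; never stopped between jobs.
    private static JavaSparkContext javaSparkContext;

    public static synchronized JavaSparkContext getOrCreate() {
        if (javaSparkContext == null) {
            SparkConf conf = new SparkConf()
                    .setAppName("long-running-spark-sql-jobs")
                    .setMaster("local[2]") // illustrative; in a real webapp this comes from configuration
                    // Keep only a handful of finished jobs/stages/executions around
                    // so the JobProgressListener maps stay small.
                    .set("spark.ui.retainedJobs", "5")
                    .set("spark.ui.retainedStages", "5")
                    .set("spark.worker.ui.retainedExecutors", "5")
                    .set("spark.worker.ui.retainedDrivers", "5")
                    .set("spark.sql.ui.retainedExecutions", "5")
                    .set("spark.streaming.ui.retainedBatches", "5");
            javaSparkContext = new JavaSparkContext(conf);
        }
        return javaSparkContext;
    }
}
{code}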



> Memory leak at SparkContext jobProgressListener stageIdToData map
> -----------------------------------------------------------------
>
>                 Key: SPARK-11700
>                 URL: https://issues.apache.org/jira/browse/SPARK-11700
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core, SQL
>    Affects Versions: 1.5.0, 1.5.1, 1.5.2
>         Environment: Ubuntu 14.04 LTS, Oracle JDK 1.8.51 Apache tomcat 
> 8.0.28. Spring 4
>            Reporter: Kostas papageorgopoulos
>            Priority: Minor
>              Labels: leak, memory-leak
>         Attachments: AbstractSparkJobRunner.java, 
> SparkContextPossibleMemoryLeakIDEA_DEBUG.png, SparkHeapSpaceProgress.png, 
> SparkMemoryAfterLotsOfConsecutiveRuns.png, 
> SparkMemoryLeakAfterLotsOfRunsWithinTheSameContext.png
>
>
> It seems that there is a SparkContext jobProgressListener memory leak. Below I describe the steps to reproduce it.
> I have created a Java webapp that runs Spark SQL jobs which read data from HDFS (joining it) and write the result to Elasticsearch using the ES-Hadoop connector. After a lot of consecutive runs I noticed that my heap space filled up and I got an out-of-heap-space error.
> In the attached file {code}AbstractSparkJobRunner{code}, the method {code}public final void run(T jobConfiguration, ExecutionLog executionLog) throws Exception{code} runs each time a Spark SQL job is triggered, so I tried to reuse the same SparkContext across a number of consecutive runs. When certain rules apply, I try to clean up the SparkContext by first calling {code}killSparkAndSqlContext{code}, which eventually runs:
> {code}
> synchronized (sparkContextThreadLock) {
>     if (javaSparkContext != null) {
>         LOGGER.info("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! CLEARING SPARK CONTEXT!!!!!!!!!!!!!!!!!!!!!!!!!!!");
>         javaSparkContext.stop();
>         javaSparkContext = null;
>         sqlContext = null;
>         System.gc();
>     }
>     numberOfRunningJobsForSparkContext.getAndSet(0);
> }
> {code}
> So at some point, when no other Spark SQL job needs to run, I kill the SparkContext (AbstractSparkJobRunner.killSparkAndSqlContext runs) and I would expect it to be garbage collected. However, this is not the case, even though the debugger shows that my JavaSparkContext object is null; see the attached picture {code}SparkContextPossibleMemoryLeakIDEA_DEBUG.png{code}.
> JVisualVM shows the heap usage growing steadily even when the garbage collector is invoked; see the attached picture {code}SparkHeapSpaceProgress.png{code}.
> Memory Analyzer Tool shows that a big part of the retained heap is held by _jobProgressListener; see the attached picture {code}SparkMemoryAfterLotsOfConsecutiveRuns.png{code} and the summary picture {code}SparkMemoryLeakAfterLotsOfRunsWithinTheSameContext.png{code}, even though at the same time the JavaSparkContext in the singleton service is null.


