[ https://issues.apache.org/jira/browse/SPARK-27648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16837078#comment-16837078 ]
tommy duan edited comment on SPARK-27648 at 5/10/19 9:41 AM:
-------------------------------------------------------------

Hi [~gsomogyi] & [~kabhwan],

I have filtered the fields (*timestamp, numRowsTotal, numRowsUpdated, memoryUsedBytes*) out of the progress log; please see the attachment [^houragg_filter.csv]. I then plotted the state memory usage (*memoryUsedBytes*), the number of updated records (*numRowsUpdated*), and the total number of stored records (*numRowsTotal*) against the timestamp. The resulting chart is shown below:

!image-2019-05-10-17-18-25-051.png!

From this analysis we can see that the state memory usage (*memoryUsedBytes*, taken from the progress log) does not fluctuate much (it is basically stable), but on the *Spark UI* -> *Executors* tab, *Storage Memory* does increase over time.

{color:#ff0000}*Please note that:*{color}
{color:#59afe1}1) The logs above ([^houragg(1).out] and [^houragg_filter.csv]) cover 2019-04-23 to 2019-04-29. I actually ran the job from 2019-04-23 to 2019-05-10, but the log after 2019-04-29 was lost.{color}
{color:#59afe1}2) What is certain is that "Storage Memory" on the Spark UI -> Executors tab has kept increasing:{color}

|*TimeStamp*|*Run-time (hours)*|*Storage Memory size*|*Memory growth rate (MB/hour)*|
|2019-04-23|0H|0MB/1.5GB|0|
|2019-04-24|23.5H|41.6MB/1.5GB|1.770212766|
|2019-04-28|108.4H|460.2MB/1.5GB|4.245387454|
|2019-04-29|131.7H|559.1MB/1.5GB|4.245254366|
|2019-04-29|135.4H|575MB/1.5GB|4.246676514|
|2019-04-29|153.6H|641.2MB/1.5GB|4.174479167|
|2019-05-02|219H|888.1MB/1.5GB|4.055251142|
|..|263H|1126.4MB/1.5GB|4.282889734|
|..|309H|1228.8MB/1.5GB|3.976699029|

(The growth rate is the storage memory size divided by the elapsed run time, e.g. 41.6 MB / 23.5 h ≈ 1.77 MB/h.)
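For reference, the three state metrics above come from the streaming query progress objects and can be captured continuously with a StreamingQueryListener. Below is a minimal Scala sketch; the class name and plain println output are illustrative choices made to match the CSV layout of [^houragg_filter.csv]:

{code:java}
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

// Emits one CSV line per stateful operator per micro-batch:
// timestamp,numRowsTotal,numRowsUpdated,memoryUsedBytes
class StateMemoryLogger extends StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit = ()
  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = ()
  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    val p = event.progress
    p.stateOperators.foreach { s =>
      println(s"${p.timestamp},${s.numRowsTotal},${s.numRowsUpdated},${s.memoryUsedBytes}")
    }
  }
}

// Register once before starting the query:
// spark.streams.addListener(new StateMemoryLogger)
{code}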
> In Spark 2.4 Structured Streaming: the executor storage memory keeps increasing over time
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-27648
>                 URL: https://issues.apache.org/jira/browse/SPARK-27648
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.4.0
>            Reporter: tommy duan
>            Priority: Major
>         Attachments: houragg(1).out, houragg_filter.csv, image-2019-05-09-17-51-14-036.png, image-2019-05-10-17-18-25-051.png
>
> *Spark Program Code Business:*
> Read a topic from Kafka, aggregate the stream data, and write the result to another Kafka topic.
>
> *Problem Description:*
> *1) Using Spark Structured Streaming in a CDH environment (Spark 2.2)*, memory overflow problems often occurred (because too many versions of state were kept in memory; this was fixed in Spark 2.4).
> {code:java}
> /spark-submit \
> --conf "spark.yarn.executor.memoryOverhead=4096M" \
> --num-executors 15 \
> --executor-memory 3G \
> --executor-cores 2 \
> --driver-memory 6G
> {code}
> Executor memory exceptions occurred when running with these submit resources under Spark 2.2, and the normal running time did not exceed one day. The workaround was to give the executors much more memory than before; my spark-submit script was as follows:
> {code:java}
> /spark-submit \
> --conf "spark.yarn.executor.memoryOverhead=4096M" \
> --num-executors 15 \
> --executor-memory 46G \
> --executor-cores 3 \
> --driver-memory 6G \
> ...
> {code}
> With this configuration the Spark program ran stably for a long time, and the executor storage memory stayed below 10M (it ran stably for more than 20 days).
>
> *2) From the Spark 2.4 upgrade notes, we can see that the problem of large memory consumption by state storage has been solved in Spark 2.4.*
> So we upgraded to Spark 2.4 under CDH, ran the Spark program again, and found that memory usage was indeed reduced.
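If the Spark 2.4 change referred to here is the cap on how many micro-batch versions of state the default HDFS-backed state store caches in executor memory, that cap is also tunable at submit time. This is an illustrative addition (the issue itself does not name the exact change):

{code:java}
# Spark 2.4+: cap the number of state versions each executor caches in memory
# for the HDFS-backed state store (defaults to 2 in 2.4; before 2.4, up to
# spark.sql.streaming.minBatchesToRetain versions, default 100, were cached).
/spark-submit \
--conf "spark.sql.streaming.maxBatchesToRetainInMemory=2" \
...
{code}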
> But a problem arose: as the running time increases, the storage memory of the executors keeps growing (see Executors -> Storage Memory in the Spark UI, reached from the YARN ResourceManager UI).
> This program has now been running for 14 days (under Spark 2.2, with these submit resources, the normal running time was no more than one day before executor memory exceptions occurred).
> The script submitted by the program under Spark 2.4 is as follows:
> {code:java}
> /spark-submit \
> --conf "spark.yarn.executor.memoryOverhead=4096M" \
> --num-executors 15 \
> --executor-memory 3G \
> --executor-cores 2 \
> --driver-memory 6G
> {code}
> Under Spark 2.4, I tracked the executor storage memory size over time while the Spark program was running:
> |Run-time(hour)|Storage Memory size(MB)|Memory growth rate(MB/hour)|
> |23.5H|41.6MB/1.5GB|1.770212766|
> |108.4H|460.2MB/1.5GB|4.245387454|
> |131.7H|559.1MB/1.5GB|4.245254366|
> |135.4H|575MB/1.5GB|4.246676514|
> |153.6H|641.2MB/1.5GB|4.174479167|
> |219H|888.1MB/1.5GB|4.055251142|
> |263H|1126.4MB/1.5GB|4.282889734|
> |309H|1228.8MB/1.5GB|3.976699029|
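The job code itself is not attached to the issue; for concreteness, a minimal Scala sketch of the shape of pipeline described above (Kafka in, hourly aggregation, Kafka out) follows. The broker address, topic names, grouping key, and window/watermark durations are all assumptions, and the spark-sql-kafka-0-10 package must be on the classpath:

{code:java}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Hypothetical reconstruction of the reported pipeline: read a Kafka topic,
// aggregate hourly, write the result to another Kafka topic.
object HourAgg {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("houragg").getOrCreate()
    import spark.implicits._

    val input = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092") // assumed broker
      .option("subscribe", "input-topic")               // assumed topic
      .load()
      .selectExpr("CAST(value AS STRING) AS value", "timestamp")

    // The watermark bounds how long window state is retained; memoryUsedBytes
    // in the progress log measures exactly this state.
    val agg = input
      .withWatermark("timestamp", "1 hour")
      .groupBy(window($"timestamp", "1 hour"), $"value")
      .count()

    agg
      .selectExpr("CAST(value AS STRING) AS key", "CAST(count AS STRING) AS value")
      .writeStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("topic", "output-topic")                  // assumed topic
      .option("checkpointLocation", "/tmp/houragg-checkpoint")
      .outputMode("update")
      .start()
      .awaitTermination()
  }
}
{code}

With update mode plus a watermark, expired window state should be evicted from the state store; the stable memoryUsedBytes curve in [^houragg_filter.csv] suggests that eviction is working, which is why attention turns to the executor "Storage Memory" metric instead.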