[ 
https://issues.apache.org/jira/browse/SPARK-20237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-20237:
------------------------------------

    Assignee: Apache Spark

> Spark-1.6 current and later versions of memory management issues
> ----------------------------------------------------------------
>
>                 Key: SPARK-20237
>                 URL: https://issues.apache.org/jira/browse/SPARK-20237
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.0, 1.6.1, 1.6.2, 1.6.3, 2.0.0, 2.0.1, 2.0.2, 2.1.0
>         Environment: java 1.7.0  scala-2.10.5   maven-3.3.9    hadoop-2.2.0  
> spark-1.6.2
>            Reporter: zhangwei72
>            Assignee: Apache Spark
>            Priority: Critical
>              Labels: security
>             Fix For: 1.6.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> In Spark 1.6 and later versions, there is a problem in the memory
> management done by UnifiedMemoryManager.
> The spark.memory.storageFraction setting should define the minimum size of
> storage memory.
> When UnifiedMemoryManager calculates how much memory execution can borrow
> from storage, it uses
> val memoryReclaimableFromStorage =
>   math.max(storageMemoryPool.memoryFree,
>     storageMemoryPool.poolSize - storageRegionSize)
> When storageMemoryPool.memoryFree > storageMemoryPool.poolSize -
> storageRegionSize, the first argument is chosen, so the storage pool can
> shrink by as much as storageMemoryPool.memoryFree.
> Because storageMemoryPool.memoryFree > storageMemoryPool.poolSize -
> storageRegionSize, it follows that storageMemoryPool.poolSize -
> storageMemoryPool.memoryFree < storageRegionSize.
> After shrinking, the new storageMemoryPool.poolSize is therefore smaller
> than storageRegionSize. Since storageRegionSize is the minimum storage
> share defined by the framework, this is a problem.
> To solve this problem, I propose changing the definition to
> val memoryReclaimableFromStorage =
>   storageMemoryPool.poolSize - storageRegionSize
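
The difference between the two formulas quoted above can be sketched with a
small stand-alone model. This is not Spark code: StoragePool,
reclaimableCurrent, reclaimableProposed and poolSizeAfter are hypothetical
names, eviction of cached blocks is ignored, and the sketch assumes that
execution takes at most min(needed, reclaimable) from the storage pool.

```scala
object ReclaimSketch {
  // Hypothetical stand-in for the storage pool state; not Spark's own class.
  final case class StoragePool(poolSize: Long, memoryUsed: Long) {
    def memoryFree: Long = poolSize - memoryUsed
  }

  // Formula quoted in the report.
  def reclaimableCurrent(pool: StoragePool, storageRegionSize: Long): Long =
    math.max(pool.memoryFree, pool.poolSize - storageRegionSize)

  // Formula proposed in the report.
  def reclaimableProposed(pool: StoragePool, storageRegionSize: Long): Long =
    pool.poolSize - storageRegionSize

  // Assumption: execution takes min(needed, reclaimable), never a negative amount.
  def poolSizeAfter(pool: StoragePool, reclaimable: Long, needed: Long): Long =
    pool.poolSize - math.max(0L, math.min(needed, reclaimable))

  def main(args: Array[String]): Unit = {
    val region = 1000L                                          // storageRegionSize
    val pool = StoragePool(poolSize = 1000L, memoryUsed = 100L) // memoryFree = 900
    val needed = 5000L                                          // execution demand

    // Current formula: max(900, 0) = 900 is reclaimable, pool shrinks to 100.
    println(poolSizeAfter(pool, reclaimableCurrent(pool, region), needed))
    // Proposed formula: 1000 - 1000 = 0 is reclaimable, pool stays at 1000.
    println(poolSizeAfter(pool, reclaimableProposed(pool, region), needed))
  }
}
```

In this model the current formula lets the pool drop below storageRegionSize
whenever most of the pool is free, which is the situation the report
describes.
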
> Experimental evidence:
> I added the following log statements to UnifiedMemoryManager:
> logInfo("storageMemoryPool.memoryFree %f".format(storageMemoryPool.memoryFree / 1024.0 / 1024.0))
> logInfo("onHeapExecutionMemoryPool.memoryFree %f".format(onHeapExecutionMemoryPool.memoryFree / 1024.0 / 1024.0))
> logInfo("storageMemoryPool.memoryUsed %f".format(storageMemoryPool.memoryUsed / 1024.0 / 1024.0))
> logInfo("onHeapExecutionMemoryPool.memoryUsed %f".format(onHeapExecutionMemoryPool.memoryUsed / 1024.0 / 1024.0))
> logInfo("storageMemoryPool.poolSize %f".format(storageMemoryPool.poolSize / 1024.0 / 1024.0))
> logInfo("onHeapExecutionMemoryPool.poolSize %f".format(onHeapExecutionMemoryPool.poolSize / 1024.0 / 1024.0))
> I then ran the PageRank example. The 676 MB input file was generated with
> BigDataBench (from the Chinese Academy of Sciences), a benchmark suite for
> evaluating big data analysis systems. The job was submitted as follows:
> ./bin/spark-submit --class org.apache.spark.examples.SparkPageRank \
>     --master yarn \
>     --deploy-mode cluster \
>     --num-executors 1 \
>     --driver-memory 4g \
>     --executor-memory 7g \
>     --executor-cores 6 \
>     --queue thequeue \
>     ./examples/target/scala-2.10/spark-examples-1.6.2-hadoop2.2.0.jar \
>      /test/Google_genGraph_23.txt 6
> The configuration is as follows:
> spark.memory.useLegacyMode=false
> spark.memory.fraction=0.75
> spark.memory.storageFraction=0.2
> Log information is as follows:
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: storageMemoryPool.memoryFree 0.000000
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: onHeapExecutionMemoryPool.memoryFree 5663.325877
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: storageMemoryPool.memoryUsed 0.299123 M
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: onHeapExecutionMemoryPool.memoryUsed 0.000000
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: storageMemoryPool.poolSize 0.299123
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: onHeapExecutionMemoryPool.poolSize 5663.325877
> According to the configuration, storageMemoryPool.poolSize should be at
> least 1 GB, but the log shows only 0.299123 MB, so the storage pool has
> shrunk below its configured minimum.
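
As a rough check of the 1G figure in the report, the expected minimum can be
recomputed from the logged pool sizes. This sketch assumes that the on-heap
execution and storage pools together make up the whole unified region, and
that storageRegionSize is that total multiplied by
spark.memory.storageFraction. StorageRegionCheck and storageRegionMb are
hypothetical names used only for illustration.

```scala
object StorageRegionCheck {
  // Assumption: unified region = execution pool + storage pool (both in MB),
  // and the storage region is that total times spark.memory.storageFraction.
  def storageRegionMb(executionPoolMb: Double,
                      storagePoolMb: Double,
                      storageFraction: Double): Double =
    (executionPoolMb + storagePoolMb) * storageFraction

  def main(args: Array[String]): Unit = {
    val executionPoolMb = 5663.325877 // onHeapExecutionMemoryPool.poolSize from the log
    val storagePoolMb = 0.299123      // storageMemoryPool.poolSize from the log
    val storageFraction = 0.2         // spark.memory.storageFraction from the report

    val expectedMb = storageRegionMb(executionPoolMb, storagePoolMb, storageFraction)
    // Roughly 1132.7 MB expected, against about 0.3 MB observed.
    println(s"expected storage region: $expectedMb MB, observed pool: $storagePoolMb MB")
  }
}
```

Under these assumptions the configured storage region is a little over 1 GB,
which matches the "at least 1G" expectation stated in the report.
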



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
