[
https://issues.apache.org/jira/browse/SPARK-20237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-20237:
------------------------------------
Assignee: Apache Spark
> Spark-1.6 current and later versions of memory management issues
> ----------------------------------------------------------------
>
> Key: SPARK-20237
> URL: https://issues.apache.org/jira/browse/SPARK-20237
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.6.0, 1.6.1, 1.6.2, 1.6.3, 2.0.0, 2.0.1, 2.0.2, 2.1.0
> Environment: java 1.7.0 scala-2.10.5 maven-3.3.9 hadoop-2.2.0
> spark-1.6.2
> Reporter: zhangwei72
> Assignee: Apache Spark
> Priority: Critical
> Labels: security
> Fix For: 1.6.2
>
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> In Spark 1.6 and later versions, there is a problem in the memory manager,
> UnifiedMemoryManager.
> The spark.memory.storageFraction setting is supposed to guarantee a minimum
> amount of storage memory.
> UnifiedMemoryManager computes how much memory execution can borrow from
> storage using:
> val memoryReclaimableFromStorage =
> math.max(storageMemoryPool.memoryFree,
> storageMemoryPool.poolSize - storageRegionSize)
> When storageMemoryPool.memoryFree > storageMemoryPool.poolSize -
> storageRegionSize, the first argument is chosen, that is, the storage pool
> shrinks by storageMemoryPool.memoryFree.
> Because storageMemoryPool.memoryFree > storageMemoryPool.poolSize -
> storageRegionSize, it follows that storageMemoryPool.poolSize -
> storageMemoryPool.memoryFree < storageRegionSize.
> After the shrink, storageMemoryPool.poolSize < storageRegionSize. But
> storageRegionSize is the minimum storage share defined by the framework, so
> this is a problem.
> To solve this problem, we propose defining the value as:
> val memoryReclaimableFromStorage =
> storageMemoryPool.poolSize - storageRegionSize
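
The two formulas in the report can be compared with a small standalone model. This is only a sketch: StoragePool, ReclaimSketch, and the sample numbers are hypothetical and are not Spark's classes, and it assumes execution borrows the full reclaimable amount.

```scala
// Hypothetical, simplified stand-in for Spark's storage pool (not Spark code).
final case class StoragePool(poolSize: Long, memoryUsed: Long) {
  def memoryFree: Long = poolSize - memoryUsed
}

object ReclaimSketch {
  // Formula quoted in the report as the current behaviour.
  def reclaimableCurrent(pool: StoragePool, storageRegionSize: Long): Long =
    math.max(pool.memoryFree, pool.poolSize - storageRegionSize)

  // Formula proposed in the report.
  def reclaimableProposed(pool: StoragePool, storageRegionSize: Long): Long =
    pool.poolSize - storageRegionSize

  // Assumption: execution borrows everything that is reclaimable.
  def poolSizeAfterBorrow(pool: StoragePool, reclaimable: Long): Long =
    pool.poolSize - reclaimable

  def main(args: Array[String]): Unit = {
    val storageRegionSize = 1000L // arbitrary units, illustrative only
    val pool = StoragePool(poolSize = 1000L, memoryUsed = 100L)

    val current = reclaimableCurrent(pool, storageRegionSize)
    val proposed = reclaimableProposed(pool, storageRegionSize)

    println(s"current:  borrow $current, pool size becomes ${poolSizeAfterBorrow(pool, current)}")
    println(s"proposed: borrow $proposed, pool size becomes ${poolSizeAfterBorrow(pool, proposed)}")
  }
}
```

With these sample values the first formula lets the pool shrink below storageRegionSize, while the proposed one keeps it at storageRegionSize, which is the difference the report describes.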
> Experimental proof:
> I added some log information to the UnifiedMemoryManager file as follows:
> logInfo("storageMemoryPool.memoryFree %f".format(storageMemoryPool.memoryFree/1024.0/1024.0))
> logInfo("onHeapExecutionMemoryPool.memoryFree %f".format(onHeapExecutionMemoryPool.memoryFree/1024.0/1024.0))
> logInfo("storageMemoryPool.memoryUsed %f".format(storageMemoryPool.memoryUsed/1024.0/1024.0))
> logInfo("onHeapExecutionMemoryPool.memoryUsed %f".format(onHeapExecutionMemoryPool.memoryUsed/1024.0/1024.0))
> logInfo("storageMemoryPool.poolSize %f".format(storageMemoryPool.poolSize/1024.0/1024.0))
> logInfo("onHeapExecutionMemoryPool.poolSize %f".format(onHeapExecutionMemoryPool.poolSize/1024.0/1024.0))
> I then ran the PageRank example. Its input file, 676 MB in size, was
> generated with BigDataBench, a benchmark suite from the Chinese Academy of
> Sciences for evaluating big data analysis systems. The job was submitted as
> follows:
> ./bin/spark-submit --class org.apache.spark.examples.SparkPageRank \
> --master yarn \
> --deploy-mode cluster \
> --num-executors 1 \
> --driver-memory 4g \
> --executor-memory 7g \
> --executor-cores 6 \
> --queue thequeue \
> ./examples/target/scala-2.10/spark-examples-1.6.2-hadoop2.2.0.jar \
> /test/Google_genGraph_23.txt 6
> The configuration is as follows:
> spark.memory.useLegacyMode=false
> spark.memory.fraction=0.75
> spark.memory.storageFraction=0.2
> Log information is as follows:
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: storageMemoryPool.memoryFree 0.000000
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: onHeapExecutionMemoryPool.memoryFree 5663.325877
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: storageMemoryPool.memoryUsed 0.299123 M
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: onHeapExecutionMemoryPool.memoryUsed 0.000000
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: storageMemoryPool.poolSize 0.299123
> 17/02/28 11:07:34 INFO memory.UnifiedMemoryManager: onHeapExecutionMemoryPool.poolSize 5663.325877
> According to the configuration, storageMemoryPool.poolSize should be at
> least 1 GB, but the log shows only 0.299123 MB, so there is an error.
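
The "at least 1 GB" figure can be checked against the logged values. The sketch below assumes that the unified on-heap region equals the sum of the two logged pool sizes and that the storage region is that total multiplied by spark.memory.storageFraction; StorageRegionCheck and its field names are hypothetical, chosen only for illustration.

```scala
object StorageRegionCheck {
  // Values copied from the log output in the report, in MB.
  val storagePoolMb: Double = 0.299123
  val executionPoolMb: Double = 5663.325877

  // spark.memory.storageFraction from the reported configuration.
  val storageFraction: Double = 0.2

  // Assumption: the unified region is the sum of the two on-heap pools.
  val totalMb: Double = storagePoolMb + executionPoolMb

  // Assumption: storage region = unified region * storageFraction.
  val storageRegionMb: Double = totalMb * storageFraction

  def main(args: Array[String]): Unit = {
    println(f"unified region:      $totalMb%.3f MB")
    println(f"storage region:      $storageRegionMb%.3f MB")
    println(f"logged storage pool: $storagePoolMb%.6f MB")
  }
}
```

Under these assumptions the storage region comes to roughly 1.1 GB, consistent with the reporter's expectation and far above the logged pool size.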