LuciferYang commented on a change in pull request #25309:
[SPARK-28577][YARN]Resource capability requested for each executor add
offHeapMemorySize
URL: https://github.com/apache/spark/pull/25309#discussion_r310355911
##########
File path:
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala
##########
@@ -184,4 +184,29 @@ object YarnSparkHadoopUtil {
ConverterUtils.toContainerId(containerIdString)
}
+  /**
+   * If MEMORY_OFFHEAP_ENABLED is true, we should ensure executorOverheadMemory requested value
+   * is not less than MEMORY_OFFHEAP_SIZE, otherwise the memory resource requested for executor
+   * may be not enough.
+   */
+  def executorMemoryOverheadRequested(sparkConf: SparkConf): Int = {
+    val executorMemory = sparkConf.get(EXECUTOR_MEMORY).toInt
+    val overhead = sparkConf.get(EXECUTOR_MEMORY_OVERHEAD).getOrElse(
+      math.max((MEMORY_OVERHEAD_FACTOR * executorMemory).toInt, MEMORY_OVERHEAD_MIN)).toInt
+    val offHeap = if (sparkConf.get(MEMORY_OFFHEAP_ENABLED)) {
+      val size =
+        sparkConf.getSizeAsMb(MEMORY_OFFHEAP_SIZE.key, MEMORY_OFFHEAP_SIZE.defaultValueString)
+      require(size > 0,
+        s"${MEMORY_OFFHEAP_SIZE.key} must be > 0 when ${MEMORY_OFFHEAP_ENABLED.key} == true")
+      if (size > overhead) {
+        logWarning(s"The value of ${MEMORY_OFFHEAP_SIZE.key}(${size}MB) will be used as " +
+          s"executorMemoryOverhead to request resource to ensure that Executor has enough memory " +
+          s"to use. It is recommended that the configuration value of " +
+          s"${EXECUTOR_MEMORY_OVERHEAD.key} should be no less than ${MEMORY_OFFHEAP_SIZE.key} " +
+          s"when ${MEMORY_OFFHEAP_ENABLED.key} is true.")
+      }
+      size
+    } else 0
+    math.max(overhead, offHeap).toInt
Review comment:
@beliefer I know, but PySpark worker memory is now configured independently
through `spark.executor.pyspark.memory` since
[21977](https://github.com/apache/spark/pull/21977), so memoryOverhead now only
contains `offHeapMemory` and `otherMemory`.
Before
[c0ef860](https://github.com/apache/spark/pull/25309/commits/c0ef86050b69d34b6f93b569219118e99ec0dd7b)
the approach was `if the user configures offHeapSize > memoryOverhead, use
offHeapSize instead of memoryOverhead to request resources and log a warning
telling the user to configure memoryOverhead larger than offHeapSize`. That
still does not guarantee that enough memory is requested, because
`otherMemory` is not accounted for, and it's hard to determine the size of
`otherMemory`.
Can we consider `if the user configures offHeapSize > memoryOverhead, use
offHeapSize + 384m instead of memoryOverhead to request resources`, where
`384m` is `MEMORY_OVERHEAD_MIN`?
Can we assume that?
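A rough sketch of what that proposal could look like, written as standalone Scala so it does not depend on `SparkConf`. The object name, the function name and its parameters are hypothetical and are not part of this PR; the 0.10 overhead factor is an assumed value for `MEMORY_OVERHEAD_FACTOR`, and 384 is the `MEMORY_OVERHEAD_MIN` mentioned above. All sizes are in MB.

```scala
// Hypothetical standalone sketch, not the code in this PR.
object OverheadSketch {
  // Assumed value of MEMORY_OVERHEAD_FACTOR.
  val MemoryOverheadFactor: Double = 0.10
  // MEMORY_OVERHEAD_MIN (384m), used here as the stand-in for otherMemory.
  val MemoryOverheadMinMb: Int = 384

  def requestedOverheadMb(
      executorMemoryMb: Int,
      configuredOverheadMb: Option[Int],
      offHeapEnabled: Boolean,
      offHeapSizeMb: Int): Int = {
    // Same default as in the diff: max(factor * executorMemory, minimum).
    val overhead = configuredOverheadMb.getOrElse(
      math.max((MemoryOverheadFactor * executorMemoryMb).toInt, MemoryOverheadMinMb))
    if (!offHeapEnabled) {
      overhead
    } else {
      require(offHeapSizeMb > 0, "off-heap size must be > 0 when off-heap is enabled")
      if (offHeapSizeMb > overhead) {
        // Proposed change: leave room for otherMemory on top of offHeapSize.
        offHeapSizeMb + MemoryOverheadMinMb
      } else {
        overhead
      }
    }
  }
}
```

For example, with a 4096 MB executor, no explicit memoryOverhead and a 1024 MB off-heap size, this requests 1024 + 384 = 1408 MB, where the approach before c0ef860 would request 1024 MB.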