tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r816843877
##########
File path: docs/configuration.md
##########
@@ -287,6 +297,15 @@ of the most common options to set are:
</td>
<td>2.3.0</td>
</tr>
+<tr>
+ <td><code>spark.executor.memoryOverheadFactor</code></td>
+ <td>0.10</td>
+ <td>
+ Fraction of executor memory to be allocated as additional non-heap memory per executor process.
Review comment:
Same thing here: make this description match the config's doc text, and combine the two.
##########
File path: docs/configuration.md
##########
@@ -198,6 +198,16 @@ of the most common options to set are:
</td>
<td>2.3.0</td>
</tr>
+<tr>
+ <td><code>spark.driver.memoryOverheadFactor</code></td>
+ <td>0.10</td>
+ <td>
+ Fraction of driver memory to be allocated as additional non-heap memory per driver process
Review comment:
This text doesn't match the config text; I would expect them to be the
same. I think combining them would be good: the description in the config is
clearer, but this one also adds the "in cluster mode" note and the precedence
line.
##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,17 @@ package object config {
.bytesConf(ByteUnit.MiB)
.createOptional
+ private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+ ConfigBuilder("spark.driver.memoryOverheadFactor")
+ .doc("This sets the Memory Overhead Factor on the driver that will allocate memory to " +
+ "non-JVM memory, which includes off-heap memory allocations, non-JVM tasks, various " +
+ "systems processes, and tmpfs-based local directories.")
Review comment:
So this doesn't cover the weirdness on k8s of non-JVM jobs
defaulting to 0.4. I also don't see the Kubernetes config
spark.kubernetes.memoryOverheadFactor mentioned here.
##########
File path: resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStepSuite.scala
##########
@@ -188,21 +188,21 @@ class BasicDriverFeatureStepSuite extends SparkFunSuite {
// Memory overhead tests. Tuples are:
// test name, main resource, overhead factor, expected factor
Seq(
- ("java", JavaMainAppResource(None), None, MEMORY_OVERHEAD_FACTOR.defaultValue.get),
+ ("java", JavaMainAppResource(None), None, DRIVER_MEMORY_OVERHEAD_FACTOR.defaultValue.get),
Review comment:
is there a test to make sure the fallback to the original config works?
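A minimal sketch of what such a fallback test could check. It uses a plain Map as a hypothetical stand-in for SparkConf, and it assumes the "original config" is spark.kubernetes.memoryOverheadFactor and that the new key wins when both are set; the object and method names are illustrative and are not taken from the PR.

```scala
// Hypothetical stand-in for the real lookup; conf is a plain Map, not SparkConf.
object FallbackSketch {
  val NewKey = "spark.driver.memoryOverheadFactor"
  // Assumed to be the "original config" the fallback refers to.
  val LegacyKey = "spark.kubernetes.memoryOverheadFactor"
  // Default taken from the docs table in this PR.
  val DefaultFactor = 0.10

  // Prefer the new key, then the legacy key, then the default.
  def driverOverheadFactor(conf: Map[String, String]): Double =
    conf.get(NewKey)
      .orElse(conf.get(LegacyKey))
      .map(_.toDouble)
      .getOrElse(DefaultFactor)
}
```

A test along these lines would set only the legacy key and assert that its value is used, then set both keys and assert that the new one takes effect.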
##########
File path: resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
##########
@@ -706,4 +706,18 @@ class YarnAllocatorSuite extends SparkFunSuite with Matchers with BeforeAndAfter
sparkConf.set(MEMORY_OFFHEAP_SIZE, originalOffHeapSize)
}
}
+
+ test("SPARK-38194: Configurable memory overhead factor") {
+ val executorMemory = sparkConf.get(EXECUTOR_MEMORY).toInt
+ try {
+ sparkConf.set(EXECUTOR_MEMORY_OVERHEAD_FACTOR, 0.5)
Review comment:
Do we test that spark.driver/executor.memoryOverhead takes precedence
over the overhead factor? If not, can we add a test?
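To make the expected precedence concrete, here is a rough sketch of the rule such a test would pin down. It is written against plain values rather than SparkConf. The 384 MiB floor is an assumption about the minimum overhead, and the object and method names are hypothetical.

```scala
object OverheadSketch {
  // Assumed minimum overhead in MiB; treat the exact value as illustrative.
  val MinOverheadMiB = 384L

  // An explicit memoryOverhead wins; otherwise derive it from the factor,
  // never going below the assumed minimum.
  def overheadMiB(
      memoryMiB: Long,
      explicitOverheadMiB: Option[Long],
      factor: Double): Long =
    explicitOverheadMiB.getOrElse(
      math.max((factor * memoryMiB).toLong, MinOverheadMiB))
}
```

With a factor of 0.5 and 4096 MiB of memory, the derived overhead would be 2048 MiB, while an explicit overhead of 512 MiB should come back unchanged regardless of the factor.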
##########
File path: resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
##########
@@ -280,9 +280,12 @@ private[yarn] class YarnAllocator(
// track the resource profile if not already there
getOrUpdateRunningExecutorForRPId(rp.id)
logInfo(s"Resource profile ${rp.id} doesn't exist, adding it")
+
+ val memoryOverheadFactor = sparkConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)
Review comment:
We are reading this on every call; just make it a class variable and read it
once.
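Roughly what I have in mind, as a self-contained sketch. CountingConf and AllocatorSketch are hypothetical stand-ins for SparkConf and YarnAllocator, used only to show the lookup happening once at construction.

```scala
// Hypothetical conf wrapper that counts lookups, to make the cost visible.
class CountingConf(values: Map[String, Double]) {
  var lookups = 0
  def get(key: String): Double = { lookups += 1; values(key) }
}

// Hypothetical stand-in for the allocator class.
class AllocatorSketch(conf: CountingConf) {
  // Read once when the class is constructed instead of on every call.
  private val memoryOverheadFactor =
    conf.get("spark.executor.memoryOverheadFactor")

  def overheadFor(executorMemoryMiB: Long): Long =
    (memoryOverheadFactor * executorMemoryMiB).toLong
}
```

However many times overheadFor is called, the conf is consulted only once.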
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]