tgravescs commented on a change in pull request #35504:
URL: https://github.com/apache/spark/pull/35504#discussion_r816843877



##########
File path: docs/configuration.md
##########
@@ -287,6 +297,15 @@ of the most common options to set are:
   </td>
   <td>2.3.0</td>
 </tr>
+<tr>
+  <td><code>spark.executor.memoryOverheadFactor</code></td>
+  <td>0.10</td>
+  <td>
+    Fraction of executor memory to be allocated as additional non-heap memory 
per executor process.

Review comment:
       Same thing here: the description should match the config's doc text, so combine the two.

##########
File path: docs/configuration.md
##########
@@ -198,6 +198,16 @@ of the most common options to set are:
   </td>
   <td>2.3.0</td>
 </tr>
+<tr>
+  <td><code>spark.driver.memoryOverheadFactor</code></td>
+  <td>0.10</td>
+  <td>
+    Fraction of driver memory to be allocated as additional non-heap memory 
per driver process

Review comment:
       This text doesn't match the config's doc text; I would expect them to be the 
same. I think combining them would be good: the config's description is 
clearer, but this one also adds the "in cluster mode" note and the precedence 
line. 

##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -105,6 +105,17 @@ package object config {
     .bytesConf(ByteUnit.MiB)
     .createOptional
 
+  private[spark] val DRIVER_MEMORY_OVERHEAD_FACTOR =
+    ConfigBuilder("spark.driver.memoryOverheadFactor")
+      .doc("This sets the Memory Overhead Factor on the driver that will 
allocate memory to " +
+        "non-JVM memory, which includes off-heap memory allocations, non-JVM 
tasks, various " +
+        "systems processes, and tmpfs-based local directories.")

Review comment:
       So this doesn't cover the weirdness on k8s of non-JVM jobs 
defaulting to 0.4. I also don't see the Kubernetes config 
spark.kubernetes.memoryOverheadFactor mentioned.
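
To make concrete what the doc text would need to describe, here is a minimal sketch of one possible resolution order. It is an assumption for illustration, not the PR's code: `OverheadFactorSketch`, `effectiveFactor`, and the plain `Map` standing in for the Spark conf are hypothetical names, and the ordering (new key, then the Kubernetes key, then a default that depends on whether the app is non-JVM) is just one way the pieces mentioned in this review could fit together.

```scala
// Hypothetical sketch: resolves an overhead factor from a plain Map that
// stands in for the Spark conf. Not Spark's actual implementation.
object OverheadFactorSketch {
  // Defaults as discussed in this review: 0.10 in general, 0.4 for non-JVM jobs on k8s.
  val JvmDefaultFactor = 0.10
  val NonJvmDefaultFactor = 0.40

  // The Kubernetes-specific key mentioned above, used here as the fallback.
  val KubernetesKey = "spark.kubernetes.memoryOverheadFactor"

  def effectiveFactor(
      conf: Map[String, String],
      newKey: String,
      isNonJvmApp: Boolean): Double = {
    conf.get(newKey)                      // 1. explicit new-style setting
      .orElse(conf.get(KubernetesKey))    // 2. assumed fallback to the k8s key
      .map(_.toDouble)
      .getOrElse(if (isNonJvmApp) NonJvmDefaultFactor else JvmDefaultFactor) // 3. default
  }
}
```

Under these assumptions, a non-JVM job with neither key set resolves to 0.4, which is the case the doc string currently leaves out.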

##########
File path: 
resource-managers/kubernetes/core/src/test/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStepSuite.scala
##########
@@ -188,21 +188,21 @@ class BasicDriverFeatureStepSuite extends SparkFunSuite {
   // Memory overhead tests. Tuples are:
   //   test name, main resource, overhead factor, expected factor
   Seq(
-    ("java", JavaMainAppResource(None), None, 
MEMORY_OVERHEAD_FACTOR.defaultValue.get),
+    ("java", JavaMainAppResource(None), None, 
DRIVER_MEMORY_OVERHEAD_FACTOR.defaultValue.get),

Review comment:
       Is there a test to make sure the fallback to the original config works?

##########
File path: 
resource-managers/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
##########
@@ -706,4 +706,18 @@ class YarnAllocatorSuite extends SparkFunSuite with 
Matchers with BeforeAndAfter
       sparkConf.set(MEMORY_OFFHEAP_SIZE, originalOffHeapSize)
     }
   }
+
+  test("SPARK-38194: Configurable memory overhead factor") {
+    val executorMemory = sparkConf.get(EXECUTOR_MEMORY).toInt
+    try {
+      sparkConf.set(EXECUTOR_MEMORY_OVERHEAD_FACTOR, 0.5)

Review comment:
       Do we test that spark.driver/executor.memoryOverhead takes precedence 
over the overhead factor? If not, can we add a test?
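
A minimal sketch of the behavior such a test could pin down, written as a standalone function rather than against `YarnAllocator`. `MemoryOverheadSketch` and `overheadMiB` are hypothetical names; the sketch assumes an explicit overhead value wins outright, and that the factor-based value is otherwise floored at Spark's documented 384 MiB minimum.

```scala
// Hypothetical sketch of the precedence rule, independent of Spark classes.
object MemoryOverheadSketch {
  // Minimum overhead in MiB, per the Spark configuration docs.
  val MinOverheadMiB = 384L

  def overheadMiB(
      memoryMiB: Long,
      explicitOverheadMiB: Option[Long],
      factor: Double): Long = {
    // An explicitly configured overhead takes precedence over the factor;
    // otherwise use factor * memory, but never less than the minimum.
    explicitOverheadMiB.getOrElse(
      math.max((factor * memoryMiB).toLong, MinOverheadMiB))
  }
}
```

With 4096 MiB of executor memory and a factor of 0.5, the factor path yields 2048 MiB, while setting an explicit overhead of 512 MiB should yield 512 MiB regardless of the factor.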
   
   
   

##########
File path: 
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
##########
@@ -280,9 +280,12 @@ private[yarn] class YarnAllocator(
       // track the resource profile if not already there
       getOrUpdateRunningExecutorForRPId(rp.id)
       logInfo(s"Resource profile ${rp.id} doesn't exist, adding it")
+
+      val memoryOverheadFactor = sparkConf.get(EXECUTOR_MEMORY_OVERHEAD_FACTOR)

Review comment:
       We are getting this on every call; just make it a class variable and 
get it once.
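
A minimal sketch of the suggestion, using stand-in classes rather than the real `YarnAllocator` and `SparkConf`. `CountingConf` and `AllocatorSketch` are hypothetical; the read counter exists only to show that the lookup happens once, at construction.

```scala
// Hypothetical stand-in for a conf object; counts how often it is read.
final class CountingConf(values: Map[String, Double]) {
  var reads: Int = 0
  def get(key: String): Double = {
    reads += 1
    values(key)
  }
}

// Hypothetical stand-in for the allocator.
final class AllocatorSketch(conf: CountingConf) {
  // Read the factor once when the instance is created, not on every call.
  private val memoryOverheadFactor: Double =
    conf.get("spark.executor.memoryOverheadFactor")

  // Reuses the cached factor; does not touch the conf again.
  def overheadFor(memoryMiB: Long): Long =
    (memoryOverheadFactor * memoryMiB).toLong
}
```

Calling `overheadFor` any number of times leaves `reads` at 1, which is the effect the hoisted field is meant to have.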




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


