Re: [PR] [GLUTEN-5438] feat: Dynamically sizing off-heap memory [incubator-gluten]

via GitHub Sun, 19 May 2024 18:53:08 -0700


ulysses-you commented on code in PR #5439:
URL: https://github.com/apache/incubator-gluten/pull/5439#discussion_r1606162953



##########
gluten-core/src/main/scala/org/apache/gluten/GlutenPlugin.scala:
##########
@@ -164,20 +165,71 @@ private[gluten] class GlutenDriverPlugin extends 
DriverPlugin with Logging {
     // task slots
     val taskSlots = SparkResourceUtil.getTaskSlots(conf)
 
-    // Optimistic off-heap sizes, assuming all storage memory can be borrowed 
into execution memory
-    // pool, regardless of Spark option spark.memory.storageFraction.
-    val offHeapSize = conf.getSizeAsBytes(GlutenConfig.GLUTEN_OFFHEAP_SIZE_KEY)
+    var onHeapSize: Long =
+      if (conf.contains(GlutenConfig.GLUTEN_ONHEAP_SIZE_KEY)) {
+        conf.getSizeAsBytes(GlutenConfig.GLUTEN_ONHEAP_SIZE_KEY)
+      } else {
+        // 1GB default
+        1024 * 1024 * 1024
+      }
+
+    // If dynamic off-heap sizing is enabled, the off-heap size is calculated 
based on the on-heap
+    // size. Otherwise, the off-heap size is set to the value specified by the 
user (if any).
+    // Note that this means that we will IGNORE the off-heap size specified by 
the user if the
+    // dynamic off-heap feature is enabled.
+    var offHeapSize: Long =
+      if (conf.getBoolean(GlutenConfig.GLUTEN_DYNAMIC_OFFHEAP_SIZING_ENABLED, 
false)) {
+        // Since when dynamic off-heap sizing is enabled, we commingle on-heap
+        // and off-heap memory, we set the off-heap size to the usable on-heap 
size. We will
+        // size it with a memory fraction, which can be aggressively set, but 
the default
+        // is using the same way that Spark sizes on-heap memory:
+        //
+        // spark.gluten.memory.dynamic.offHeap.sizing.memory.fraction *
+        //    (spark.executor.memory - 300MB).
+        //
+        // We will be careful to use the same configuration settings as Spark 
to ensure
+        // that we are sizing the off-heap memory in the same way as Spark 
sizes on-heap memory.
+        // The 300MB value, unfortunately, is hard-coded in Spark code.
+        ((onHeapSize - (300 * 1024 * 1024)) *
+          
conf.getDouble(GlutenConfig.GLUTEN_DYNAMIC_OFFHEAP_SIZING_MEMORY_FRACTION, 
0.6d)).toLong
+      } else if (conf.contains(GlutenConfig.GLUTEN_OFFHEAP_SIZE_KEY)) {
+        // Optimistic off-heap sizes, assuming all storage memory can be 
borrowed into execution
+        // memory pool, regardless of Spark option 
spark.memory.storageFraction.
+        conf.getSizeAsBytes(GlutenConfig.GLUTEN_OFFHEAP_SIZE_KEY)
+      } else {
+        // Default Spark Value.
+        0L
+      }
+
     conf.set(GlutenConfig.GLUTEN_OFFHEAP_SIZE_IN_BYTES_KEY, 
offHeapSize.toString)
+    conf.set(GlutenConfig.GLUTEN_OFFHEAP_SIZE_KEY, offHeapSize.toString)
+
     val offHeapPerTask = offHeapSize / taskSlots
     conf.set(GlutenConfig.GLUTEN_TASK_OFFHEAP_SIZE_IN_BYTES_KEY, 
offHeapPerTask.toString)
 
-    // Pessimistic off-heap sizes, with the assumption that all non-borrowable 
storage memory
-    // determined by spark.memory.storageFraction was used.
-    val fraction = 1.0d - conf.getDouble("spark.memory.storageFraction", 0.5d)
-    val conservativeOffHeapPerTask = (offHeapSize * fraction).toLong / 
taskSlots
-    conf.set(
-      GlutenConfig.GLUTEN_CONSERVATIVE_TASK_OFFHEAP_SIZE_IN_BYTES_KEY,
-      conservativeOffHeapPerTask.toString)
+    // If we are using dynamic off-heap sizing, we should also enable off-heap 
memory
+    // officially.
+    if (conf.getBoolean(GlutenConfig.GLUTEN_DYNAMIC_OFFHEAP_SIZING_ENABLED, 
false)) {
+      conf.set(GlutenConfig.GLUTEN_OFFHEAP_ENABLED, "true")

Review Comment:
   I try to do an explanation if nobody ack here. Author can correct me if 
anything worng.
   
   In short, the `spark.memory.offHeap.size` **is not used** if 
`spark.gluten.memory.dynamic.offHeap.sizing.enabled` is enabled.
   
   The magic of this pr did is that, it said we can leverage the power of JVM 
to control both onheap and offheap memory. We only need to configure 
`spark.executor.memory` which is the unified memory for onheap and offheap. I 
think the core idea is that, if there is enough free onhep memory, then we can 
borrow(logical borrow) it to offheap side and prevent the onheap memory to be 
allocated.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [GLUTEN-5438] feat: Dynamically sizing off-heap memory [incubator-gluten]

Reply via email to