wForget commented on code in PR #9521:
URL: https://github.com/apache/incubator-gluten/pull/9521#discussion_r2080764998


##########
backends-velox/src/main/scala/org/apache/spark/sql/execution/BroadcastUtils.scala:
##########
@@ -172,19 +172,21 @@ object BroadcastUtils {
     if (filtered.isEmpty) {
       return ColumnarBatchSerializeResult.EMPTY
     }
-    val handleArray =
-      filtered.map(b => ColumnarBatches.getNativeHandle(BackendsApiManager.getBackendName, b))
-    val serializeResult =
-      try {
-        ColumnarBatchSerializerJniWrapper
-          .create(
-            Runtimes
+    var rowNums = 0
+    val values = filtered.map(
+      b => {
+        val handle = ColumnarBatches.getNativeHandle(BackendsApiManager.getBackendName, b)
+        rowNums += b.numRows()
+        try {
+          ColumnarBatchSerializerJniWrapper
+            .create(Runtimes
               .contextInstance(BackendsApiManager.getBackendName, "BroadcastUtils#serializeStream"))
-          .serialize(handleArray)
-      } finally {
-        filtered.foreach(ColumnarBatches.release)
-      }
-    serializeResult
+            .serialize(handle)
+        } finally {
+          ColumnarBatches.release(b)

Review Comment:
   > Did you observe a decrement of peak off-heap usage after this change?
   
   After this change, the job no longer hits an off-heap OOM. Is that enough to indicate that peak off-heap memory has decreased?
   
   > As shown in code, filtered is still of the type Array[ColumnarBatch] which may cause high usage from the beginning of the loop.
   
   Makes sense, that would indeed be an improvement. I will try to convert it to an iterator.
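
A minimal sketch of the iterator idea discussed above, not the actual change in this PR. `Batch`, `Tracker`, and `serialize` are hypothetical stand-ins for Gluten's `ColumnarBatch`, off-heap accounting, and the JNI serializer; the only point is that a lazily produced batch can be serialized and released before the next one is materialized.

```scala
// Hypothetical stand-ins for illustration only; none of these are Gluten APIs.
object LazyBroadcastSketch {
  // Counts how many batches are alive at once, as a proxy for off-heap usage.
  final class Tracker {
    var live = 0
    var peak = 0
    def acquire(): Unit = { live += 1; peak = math.max(peak, live) }
    def release(): Unit = live -= 1
  }

  // A batch "allocates" on construction and "frees" on release().
  final class Batch(val numRows: Int, tracker: Tracker) {
    tracker.acquire()
    def release(): Unit = tracker.release()
  }

  // Stand-in for the per-batch native serialize call.
  def serialize(b: Batch): Array[Byte] = Array.fill(b.numRows)(0.toByte)

  // Pull batches one at a time: each batch is released before the next
  // one is requested from the iterator, so at most one is alive.
  def serializeStream(batches: Iterator[Batch]): (Int, Seq[Array[Byte]]) = {
    var rowNums = 0
    val values = batches.map { b =>
      try {
        rowNums += b.numRows
        serialize(b)
      } finally {
        b.release()
      }
    }.toList // forces the iterator; only the serialized bytes are retained
    (rowNums, values)
  }
}
```

With an input such as `Iterator(3, 5, 2).map(n => new Batch(n, tracker))`, the tracker's peak stays at one live batch, whereas calling `.toArray` on the same input first would keep all three batches alive at once.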



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

