andygrove commented on code in PR #2235:
URL: https://github.com/apache/datafusion-comet/pull/2235#discussion_r2304332108


##########
spark/src/main/scala/org/apache/spark/sql/comet/execution/shuffle/NativeBatchDecoderIterator.scala:
##########
@@ -182,14 +182,22 @@ case class NativeBatchDecoderIterator(
           currentBatch = null
         }
         in.close()
+        resetDataBuf()
         isClosed = true
       }
     }
   }
 }
 
 object NativeBatchDecoderIterator {
+
+  private val INITIAL_BUFFER_SIZE = 128 * 1024
+
   private val threadLocalDataBuf: ThreadLocal[ByteBuffer] = ThreadLocal.withInitial(() => {
-    ByteBuffer.allocateDirect(128 * 1024)
+    ByteBuffer.allocateDirect(INITIAL_BUFFER_SIZE)
   })
+
+  private def resetDataBuf(): Unit = {
+    threadLocalDataBuf.set(ByteBuffer.allocateDirect(INITIAL_BUFFER_SIZE))

Review Comment:
   I tested this locally with TPC-H q1 and saw the reallocation happen 353
times, but the buffer never grew beyond the initial allocation, so the
current implementation adds unnecessary overhead.
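
   One possible way to avoid the overhead, as a minimal sketch rather than a
proposed patch: only reallocate when the buffer has actually grown past its
initial size. The object name `DataBufSketch` and the helpers `ensureCapacity`
and `currentCapacity` are hypothetical and exist only to make the example
self-contained; the real decoder grows its buffer elsewhere.

```scala
import java.nio.ByteBuffer

// Hypothetical stand-in for the NativeBatchDecoderIterator companion object.
object DataBufSketch {

  private val InitialBufferSize = 128 * 1024

  private val threadLocalDataBuf: ThreadLocal[ByteBuffer] =
    ThreadLocal.withInitial(() => ByteBuffer.allocateDirect(InitialBufferSize))

  // Assumed growth path: swap in a larger direct buffer when a batch
  // does not fit in the current one.
  def ensureCapacity(required: Int): ByteBuffer = {
    val buf = threadLocalDataBuf.get()
    if (buf.capacity() < required) {
      val grown = ByteBuffer.allocateDirect(required)
      threadLocalDataBuf.set(grown)
      grown
    } else {
      buf
    }
  }

  // Shrink back to the initial size only if the buffer grew;
  // otherwise keep the existing allocation and skip the reallocation.
  def resetDataBuf(): Unit = {
    if (threadLocalDataBuf.get().capacity() > InitialBufferSize) {
      threadLocalDataBuf.set(ByteBuffer.allocateDirect(InitialBufferSize))
    }
  }

  // Helper for inspection in tests; not part of the original code.
  def currentCapacity: Int = threadLocalDataBuf.get().capacity()
}
```

   With a check like this, a workload such as the q1 run above, where the
buffer stays at its initial size, would not reallocate on close. Note that the
memory of a replaced direct buffer is only released once the old buffer is
garbage collected.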



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


