andygrove commented on code in PR #2235:
URL: https://github.com/apache/datafusion-comet/pull/2235#discussion_r2304332108

##########
spark/src/main/scala/org/apache/spark/sql/comet/execution/shuffle/NativeBatchDecoderIterator.scala:
##########
@@ -182,14 +182,22 @@ case class NativeBatchDecoderIterator(
           currentBatch = null
         }
         in.close()
+        resetDataBuf()
         isClosed = true
       }
     }
   }
 }

 object NativeBatchDecoderIterator {
+
+  private val INITIAL_BUFFER_SIZE = 128 * 1024
+
   private val threadLocalDataBuf: ThreadLocal[ByteBuffer] = ThreadLocal.withInitial(() => {
-    ByteBuffer.allocateDirect(128 * 1024)
+    ByteBuffer.allocateDirect(INITIAL_BUFFER_SIZE)
   })
+
+  private def resetDataBuf(): Unit = {
+    threadLocalDataBuf.set(ByteBuffer.allocateDirect(INITIAL_BUFFER_SIZE))

Review Comment:
   I tested locally with TPC-H q1 and saw this reallocation happen 353 times, but the buffer never grew beyond the initial allocation, so the current implementation adds unnecessary overhead.
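
One possible way to avoid the overhead described in this comment is to reallocate only when the buffer has actually grown past its initial size. The following is a minimal standalone sketch of that idea, not the change adopted in the PR. The object name `DataBufSketch` and the helpers `ensureCapacity` and `currentBuf` are hypothetical and exist only to make the example runnable. `INITIAL_BUFFER_SIZE`, `threadLocalDataBuf`, and `resetDataBuf` mirror the names in the diff, but here `resetDataBuf` returns a `Boolean` purely so the behavior can be observed.

```scala
import java.nio.ByteBuffer

// Hypothetical stand-in for the companion object shown in the diff.
object DataBufSketch {
  private val INITIAL_BUFFER_SIZE = 128 * 1024

  private val threadLocalDataBuf: ThreadLocal[ByteBuffer] =
    ThreadLocal.withInitial(() => ByteBuffer.allocateDirect(INITIAL_BUFFER_SIZE))

  // Hypothetical accessor, used only to inspect the current thread's buffer.
  def currentBuf: ByteBuffer = threadLocalDataBuf.get()

  // Hypothetical helper that simulates a decoder needing a larger buffer:
  // it swaps in a bigger direct buffer when the current one is too small.
  def ensureCapacity(required: Int): ByteBuffer = {
    val buf = threadLocalDataBuf.get()
    if (buf.capacity() >= required) {
      buf
    } else {
      val grown = ByteBuffer.allocateDirect(required)
      threadLocalDataBuf.set(grown)
      grown
    }
  }

  // Replace the buffer only if it has grown beyond the initial allocation.
  // Returns true when a reallocation happened (an assumption of this sketch;
  // the method in the diff returns Unit).
  def resetDataBuf(): Boolean = {
    if (threadLocalDataBuf.get().capacity() > INITIAL_BUFFER_SIZE) {
      threadLocalDataBuf.set(ByteBuffer.allocateDirect(INITIAL_BUFFER_SIZE))
      true
    } else {
      false
    }
  }
}
```

With a guard like this, a workload in which the buffer never grows (as reported above for TPC-H q1) would skip the reallocation on every `close()`, while a thread that had grown its buffer would still release the larger allocation.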