WangGuangxin opened a new pull request, #9035: URL: https://github.com/apache/incubator-gluten/pull/9035
## What changes were proposed in this pull request?

We found that `ColumnarPartialProject` is very sensitive to batch size: for small input batches, its performance can be much worse than a total fallback. Increasing the batch size, however, can yield significant performance improvements.

One example from our case:

Before appending `VeloxResizeBatches` (average batch size is 26,341,825,212 / 2,775,903,310 = about 9 rows):

<img width="870" alt="image" src="https://github.com/user-attachments/assets/95cca983-17ef-4c7d-bfc6-e4499d0aa925" />

After appending `VeloxResizeBatches` (average batch size is 26,341,825,212 / 25,089,811 = about 1049 rows):

<img width="683" alt="image" src="https://github.com/user-attachments/assets/06cbb146-45bb-499f-b97b-56f8f1c64743" />

Although combining batches has a cost, the whole stage's task time dropped from 2306h to 1029h.

(Fixes: \#9034)

## How was this patch tested?

Manually.
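
To illustrate the batch-resizing idea described in this pull request, here is a minimal Python sketch. It is not the Gluten or Velox implementation: `resize_batches`, `average_batch_size`, and the `min_rows` parameter are hypothetical names, and a batch is modeled as a plain list of rows rather than a columnar batch. It only shows how coalescing many tiny batches reduces the number of batches a downstream operator has to process, and it reproduces the average batch sizes quoted in the description.

```python
from typing import Iterable, Iterator, List


def resize_batches(batches: Iterable[List[int]], min_rows: int) -> Iterator[List[int]]:
    """Coalesce small batches until each emitted batch has at least min_rows rows.

    Hypothetical stand-in for a batch-resizing step. Row order is preserved,
    and the final batch may be smaller than min_rows.
    """
    buffer: List[int] = []
    for batch in batches:
        buffer.extend(batch)
        if len(buffer) >= min_rows:
            yield buffer
            buffer = []
    if buffer:
        # Flush whatever is left once the input is exhausted.
        yield buffer


def average_batch_size(total_rows: int, num_batches: int) -> float:
    """Average rows per batch, as computed in the PR description."""
    return total_rows / num_batches


# 1000 input batches of 9 rows each, similar in shape to the "before" case.
small = [list(range(i * 9, (i + 1) * 9)) for i in range(1000)]
resized = list(resize_batches(small, min_rows=1024))

print(len(small), "batches in,", len(resized), "batches out")

# Figures quoted in the description (total rows / number of batches).
print(average_batch_size(26_341_825_212, 2_775_903_310))
print(average_batch_size(26_341_825_212, 25_089_811))
```

This sketch emits a batch as soon as the buffer reaches the threshold, so output batches can slightly exceed `min_rows`. A production resizer would likely also cap the maximum batch size and avoid copying rows one list at a time; those details are left out here.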