jinchengchenghh opened a new pull request, #11758: URL: https://github.com/apache/gluten/pull/11758
Cache batches on the CPU side and let the join threads fetch them one by one. For now the buffer size is controlled by `spark.gluten.sql.columnar.backend.velox.cudf.shuffleMaxPrefetchBytes`; later the size may be adjusted based on the remaining memory on the server.

Test: Tested locally on SF100, with the config adjusted to enable batch caching.

```
--conf spark.gluten.sql.columnar.backend.velox.cudf.batchSize=10000 \
--conf spark.gluten.sql.columnar.backend.velox.cudf.shuffleMaxPrefetchBytes=1024MB
```

The log prints `Prefetched 171 batches (24057900 bytes) before blocking on GPU lock`
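To illustrate the byte-bounded caching described above, here is a minimal Python sketch. It is not the code in this PR: `prefetch_batches` and `format_log` are hypothetical names, batches are modeled as plain `bytes` objects, and the stop rule (check the budget before pulling each batch, so the buffer can overshoot by at most one batch) is an assumption about how such a limit might be enforced.

```python
from collections import deque
from typing import Deque, Iterable, Tuple


def prefetch_batches(
    batches: Iterable[bytes], max_prefetch_bytes: int
) -> Tuple[Deque[bytes], int]:
    """Buffer batches in host memory until the byte budget is reached.

    Hypothetical helper: the budget is checked before each pull, so the
    buffer may exceed `max_prefetch_bytes` by at most one batch.
    Returns the buffered batches in FIFO order and their total size.
    """
    it = iter(batches)
    buffered: Deque[bytes] = deque()
    total_bytes = 0
    while total_bytes < max_prefetch_bytes:
        batch = next(it, None)
        if batch is None:  # upstream exhausted
            break
        buffered.append(batch)
        total_bytes += len(batch)
    return buffered, total_bytes


def format_log(num_batches: int, total_bytes: int) -> str:
    """Mirror the shape of the log line quoted in the PR description."""
    return (
        f"Prefetched {num_batches} batches ({total_bytes} bytes) "
        "before blocking on GPU lock"
    )


if __name__ == "__main__":
    # Five 100-byte batches with a 250-byte budget (illustrative values only).
    source = iter([b"x" * 100 for _ in range(5)])
    cached, size = prefetch_batches(source, max_prefetch_bytes=250)
    print(format_log(len(cached), size))
    # Consumers (the join threads in the PR) would then pop batches one by one.
    while cached:
        cached.popleft()
```

Using a FIFO `deque` keeps the batches in arrival order for the consumers; a real implementation would also need synchronization between the prefetching thread and the join threads, which this sketch leaves out.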
