marin-ma opened a new pull request, #6727: URL: https://github.com/apache/incubator-gluten/pull/6727
During columnar-to-row (c2r) conversion in the sort-based shuffle, 64k buffers are allocated to hold the row data. A new 64k buffer is allocated when the remaining space cannot hold one more row. However, when a Velox buffer is allocated, its capacity can be about 1.5x the requested size, and the space between size and capacity is never used. This wastes a large amount of memory. In the log below, roughly 32 MiB of a 96 MiB buffer is left unused:

```
W0806 09:47:47.876497 526671 VeloxSortShuffleWriter.cc:323] acquire new buffer. current capacity: 100663200, size: 67108864, pageCursor: 67108861, unused: 33554339
```

This PR sets the size of the allocated buffer to its capacity to minimize the waste:

```
W0806 10:38:23.872462 552313 VeloxSortShuffleWriter.cc:323] acquire new buffer. current capacity: 100663200, size: 100663200, pageCursor: 100663199, unused: 1
```

With this patch there is less spilling when executor memory is low.
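The `unused` figure in both log lines matches `capacity - pageCursor`. A minimal sketch of that arithmetic follows, assuming a hypothetical `ToyBuffer` struct and helper names (`unusedBytes`, `growToCapacity`) that stand in for the real buffer type. It is not the Velox `Buffer` API or the code in `VeloxSortShuffleWriter.cc`.

```cpp
#include <cstdint>

// Hypothetical stand-in for a row buffer; not the Velox Buffer class.
struct ToyBuffer {
  int64_t capacity;  // bytes actually reserved by the allocator
  int64_t size;      // bytes the writer treats as usable
};

// Bytes reserved but never written when the writer moves on to a new buffer.
constexpr int64_t unusedBytes(const ToyBuffer& buf, int64_t pageCursor) {
  return buf.capacity - pageCursor;
}

// The idea behind the patch: treat the whole capacity as usable space.
constexpr ToyBuffer growToCapacity(ToyBuffer buf) {
  buf.size = buf.capacity;
  return buf;
}

// Values copied from the two log lines in the PR description.
static_assert(unusedBytes(ToyBuffer{100663200, 67108864}, 67108861) == 33554339,
              "before the patch: about 32 MiB unused");
static_assert(unusedBytes(ToyBuffer{100663200, 100663200}, 100663199) == 1,
              "after the patch: 1 byte unused");
```

Because the writer only fills a buffer up to `size`, raising `size` to `capacity` lets the cursor advance through the whole allocation before a new buffer is requested.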
