Re: [PR] [GLUTEN-12235][VL] Fix hash shuffle spill ordering when evicting partition buffers [gluten]

via GitHub Thu, 04 Jun 2026 06:24:32 -0700


Copilot commented on code in PR #12231:
URL: https://github.com/apache/gluten/pull/12231#discussion_r3356291571



##########
cpp/velox/shuffle/VeloxHashShuffleWriter.cc:
##########
@@ -1514,22 +1513,33 @@ arrow::Result<int64_t> 
VeloxHashShuffleWriter::evictPartitionBuffersMinSize(int6
     pidToSize.emplace_back(pid, partitionBufferSize_[pid]);
   }
   std::sort(pidToSize.begin(), pidToSize.end(), [&](const auto& a, const auto& 
b) { return a.second > b.second; });
-  if (!pidToSize.empty()) {
-    for (auto& item : pidToSize) {
-      auto pid = item.first;
-      ARROW_ASSIGN_OR_RAISE(auto buffers, assembleBuffers(pid, false));
-      auto* types = partitionWriter_->enableTypeAwareCompress() ? 
&tacBufferTypes_ : nullptr;
-      auto payload = std::make_unique<InMemoryPayload>(
-          item.second, &isValidityBuffer_, schema_, std::move(buffers), 
hasComplexType_, types);
-      metrics_.totalBytesToEvict += payload->rawSize();
-      RETURN_NOT_OK(partitionWriter_->hashEvict(pid, std::move(payload), 
Evict::kSpill, false, writtenBytes_));
-      evicted = beforeEvict - partitionBufferPool_->bytes_allocated();
-      if (evicted >= size) {
-        break;
-      }
+
+  struct PartitionPayload {
+    uint32_t pid;
+    std::unique_ptr<InMemoryPayload> payload;
+  };
+  std::vector<PartitionPayload> selectedPayloads;
+  int64_t selectedBytes = 0;
+  for (auto& item : pidToSize) {
+    auto pid = item.first;
+    ARROW_ASSIGN_OR_RAISE(auto buffers, assembleBuffers(pid, false));

Review Comment:
   `evictPartitionBuffersMinSize()` now assembles buffers and materializes 
`InMemoryPayload`s for all selected partitions before calling `hashEvict()`. 
Since `assembleBuffers()` can allocate extra memory (e.g., complex-type flush 
buffer allocation/reserve) and none of the selected partition buffers are 
released until the second loop, this can increase peak memory usage and raise 
OOM risk compared to the previous per-partition spill loop (which could free 
memory incrementally and stop early). Consider selecting only partition IDs in 
the first pass, then sorting by pid and doing `assembleBuffers()+hashEvict()` 
in pid order so memory can be released as each partition is spilled.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [GLUTEN-12235][VL] Fix hash shuffle spill ordering when evicting partition buffers [gluten]

Reply via email to