zanmato1984 commented on code in PR #40484:
URL: https://github.com/apache/arrow/pull/40484#discussion_r1539793957
##########
cpp/src/arrow/compute/key_hash.cc:
##########
@@ -378,24 +378,43 @@ void Hashing32::HashFixed(int64_t hardware_flags, bool
combine_hashes, uint32_t
}
}
-void Hashing32::HashMultiColumn(const std::vector<KeyColumnArray>& cols,
+Status Hashing32::HashMultiColumn(const std::vector<KeyColumnArray>& cols,
LightContext* ctx, uint32_t* hashes) {
uint32_t num_rows = static_cast<uint32_t>(cols[0].length());
constexpr uint32_t max_batch_size = util::MiniBatch::kMiniBatchLength;
+ const auto alloc_batch_size = std::min(num_rows, max_batch_size);
+
+ // pre calculate alloc size in TempVectorStack for hash_temp_buf,
null_hash_temp_buf
+ // and null_indices_buf
+ const auto alloc_hash_temp_buf =
+ util::TempVectorStack::EstimateAllocSize(alloc_batch_size *
sizeof(uint32_t));
+ const auto alloc_for_null_indices_buf =
+ util::TempVectorStack::EstimateAllocSize(alloc_batch_size *
sizeof(uint16_t));
+ const auto alloc_size = alloc_hash_temp_buf * 2 + alloc_for_null_indices_buf;
+
+ std::shared_ptr<util::TempVectorStack> temp_stack(nullptr);
Review Comment:
Point is you don't really have to set the `temp_stack` pointer into the
`ctx`. Just a regular temp variable will do. So you don't have to clear
`ctx->stack` at the end.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]