Re: [PR] GH-41334: [C++][Acero] Use per-node basis temp vector stack to mitigate overflow [arrow]

via GitHub Mon, 13 May 2024 10:08:35 -0700


zanmato1984 commented on code in PR #41335:
URL: https://github.com/apache/arrow/pull/41335#discussion_r1598791898



##########
cpp/src/arrow/compute/row/compare_internal.h:
##########
@@ -32,6 +32,13 @@ namespace compute {
 
 class ARROW_EXPORT KeyCompare {
  public:
+  // Clarify the max temp stack usage for CompareColumnsToRows so the caller could reserve
+  // enough size in advance.
+  constexpr static int64_t CompareColumnsToRowsTempStackUsage(int64_t num_rows) {
+    return (sizeof(uint8_t) + sizeof(uint8_t) + sizeof(uint8_t)) * num_rows +

Review Comment:
   The three `sizeof(uint8_t)` terms correspond to:
   https://github.com/apache/arrow/blob/a715ea06b71ec206a987d7921264778e9954404b/cpp/src/arrow/compute/row/compare_internal.cc#L342-L347
   I will add more comments on this.
   
   > Also, elsewhere we multiply by kMiniBatchLength but here it seems we are adding kMiniBatchLength. Why the difference?
   
   The idea is to always keep an extra `kMiniBatchLength` bytes to cope with stack
   alignment and padding. Elsewhere we multiply because those allocations are all
   sized in units of `kMiniBatchLength`, but this one is special: it is a multiple
   of `num_rows`, which can be arbitrary.
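
To make the sizing rule concrete, here is a minimal standalone sketch of the formula from the diff. It is not the Arrow code: the constant value `1024` for `kMiniBatchLength` and the helper name `TempStackUsageSketch` are assumptions made for this example only.

```cpp
#include <cstdint>

// Assumed stand-in for util::MiniBatch::kMiniBatchLength; the value 1024 is
// illustrative and not taken from the Arrow headers.
constexpr int64_t kMiniBatchLength = 1024;

// Hypothetical helper mirroring the formula in the diff: three uint8_t temp
// vectors of num_rows elements each, plus a fixed slack that is added (not
// multiplied) to absorb stack alignment and padding.
constexpr int64_t TempStackUsageSketch(int64_t num_rows) {
  return static_cast<int64_t>(sizeof(uint8_t) + sizeof(uint8_t) + sizeof(uint8_t)) *
             num_rows +
         kMiniBatchLength;
}

// The per-row part scales with num_rows; the slack stays constant.
static_assert(TempStackUsageSketch(0) == kMiniBatchLength, "slack only");
static_assert(TempStackUsageSketch(100) == 300 + kMiniBatchLength, "3 bytes per row");
```

Because the function is `constexpr`, a caller could evaluate it at compile time when `num_rows` is a constant, which is the point of exposing the usage up front.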



