zanmato1984 commented on code in PR #41335:
URL: https://github.com/apache/arrow/pull/41335#discussion_r1598791898
##########
cpp/src/arrow/compute/row/compare_internal.h:
##########
@@ -32,6 +32,13 @@ namespace compute {
class ARROW_EXPORT KeyCompare {
public:
+ // Clarify the max temp stack usage for CompareColumnsToRows so the caller could reserve
+ // enough size in advance.
+ constexpr static int64_t CompareColumnsToRowsTempStackUsage(int64_t num_rows) {
+ return (sizeof(uint8_t) + sizeof(uint8_t) + sizeof(uint8_t)) * num_rows +
Review Comment:
The three `sizeof(uint8_t)` terms correspond to:
https://github.com/apache/arrow/blob/a715ea06b71ec206a987d7921264778e9954404b/cpp/src/arrow/compute/row/compare_internal.cc#L342-L347
I will add more comments on this.
> Also, elsewhere we multiply by kMiniBatchLength but here it seems we are
> adding kMiniBatchLength. Why the difference?
The idea is to always keep an extra `kMiniBatchLength` bytes in reserve, to
cope with stack alignment and padding. The other places multiply because
their sizes are all in units of `kMiniBatchLength`, but this one is special:
its size is a multiple of `num_rows`, which can be arbitrary.
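
To make the difference concrete, here is a minimal standalone sketch, not the actual Arrow code. The names `kMiniBatchLengthSketch`, `PerRowUsageSketch` and `PerMiniBatchUsageSketch` are hypothetical, and the value 1024 is an assumption standing in for the real `kMiniBatchLength`. The per-row part follows the three byte-sized terms shown in the diff above; the per-mini-batch part only illustrates the "multiply" pattern used elsewhere.

```cpp
#include <cstdint>

// Hypothetical stand-in for kMiniBatchLength; the value 1024 is assumed.
constexpr int64_t kMiniBatchLengthSketch = 1024;

// Per-row usage: three byte-sized temporaries for every row, plus a fixed
// slack of kMiniBatchLengthSketch bytes for stack alignment and padding.
// num_rows can be arbitrary, so the slack is added rather than multiplied.
constexpr int64_t PerRowUsageSketch(int64_t num_rows) {
  return static_cast<int64_t>(sizeof(uint8_t) + sizeof(uint8_t) +
                              sizeof(uint8_t)) *
             num_rows +
         kMiniBatchLengthSketch;
}

// Per-mini-batch usage (hypothetical): each buffer holds one element per row
// of a mini batch, so its size is naturally a multiple of the batch length.
constexpr int64_t PerMiniBatchUsageSketch(int64_t num_buffers,
                                          int64_t element_size) {
  return num_buffers * element_size * kMiniBatchLengthSketch;
}

// Compile-time checks of the arithmetic under the assumptions above.
static_assert(PerRowUsageSketch(0) == 1024, "slack only");
static_assert(PerRowUsageSketch(1000) == 3 * 1000 + 1024, "rows plus slack");
static_assert(PerMiniBatchUsageSketch(2, 4) == 2 * 4 * 1024, "batch units");
```

A caller would sum such terms to size the temp stack up front. Under these assumptions, the additive slack keeps the per-row term safe for any `num_rows`, including values that are not a multiple of the mini-batch length.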