WillAyd commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1055664267
##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -106,6 +125,19 @@ static void ArraySortFuncBoolBenchmark(benchmark::State& state, const Runner& ru
ArraySortFuncBenchmark(state, runner, values);
}
+template <typename Runner>
+static void ArraySortFuncStringBenchmark(benchmark::State& state, const Runner& runner,
+                                         int64_t min_length, int64_t max_length) {
+ RegressionArgs args(state);
+
+ const int64_t array_size = args.size / sizeof(int64_t);
Review Comment:
My read on the situation is that `RegressionSetArgs` parametrizes the benchmark
with the different CPU cache sizes plus one size that exceeds cache, and that we
are trying to fit the array into each of those sizes. That seems easy enough to
control with the primitive types. I may be misunderstanding, but I don't see how
the variable-length random strings would be guaranteed to fit into that same
cache size, even when sizing by `(min_length + max_length) / 2`, without
querying the size of the array at runtime. It is also unclear to me whether I
need to account for the offsets buffer as part of the cache-bounding
requirement.
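
To make the sizing concern concrete, here is a rough sketch of the arithmetic. It is not code from this PR. All helper names are hypothetical, and it assumes a string array made of a 32-bit offsets buffer with `length + 1` entries plus one contiguous data buffer, with string lengths drawn uniformly from `[min_length, max_length]`. The validity bitmap is ignored.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these names exist in Arrow.
// Assumed layout: int32 offsets buffer with (length + 1) entries, followed by a
// contiguous data buffer. The validity bitmap is ignored. Budgets are assumed
// to be at least kOffsetWidth bytes.
constexpr int64_t kOffsetWidth = sizeof(int32_t);

// Expected footprint in bytes when string lengths average (min + max) / 2.
constexpr int64_t ExpectedBytes(int64_t length, int64_t min_length,
                                int64_t max_length) {
  return kOffsetWidth * (length + 1) + length * (min_length + max_length) / 2;
}

// Footprint in bytes if every string happens to have the maximum length.
constexpr int64_t WorstCaseBytes(int64_t length, int64_t max_length) {
  return kOffsetWidth * (length + 1) + length * max_length;
}

// Largest element count whose *expected* footprint fits in budget_bytes.
// Derived from kOffsetWidth * (n + 1) + n * (min + max) / 2 <= budget_bytes.
constexpr int64_t LengthForExpectedBudget(int64_t budget_bytes, int64_t min_length,
                                          int64_t max_length) {
  return 2 * (budget_bytes - kOffsetWidth) /
         (2 * kOffsetWidth + min_length + max_length);
}

// Largest element count that is *guaranteed* to fit in budget_bytes.
constexpr int64_t LengthForWorstCaseBudget(int64_t budget_bytes,
                                           int64_t max_length) {
  return (budget_bytes - kOffsetWidth) / (kOffsetWidth + max_length);
}
```

Under these assumptions, a 32 KiB budget with lengths in `[0, 16]` gives 2730 elements when sizing by the average, for an expected footprint of 32764 bytes but a worst case of 54604 bytes. Only the worst-case count of 1638 elements is guaranteed to fit, which is why checking the actual buffer sizes at runtime may still be needed.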