WillAyd commented on code in PR #15041:
URL: https://github.com/apache/arrow/pull/15041#discussion_r1055664267


##########
cpp/src/arrow/compute/kernels/vector_sort_benchmark.cc:
##########
@@ -106,6 +125,19 @@ static void ArraySortFuncBoolBenchmark(benchmark::State& state, const Runner& ru
   ArraySortFuncBenchmark(state, runner, values);
 }
 
+template <typename Runner>
+static void ArraySortFuncStringBenchmark(benchmark::State& state, const Runner& runner,
+                                         int64_t min_length, int64_t max_length) {
+  RegressionArgs args(state);
+
+  const int64_t array_size = args.size / sizeof(int64_t);

Review Comment:
   My read is that RegressionSetArgs parametrizes the benchmark with the
different CPU cache sizes, plus one size that exceeds the cache, and that we
are trying to fit the array into each of those sizes. That is easy to control
for primitive types, where the element width is fixed. For variable-length
random strings, though, I may be misunderstanding how the array would be
guaranteed to fit into the same cache size: `(min_length + max_length) / 2`
only estimates the average string length, so the actual size is not known
without querying the array at runtime. It is also unclear whether I need to
count the offsets buffer toward the cache-bounding requirement.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
