sanjibansg commented on code in PR #12755:
URL: https://github.com/apache/arrow/pull/12755#discussion_r851495892
##########
cpp/src/arrow/compute/exec/expression_benchmark.cc:
##########
@@ -69,6 +70,26 @@ static void SimplifyFilterWithGuarantee(benchmark::State& state, Expression filt
}
}
+static void ExecuteScalarExpressionOverhead(benchmark::State& state, Expression expr) {
+ const auto rows_per_batch = static_cast<int32_t>(state.range(0));
+ const auto num_batches = 10000000 / rows_per_batch;
Review Comment:
I wanted to benchmark execution with different batch sizes while keeping the
total amount of data the same. So, if we benchmark on batches of 10 rows each,
we need 1000000 batches, so that in total we still process 10000000 data
points. Keeping the total data fixed, the benchmark shows how execution is
affected when the same amount of data is processed in batches of different
sizes.
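
For illustration, here is a minimal sketch of how such a fixed-total-size
benchmark can be set up with google/benchmark. This is not the PR's actual
code: the function name ExampleFixedTotalBenchmark, the kTotalRows constant,
and the registered range are assumed for the example; only the
total-divided-by-batch-size idea comes from the snippet above.

#include <benchmark/benchmark.h>

// Hypothetical sketch: the total row count is held constant while the
// rows-per-batch argument varies, so every run processes the same amount
// of data split into differently sized batches.
static void ExampleFixedTotalBenchmark(benchmark::State& state) {
  constexpr int64_t kTotalRows = 10000000;  // fixed total data points (assumed)
  const auto rows_per_batch = static_cast<int32_t>(state.range(0));
  const auto num_batches = kTotalRows / rows_per_batch;
  for (auto _ : state) {
    for (int64_t i = 0; i < num_batches; ++i) {
      // ... execute the expression over one batch of rows_per_batch rows ...
      benchmark::DoNotOptimize(i);
    }
  }
  state.SetItemsProcessed(state.iterations() * kTotalRows);
}
// Registering with rows_per_batch = 10 yields 1000000 batches; with
// rows_per_batch = 10000000, a single batch. The total data stays the same.
BENCHMARK(ExampleFixedTotalBenchmark)->RangeMultiplier(10)->Range(10, 10000000);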
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]