bkietz commented on a change in pull request #8680:
URL: https://github.com/apache/arrow/pull/8680#discussion_r534359769
##########
File path: cpp/src/arrow/util/thread_pool_benchmark.cc
##########
@@ -103,8 +103,40 @@ static void ThreadPoolSpawn(benchmark::State& state) {
state.SetItemsProcessed(state.iterations() * nspawns);
}
+// Benchmark ThreadPool::Submit
+static void ThreadPoolSubmit(benchmark::State& state) {  // NOLINT non-const reference
+ const auto nthreads = static_cast<int>(state.range(0));
+ const auto workload_size = static_cast<int32_t>(state.range(1));
+
+ Workload workload(workload_size);
+
+ const int32_t nspawns = 10000000 / workload_size + 1;
+
+ for (auto _ : state) {
+ state.PauseTiming();
+ auto pool = *ThreadPool::Make(nthreads);
+ std::atomic<int32_t> n_finished{0};
+ state.ResumeTiming();
+
+ for (int32_t i = 0; i < nspawns; ++i) {
+ // Pass the task by reference to avoid copying it around
+ (void)DeferNotOk(pool->Submit(std::ref(workload))).Then([&](...) {
Review comment:
I agree it warrants improvement and individual benchmarking, but my
intent here was to measure the end-to-end cost of using Submit + callbacks,
since that's what's relevant to the CSV parsing case.
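For reference, a minimal sketch of the Submit + callback pattern this benchmark exercises. The hunk above is truncated, so the `Then` continuation body and the completion wait are assumptions; `Workload` and `SubmitWithCallbacks` below are simplified, hypothetical stand-ins rather than the benchmark's actual helpers:

```cpp
// Sketch only, not the exact benchmark code.
#include <atomic>
#include <functional>

#include "arrow/util/future.h"
#include "arrow/util/thread_pool.h"

using arrow::DeferNotOk;
using arrow::internal::ThreadPool;

// Stand-in for the benchmark's Workload helper.
struct Workload {
  void operator()() const { /* simulated CPU work */ }
};

void SubmitWithCallbacks(int nthreads, int32_t nspawns) {
  auto pool = *ThreadPool::Make(nthreads);
  Workload workload;
  std::atomic<int32_t> n_finished{0};

  for (int32_t i = 0; i < nspawns; ++i) {
    // Submit returns Result<Future<...>>; DeferNotOk unwraps it into a Future,
    // and Then attaches a continuation that fires when the task completes.
    (void)DeferNotOk(pool->Submit(std::ref(workload))).Then([&](...) {
      n_finished.fetch_add(1, std::memory_order_relaxed);
    });
  }

  // Busy-wait until every callback has run (assumed completion strategy).
  while (n_finished.load() != nspawns) {
  }
}
```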
##########
File path: cpp/src/arrow/util/thread_pool_benchmark.cc
##########
@@ -136,21 +168,24 @@ static void ThreadedTaskGroup(benchmark::State& state) {
for (auto _ : state) {
auto task_group = TaskGroup::MakeThreaded(pool.get());
- for (int32_t i = 0; i < nspawns; ++i) {
- // Pass the task by reference to avoid copying it around
- task_group->Append(std::ref(task));
- }
+ task_group->Append([&task, nspawns, task_group] {
Review comment:
Holdover from an earlier revision; will clean it up.