dianaclarke commented on a change in pull request #11766:
URL: https://github.com/apache/arrow/pull/11766#discussion_r756324865

##########
File path: dev/archery/archery/cli.py
##########
@@ -494,7 +494,12 @@ def benchmark_run(ctx, rev_or_path, src, preserve, output, cmake_extras,
                 repetitions=repetitions, benchmark_filter=benchmark_filter)
 
-        json.dump(runner_base, output, cls=JsonEncoder)
+        # XXX for some weird reason, running the benchmarks is coupled
+        # with JSON-encoding their results, so need to run `json` on
+        # the benchmark runner even when no JSON output is requested.

Review comment:

> XXX for some weird reason

It took me a while to figure it out, but:

- It looks like the benchmarks aren't actually executed until an attempt to serialize them is made.

That serialization is triggered by the custom encoder `JsonEncoder` used here:

- `json.dump(runner_base, output, cls=JsonEncoder)`
- https://github.com/apache/arrow/blob/master/dev/archery/archery/benchmark/codec.py

Before that, the variable `runner_base` hasn't done any real work; it has only been instantiated and is waiting to do some work, and the `JsonEncoder` drives that work.

```
runner_base = CppBenchmarkRunner.from_rev_or_path(
    src, root, rev_or_path, conf,
    repetitions=repetitions,
    suite_filter=suite_filter, benchmark_filter=benchmark_filter)
```

This is why you can't just do the following. As you probably already noted, this will produce neither the text table output nor the JSON output (if `--output` isn't provided).

```
if output is not None:
    json.dump(runner_base, output, cls=JsonEncoder)
```

Hope that helps.
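
For illustration, here is a minimal, self-contained sketch of the coupling described in the review comment. It is not archery's code: `LazyRunner`, `LazyEncoder`, and `run` are hypothetical names, and the "benchmark" is a placeholder computation. It only shows how an object can stay idle until a custom `json.JSONEncoder` asks for its results, and one possible way to trigger that work when no output file is given (not necessarily the approach taken in this PR).

```python
import io
import json


class LazyRunner:
    """Hypothetical stand-in for a benchmark runner: construction is cheap."""

    def __init__(self, names):
        self.names = names
        self.executed = False

    @property
    def suites(self):
        # The (placeholder) expensive work happens only when results are requested.
        self.executed = True
        return [{"name": n, "value": len(n)} for n in self.names]


class LazyEncoder(json.JSONEncoder):
    """Hypothetical encoder: serializing a runner is what pulls its results."""

    def default(self, o):
        if isinstance(o, LazyRunner):
            return {"suites": o.suites}
        return super().default(o)


def run(runner, output=None):
    # Always encode so the work is triggered; when no output file was
    # requested, write into a throwaway in-memory buffer instead.
    sink = output if output is not None else io.StringIO()
    json.dump(runner, sink, cls=LazyEncoder)
    return runner
```

With this shape, guarding the `json.dump` call behind `if output is not None:` would leave `executed` as `False` whenever no output file is passed, which mirrors the behaviour described above.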