Thanks for those great instructions, QP! I've spiked adding the arrow-rs and arrow-datafusion benchmark results to the Arrow Conbench server.
https://github.com/ursacomputing/benchmarks/pull/79/files

And I've published one example run for arrow-datafusion:

- Example run:
  https://conbench.ursa.dev/runs/62098d2f86314339a696f8c48c4ce2e7/

- Example benchmark from run:
  https://conbench.ursa.dev/benchmarks/aa841cdf78ef4e38832e075b8485ec59/

Some observations along the way:

- Criterion results are in nanoseconds, but the smallest unit Conbench
currently speaks is seconds (because Conbench was initially built for
macro, not micro, benchmarking). I suspect most places in Conbench
would work just fine if nanoseconds were passed in, but I need to
audit the code for any place that assumes seconds when the benchmark
isn't a throughput benchmark.

- If the Criterion benchmarks were named better, I could tag them
better in Conbench. For example, I suspect sqrt_20_12, sqrt_20_9,
sqrt_22_12, and sqrt_22_14 are parameterized variations of the same
benchmark. If they were named something like "sqrt, foo=20, bar=12",
I could batch them together and tag their parameters so that Conbench
would automatically graph them in relation to each other. I was sort
of able to do this with the following benchmarks, because their names
follow a machine-readable pattern:

  - window partition by, u64_narrow, aggregate functions
  - window partition by, u64_narrow, built-in functions
  - window partition by, u64_wide, aggregate functions
  - window partition by, u64_wide, built-in functions

  Anyhoo, that's easy enough to do down the road as a last integration
  step, and the Criterion docs have their own recommendations for how
  to do this (see the first sketch below).

- While Criterion benchmarks can also measure throughput in some
cases, all the arrow-datafusion benchmarks measured elapsed time (I'm
not sure about the arrow-rs benchmarks), so I didn't bother writing
code to support potential throughput results from
arrow-datafusion/arrow-rs. We may need to revisit that (see the second
sketch below).

- We probably want to add some additional context, like the
arrow-rs/arrow-datafusion version, the Rust version, any compiler
flags, etc.
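For concreteness, here's roughly what the Criterion docs' recommended
parameterized-naming pattern looks like. Caveat: I don't know Rust, so
this is just an untested sketch pieced together from the Criterion
book, and the "sqrt"/"foo"/"bar" names are the made-up example from
above, not real arrow-datafusion benchmarks:

    use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion};

    fn bench_sqrt(c: &mut Criterion) {
        // One benchmark group per logical benchmark. Criterion then names
        // each result "sqrt/foo=20, bar=12", etc., which is machine
        // readable, so Conbench could batch the variations together and
        // tag the parameters.
        let mut group = c.benchmark_group("sqrt");
        for (foo, bar) in [(20u64, 12u64), (20, 9), (22, 12), (22, 14)] {
            group.bench_with_input(
                BenchmarkId::from_parameter(format!("foo={}, bar={}", foo, bar)),
                &(foo, bar),
                |b, &(foo, bar)| b.iter(|| ((foo * bar) as f64).sqrt()),
            );
        }
        group.finish();
    }

    criterion_group!(benches, bench_sqrt);
    criterion_main!(benches);

If I'm reading the docs right, the group/parameter split also carries
through to the target/criterion/ output layout QP describes below,
which is exactly the kind of structure I can parse on the Conbench
side.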
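And in case we do want throughput results down the road, this is how I
understand a Criterion benchmark would declare them (again an untested
sketch with made-up names). Declaring a throughput just makes Criterion
report elements/second or bytes/second alongside elapsed time:

    use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};

    fn bench_sum(c: &mut Criterion) {
        let data = vec![1u64; 1_000_000];
        let mut group = c.benchmark_group("sum");
        // Tell Criterion how many elements each iteration processes so it
        // can report a rate in addition to the elapsed time.
        group.throughput(Throughput::Elements(data.len() as u64));
        group.bench_with_input(BenchmarkId::from_parameter(data.len()), &data, |b, d| {
            b.iter(|| d.iter().sum::<u64>())
        });
        group.finish();
    }

    criterion_group!(benches, bench_sum);
    criterion_main!(benches);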
PS. I wasn't able to get the arrow-rs benchmarks to run, but it sounds
like they are very similar to the arrow-datafusion benchmarks. I don't
know Rust, so I may reach out on Zulip (or wherever the Arrow Rust
folks talk) for help building arrow-rs.

    $ cargo bench
    unresolved imports `parquet::util::DataPageBuilder`,
    `parquet::util::DataPageBuilderImpl`, `parquet::util::InMemoryPageIterator`

There are still the orchestration steps to work on, but all in all
this seems very doable. I just need to negotiate some time with my
day job.

Cheers,

--diana

On Sat, Sep 11, 2021 at 4:35 PM QP Hou <houqp....@gmail.com> wrote:
>
> Thanks Diana a lot for offering to help. Please see my replies inline below.
>
> On Sat, Sep 11, 2021 at 8:37 AM Diana Clarke
> <diana.joan.cla...@gmail.com> wrote:
> > If you point me to the existing benchmarks for each project and
> > instructions on how to execute them, I can let you know the easiest
> > integration path.
>
> For arrow-datafusion, you just need to install the Rust toolchain
> using `rustup` [1], then run the `cargo bench` command within the
> project root. Benchmark results will be saved under the
> `/target/criterion/BENCH_NAME/new` folder as a raw.csv file. You can
> read more about the convention at
> https://bheisler.github.io/criterion.rs/book/user_guide/csv_output.html.
>
> For arrow-rs, it's the exact same setup.
>
> We have some extra TPCH integration benchmarks in datafusion, but I
> think we can work on integrating them later. Getting the basic
> criterion benchmarks into conbench would already be a huge win for us.
>
> [1]: https://rustup.rs
>
> > If the arrow-rs benchmarks are executable from the command line and
> > return parsable results (like JSON), it should be pretty easy to
> > publish the results.
>
> By default, results are saved as CSV files, but you can pass in a
> `--message-format=json` argument to save the results as JSON files
> instead.
>
> > - The arrow-rs and arrow-datafusion GitHub repositories must use
> > squash merges (or Conbench would have to be extended to understand
> > the two other GitHub merge methods).
>
> Yes, we are using squash merge for both repos.
>
> > - I'm not sure what the security implications are with respect to
> > adding our ursabot integration and buildkite hooks to other
> > repositories.
>
> Are you concerned about security from ursabot and buildkite's point
> of view? If so, who should we reach out to in order to discuss this
> matter?
>
> Thanks,
> QP