Thank you, Diana, for the quick turnaround! The trial run looks great. You are right that `sqrt_20_12`, `sqrt_20_9`, `sqrt_22_12`, and `sqrt_22_14` are just the same type of test with different parameters, so it makes sense to batch them. We can name these benchmarks however we want to make it easier for conbench to parse the metadata.
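For example, one option (just a sketch; the `sqrt_workload` body below is a placeholder rather than the real benchmark) is to register each parameter combination through Criterion's `BenchmarkId`, which reports names like `sqrt/foo=20, bar=12`:

    use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion};

    // Placeholder for whatever the real sqrt_* benchmarks actually measure.
    fn sqrt_workload(foo: u64, bar: u64) -> f64 {
        ((foo * bar) as f64).sqrt()
    }

    fn sqrt_benches(c: &mut Criterion) {
        let mut group = c.benchmark_group("sqrt");
        // One entry per parameter combination; BenchmarkId keeps the parameters
        // in the reported name, e.g. "sqrt/foo=20, bar=12".
        for (foo, bar) in [(20, 12), (20, 9), (22, 12), (22, 14)] {
            group.bench_with_input(
                BenchmarkId::from_parameter(format!("foo={}, bar={}", foo, bar)),
                &(foo, bar),
                |b, &(foo, bar)| b.iter(|| sqrt_workload(black_box(foo), black_box(bar))),
            );
        }
        group.finish();
    }

    criterion_group!(benches, sqrt_benches);
    criterion_main!(benches);

That is essentially what the Criterion user guide recommends for parameterized benchmarks, so the reported names should stay consistent with their conventions.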
I am able to reproduce the error you got with arrow-rs. I have filed https://github.com/apache/arrow-rs/issues/770 to track the build issue. In the meantime, you can work around it by running the `cargo bench` command within the arrow folder to avoid building the benchmarks from the parquet crate.

> - We probably want to add some additional context, like the arrow-rs/arrow-datafusion version, rust version, any compiler flags, etc.

I agree. The Rust version can be obtained through `rustc --version`. We are not setting any special compiler flags at the moment, but that would definitely be useful to include if we plan to add any in the future.

Please don't hesitate to reach out if there is anything we can do to help unblock you. For synchronous communication, most of the Rust developers are active in the Apache Slack workspace's #arrow-rust channel.

On Sun, Sep 12, 2021 at 12:11 PM Diana Clarke <diana.joan.cla...@gmail.com> wrote:
>
> Thanks for those great instructions, QP!
>
> I've spiked adding the arrow-rs and arrow-datafusion benchmark results to the Arrow Conbench server.
>
> https://github.com/ursacomputing/benchmarks/pull/79/files
>
> And I've published one example run for arrow-datafusion:
>
> - Example run: https://conbench.ursa.dev/runs/62098d2f86314339a696f8c48c4ce2e7/
> - Example benchmark from run: https://conbench.ursa.dev/benchmarks/aa841cdf78ef4e38832e075b8485ec59/
>
> Some observations along the way:
>
> - Criterion results are in nanoseconds, but the smallest unit Conbench currently speaks is seconds (because Conbench was initially for macro, not micro, benchmarking). I suspect most places in Conbench would work just fine if nanoseconds were passed in, but I need to audit the code for any places that assume seconds when it isn't a throughput benchmark.
>
> - If the Criterion benchmarks were named better, I could tag them better in Conbench. For example, I suspect sqrt_20_12, sqrt_20_9, sqrt_22_12, and sqrt_22_14 are parameterized variations of the same benchmark, and if they were named something like "sqrt, foo=20, bar=12", I could batch them together & tag their parameters so that Conbench would automatically graph them in relation to each other. I was sort of able to do this with the following benchmarks (because there was a machine-readable pattern). Anyhoo, that's easy enough to do down the road as a last integration step, and it does appear from the Criterion docs that they have their own recommendations for how to do this.
>
>   - window partition by, u64_narrow, aggregate functions
>   - window partition by, u64_narrow, built-in functions
>   - window partition by, u64_wide, aggregate functions
>   - window partition by, u64_wide, built-in functions
>
> - While Criterion benchmarks can also measure throughput in some cases, all the arrow-datafusion benchmarks were in elapsed time (not sure about the arrow-rs benchmarks), so I didn't bother writing code to support potential throughput results from arrow-datafusion/arrow-rs, but we may need to revisit that.
>
> - We probably want to add some additional context, like the arrow-rs/arrow-datafusion version, rust version, any compiler flags, etc.
>
> PS. I wasn't able to get the arrow-rs benchmarks to run, but it sounds like they are very similar to the arrow-datafusion benchmarks. I don't know Rust, so I may reach out on Zulip (or wherever Arrow Rust folks talk) for help building arrow-rs.
>
> $ cargo bench
> unresolved imports `parquet::util::DataPageBuilder`,
> `parquet::util::DataPageBuilderImpl`,
> `parquet::util::InMemoryPageIterator`
>
> There are still the orchestration steps that need to be worked on, but all in all this seems very doable. I just need to negotiate some time with my day job.
>
> Cheers,
>
> --diana
>
> On Sat, Sep 11, 2021 at 4:35 PM QP Hou <houqp....@gmail.com> wrote:
> >
> > Thanks a lot, Diana, for offering to help. Please see my replies inline below.
> >
> > On Sat, Sep 11, 2021 at 8:37 AM Diana Clarke <diana.joan.cla...@gmail.com> wrote:
> > > If you point me to the existing benchmarks for each project and instructions on how to execute them, I can let you know the easiest integration path.
> > >
> >
> > For arrow-datafusion, you just need to install the Rust toolchain using `rustup` [1], then run the `cargo bench` command within the project root. Benchmark results will be saved under the `/target/criterion/BENCH_NAME/new` folder as a raw.csv file. You can read more about the convention at https://bheisler.github.io/criterion.rs/book/user_guide/csv_output.html.
> >
> > For arrow-rs, it's the exact same setup.
> >
> > We have some extra TPCH integration benchmarks in datafusion, but I think we can work on integrating them later. Getting the basic criterion benchmarks into conbench would already be a huge win for us.
> >
> > [1]: https://rustup.rs
> >
> > > If the arrow-rs benchmarks are executable from the command line and return parsable results (like JSON), it should be pretty easy to publish the results.
> > >
> >
> > By default, results are saved as CSV files, but you can pass in a `--message-format=json` argument to save the results as JSON files instead.
> >
> > > - The arrow-rs and arrow-datafusion GitHub repositories must use squash merges (or Conbench would have to be extended to understand the 2 other GitHub merge methods).
> >
> > Yes, we are using squash merges for both repos.
> >
> > > - I'm not sure what the security implications are with respect to adding our ursabot integration and buildkite hooks to other repositories.
> > >
> >
> > Are you concerned about security from ursabot and buildkite's point of view? If so, who should we reach out to in order to discuss this matter?
> >
> > Thanks,
> > QP