Hi Renjie,

If by DataFusion benchmarks you are referring to the code in the datafusion/benches folder, then those benchmarks are executed with the tokio runtime.
You are correct that compute-bound tasks should be scheduled onto a dedicated thread, separate from the async runtime, so that they do not block the runtime's own threads. This practice applies not just to tokio but to async runtimes in general.

The tokio runtime used in the benchmarks is created with `tokio::runtime::Runtime::new()`. Tokio is pulled into datafusion/Cargo.toml with the `rt-multi-thread` feature flag, so I believe it creates the runtime with the multi-thread scheduler by default. I don't think it matters that much for the benchmarks though, because the benchmark code calls `Runtime::block_on` when executing the async query code.

On Sat, Sep 11, 2021 at 7:38 PM Renjie Liu <liurenjie2...@gmail.com> wrote:
>
> Hi, all:
> I see that the executor trait is marked as async/await in method
> definition. I have several questions:
> 1. What async/await runtime is used in benchmarking?
> 2. Tokio is the most popular async/await runtime, and they suggest to put
> long running tasks in separate thread pool rather than using tokio runtime
> directly, and you can find this here <https://docs.rs/tokio/1.11.0/tokio/>
>
> > If your code is CPU-bound and you wish to limit the number of threads used
> > to run it, you should run it on another thread pool such as rayon
> > <https://docs.rs/rayon>.
>
> So my second question is did you test against thread pool execution mode?
>
> It would be highly appreciated if you can answer my question.
> --
> Renjie Liu
> Software Engineer, MVAD
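
P.S. To illustrate the dedicated-thread point, here is a minimal sketch that uses only the Rust standard library. It is not DataFusion or benchmark code: `sum_of_squares` and `offload` are hypothetical names made up for this example. The idea is simply that the CPU-bound job runs on its own OS thread and hands the result back over a channel, so the calling thread is not tied up computing. In tokio itself, `tokio::task::spawn_blocking` serves a similar purpose.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical CPU-bound job used only for illustration.
fn sum_of_squares(n: u64) -> u64 {
    (1..=n).map(|x| x * x).sum()
}

/// Runs `job` on a dedicated OS thread and returns a receiver for its result,
/// so the caller does not spend its own thread on the computation.
fn offload<T, F>(job: F) -> mpsc::Receiver<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // A send error only means the caller dropped the receiver; ignore it.
        let _ = tx.send(job());
    });
    rx
}

fn main() {
    let pending = offload(|| sum_of_squares(1_000));
    // The calling thread is free to do other work here before collecting.
    let result = pending.recv().expect("worker thread panicked");
    println!("sum of squares = {}", result);
}
```

An async runtime would await the result instead of calling the blocking `recv`, but the division of labour is the same.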