Re: [DataFusion] Question about async/await?

QP Hou Sun, 12 Sep 2021 21:15:39 -0700

Hi Renjie,

If by datafusion benchmarks you mean the code in the
datafusion/benches folder, then those benchmarks are executed with
the tokio runtime.

You are correct that compute-bound work should be moved into a
separate task managed by a dedicated thread, so that it does not
block the threads that drive the async runtime. This practice applies
not just to tokio, but to async runtimes in general.
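
A minimal, runtime-agnostic sketch of that pattern, using only the Rust
standard library. It is not taken from the DataFusion code base:
`sum_of_squares` and `offload` are hypothetical names made up for
illustration. The CPU-bound job runs on its own thread and hands the
result back over a channel.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical CPU-bound job: sum of i*i for every i in 0..n.
fn sum_of_squares(n: u64) -> u64 {
    (0..n).map(|i| i * i).sum()
}

// Runs the job on a dedicated thread and returns a receiver on which
// the result will arrive, so the calling thread is not tied up.
fn offload(n: u64) -> mpsc::Receiver<u64> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // A send error only means the caller dropped the receiver,
        // so it is safe to ignore here.
        let _ = tx.send(sum_of_squares(n));
    });
    rx
}

fn main() {
    let rx = offload(1_000);
    // The caller is free to do other work here before collecting the result.
    let total = rx.recv().expect("worker thread dropped the sender");
    println!("sum of squares below 1000 = {total}");
}
```

Inside an async runtime the caller would await the result instead of
blocking on `recv`, for example through the runtime's own blocking-task
facility or an async channel.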

The tokio runtime used in the benchmarks is created with
`tokio::runtime::Runtime::new()`. Tokio is pulled into
datafusion/Cargo.toml with the `rt-multi-thread` feature flag, so I
believe it creates the runtime with the multi-thread scheduler by
default. I don't think it matters that much for the benchmarks
though, because the benchmark code calls `Runtime::block_on` when
executing the async query code.

On Sat, Sep 11, 2021 at 7:38 PM Renjie Liu <liurenjie2...@gmail.com> wrote:
>
> Hi, all:
> I see that the executor trait is marked as async/await in method
> definition. I have several questions:
> 1. What async/await runtime is used in benchmarking?
> 2. Tokio is the most popular async/await runtime, and they suggest to put
> long running tasks in separate thread pool rather than using tokio runtime
> directly, and you can find this here <https://docs.rs/tokio/1.11.0/tokio/>
>
> > If your code is CPU-bound and you wish to limit the number of threads used
> > to run it, you should run it on another thread pool such as rayon
> > <https://docs.rs/rayon>.
> >
> So my second question is did you test against thread pool execution mode?
>
> It would be highly appreciated if you can answer my question.
> --
> Renjie Liu
> Software Engineer, MVAD
