1) As a rule of thumb I would prefer `async_scheduler`.  It's more
feature-rich and simpler to use, and it is meant to handle "long running"
tasks (e.g. tens to hundreds of milliseconds or more).

The plain `scheduler` is a bit more complex and is intended for very
fine-grained scheduling.  It's currently only used in a few nodes (I think
the hash-join and the hash-group-by) for things like building the hash
table after the build data has been accumulated.

2) Neither scheduler manages threads.  Both of them rely on the executor in
ExecContext::executor().  The plain scheduler takes a "schedule task
callback", which it expects to do the actual executor submission.  The async
scheduler uses futures and virtual classes.  There, a "task" is a callable
that returns a future, and that future completes when the task is complete.
Most of the time this is done by submitting something to an executor (in
return for a future).  Sometimes this is done indirectly, for example, by
making an async I/O call (which under the hood is usually implemented by
submitting something to the I/O executor).
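
To make the difference concrete, here is a rough sketch of the two styles
using only the C++ standard library.  None of this is Arrow's actual API:
`SubmitToExecutor`, `ToyAsyncScheduler`, `ToyTaskScheduler` and `RunExample`
are made-up names, and `std::async` is just a stand-in for the executor.
The point is only that neither toy scheduler owns a thread; both hand the
work to something else.

```cpp
#include <atomic>
#include <functional>
#include <future>
#include <utility>
#include <vector>

// Hypothetical stand-in for an executor: runs fn on another thread and
// returns a future that completes when fn has finished.
std::future<void> SubmitToExecutor(std::function<void()> fn) {
  return std::async(std::launch::async, std::move(fn));
}

// Async-scheduler style: a task is a callable that returns a future.
using AsyncTask = std::function<std::future<void>()>;

class ToyAsyncScheduler {
 public:
  // Calling the task starts the work; the scheduler only keeps the future.
  void AddTask(AsyncTask task) { pending_.push_back(task()); }

  // Block until every task's future has completed.
  void WaitAll() {
    for (auto& f : pending_) f.get();
    pending_.clear();
  }

 private:
  std::vector<std::future<void>> pending_;
};

// Plain-scheduler style: the scheduler is given a "schedule task callback"
// and relies on it to do the actual executor submission.
using ScheduleCallback = std::function<void(std::function<void()>)>;

class ToyTaskScheduler {
 public:
  explicit ToyTaskScheduler(ScheduleCallback schedule)
      : schedule_(std::move(schedule)) {}

  // Split one job into num_tasks small pieces and schedule each piece.
  void RunTaskGroup(int num_tasks, std::function<void(int)> body) {
    for (int i = 0; i < num_tasks; ++i) {
      schedule_([body, i] { body(i); });
    }
  }

 private:
  ScheduleCallback schedule_;
};

// Runs 3 future-returning tasks and 4 callback-scheduled tasks, and
// returns how many task bodies ran.
int RunExample() {
  std::atomic<int> done{0};

  ToyAsyncScheduler async_scheduler;
  for (int i = 0; i < 3; ++i) {
    async_scheduler.AddTask(
        [&done] { return SubmitToExecutor([&done] { ++done; }); });
  }
  async_scheduler.WaitAll();

  std::vector<std::future<void>> futures;
  ToyTaskScheduler scheduler([&futures](std::function<void()> fn) {
    futures.push_back(SubmitToExecutor(std::move(fn)));
  });
  scheduler.RunTaskGroup(4, [&done](int) { ++done; });
  for (auto& f : futures) f.get();

  return done.load();
}
```

In the first half the task itself decides how to produce its future (here
by submitting to the stand-in executor, but it could just as well come from
an async I/O call).  In the second half the scheduler never sees a future
at all; it only knows how to call the callback it was given.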

On Tue, Jul 25, 2023 at 2:56 PM Li Jin <[email protected]> wrote:

> Hi,
>
> I am reading Acero and got confused about the use of
> QueryContext::scheduler() and QueryContext::async_scheduler(). So I have a
> couple of questions:
>
> (1) What are the different purposes of these two?
> (2) Does scheduler/async_scheduler own any threads inside their respective
> classes or do they use the thread pool from ExecContext::executor()?
>
> Thanks,
> Li
>
