Hi,

Some thoughts:
1. For async IO, the system must have threads that service the callbacks
quickly; otherwise the S3/GCS end will close the connection. A single
thread pool where every thread is busy with an expensive compute operation
(like CSV decoding or regex matching) can starve the IO.
2. When dealing with S3/GCS, the HTTPS (TLS) decryption is quite expensive
even with AES intrinsics. We routinely see 3-4 cores fully busy when the
compute part is not demanding. Multiple IO threads are needed.
3. The time to first byte for S3/GCS and the download rate are often much
slower than the processing rate of the compute functions -- e.g., when you
can use Parquet stats to discard row groups. System throughput therefore
increases when several requests are pre-fetched (see the sketch after this
list). At work we have an async pipeline with a fixed memory allocation
per IO stream. (We also have the luxury of knowing the file size ahead of
time...)
4. Issuing too many concurrent requests to GCS/S3 at once can lead to
contention within OpenSSL. I recently measured the same total download
time whether using 4 threads or 10 threads, with 500 or 5000 concurrent
requests to GCS.
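
To make point 3 concrete, here is a minimal sketch of such a pipeline. The
names (PrefetchQueue, Chunk) are made up for illustration -- this is not an
existing API. The idea: a bounded queue whose capacity is the per-stream
memory budget divided by the chunk size, so IO threads can run ahead of the
compute threads without unbounded buffering:

    // Bounded prefetch queue: IO threads Push() downloaded chunks, compute
    // threads Pop() them.  The capacity enforces the per-stream memory
    // budget.  C++17, standard library only.
    #include <condition_variable>
    #include <cstddef>
    #include <mutex>
    #include <optional>
    #include <queue>
    #include <string>
    #include <utility>

    struct Chunk { std::string data; };  // one downloaded byte range

    class PrefetchQueue {
     public:
      // capacity = per-stream memory budget / chunk size
      explicit PrefetchQueue(std::size_t capacity) : capacity_(capacity) {}

      // Called by an IO thread; blocks only when the budget is used up.
      void Push(Chunk chunk) {
        std::unique_lock<std::mutex> lock(mu_);
        not_full_.wait(lock,
                       [&] { return queue_.size() < capacity_ || done_; });
        if (done_) return;  // stream was closed; drop the chunk
        queue_.push(std::move(chunk));
        not_empty_.notify_one();
      }

      // Called by a compute thread; nullopt means the stream is finished.
      std::optional<Chunk> Pop() {
        std::unique_lock<std::mutex> lock(mu_);
        not_empty_.wait(lock, [&] { return !queue_.empty() || done_; });
        if (queue_.empty()) return std::nullopt;
        Chunk chunk = std::move(queue_.front());
        queue_.pop();
        not_full_.notify_one();
        return chunk;
      }

      void Close() {  // producer is done (or the stream is cancelled)
        std::lock_guard<std::mutex> lock(mu_);
        done_ = true;
        not_empty_.notify_all();
        not_full_.notify_all();
      }

     private:
      std::mutex mu_;
      std::condition_variable not_full_, not_empty_;
      std::queue<Chunk> queue_;
      std::size_t capacity_;
      bool done_ = false;
    };

Since the file size is known up front, the IO side can issue range requests
for every chunk immediately and let the queue throttle how many are
actually buffered.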

My suggestion would be two thread pools: one for IO, with few threads and
higher priority, and a second compute thread pool with as many threads as
there are cores. All the nice features documented in Tokio apply to the
compute thread pool. The total number of outstanding IO requests should be
limited to avoid thrashing.
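
Roughly, in code (a sketch only -- ThreadPool and Scheduler below are
invented for illustration, not Arrow's classes, and raising the OS priority
of the IO threads is platform-specific, e.g. pthread_setschedparam, so it
is left out):

    #include <condition_variable>
    #include <cstddef>
    #include <functional>
    #include <mutex>
    #include <queue>
    #include <thread>
    #include <utility>
    #include <vector>

    // A minimal fixed-size thread pool.
    class ThreadPool {
     public:
      explicit ThreadPool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
          workers_.emplace_back([this] {
            for (;;) {
              std::function<void()> task;
              {
                std::unique_lock<std::mutex> lock(mu_);
                cv_.wait(lock, [&] { return stop_ || !tasks_.empty(); });
                if (stop_ && tasks_.empty()) return;
                task = std::move(tasks_.front());
                tasks_.pop();
              }
              task();
            }
          });
        }
      }
      ~ThreadPool() {
        { std::lock_guard<std::mutex> lock(mu_); stop_ = true; }
        cv_.notify_all();
        for (auto& t : workers_) t.join();
      }
      void Submit(std::function<void()> task) {
        { std::lock_guard<std::mutex> lock(mu_); tasks_.push(std::move(task)); }
        cv_.notify_one();
      }
     private:
      std::vector<std::thread> workers_;
      std::queue<std::function<void()>> tasks_;
      std::mutex mu_;
      std::condition_variable cv_;
      bool stop_ = false;
    };

    // Two pools: a small IO pool and a compute pool sized to the machine.
    // io_slots_ caps the number of outstanding IO requests (point 4 above).
    class Scheduler {
     public:
      Scheduler(std::size_t io_threads, std::size_t max_outstanding_io)
          : io_pool_(io_threads),
            compute_pool_(std::thread::hardware_concurrency()),
            io_slots_(max_outstanding_io) {}

      void SubmitIo(std::function<void()> task) {
        AcquireIoSlot();  // blocks while too many requests are in flight
        io_pool_.Submit([this, task = std::move(task)] {
          task();
          ReleaseIoSlot();
        });
      }
      void SubmitCompute(std::function<void()> task) {
        compute_pool_.Submit(std::move(task));
      }

     private:
      void AcquireIoSlot() {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [&] { return io_slots_ > 0; });
        --io_slots_;
      }
      void ReleaseIoSlot() {
        { std::lock_guard<std::mutex> lock(mu_); ++io_slots_; }
        cv_.notify_one();
      }
      ThreadPool io_pool_;
      ThreadPool compute_pool_;
      std::size_t io_slots_;
      std::mutex mu_;
      std::condition_variable cv_;
    };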

Presumably the thread pools can be hidden behind an implementation that
accepts IO tasks and compute tasks.
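
For instance, with the hypothetical Scheduler sketched above, callers hand
over tasks and never see the pools themselves:

    Scheduler sched(/*io_threads=*/4, /*max_outstanding_io=*/64);
    sched.SubmitIo([] { /* issue one range GET against S3/GCS */ });
    sched.SubmitCompute([] { /* decode one CSV block */ });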

My 2 bits,

Cheers, Pierre

On Tue, Sep 29, 2020 at 3:31 AM Weston Pace <weston.p...@gmail.com> wrote:

> Antoine/Wes, thanks for the input.  I will focus on the CSV reader and
> the minimal async needed to get I/O off the thread pool and to support
> a nested task group.  This is just to focus on one small thing at
> a time.  I'll avoid any scheduler work for now but maybe can look at
> that in the future.
>
> As for your feedback, I think #3 (adding items to the end of the
> thread pool) could also be mitigated if a promise executed its
> callbacks directly (instead of submitting them as new tasks).  There
> is a bit of a "max recursion" case that has to be looked after
> (similar to what Antoine mentioned) but it could be handled.  I may
> experiment with that some.  The Tokio article you posted also talked
> about this (keeping a spot open for the last thing scheduled and
> running that if possible).
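>
> Roughly the idea (a standalone sketch, not Arrow's actual Future code;
> the depth cap and names are arbitrary):
>
>     #include <functional>
>     #include <utility>
>
>     thread_local int g_callback_depth = 0;
>     constexpr int kMaxInlineDepth = 32;  // arbitrary cap for the sketch
>
>     // Run a promise's callback inline, but past the depth cap hand it
>     // back to whatever submit function the caller provides, so chains
>     // of continuations can't overflow the stack.
>     void RunCallback(std::function<void()> cb,
>                      const std::function<void(std::function<void()>)>& submit) {
>       if (g_callback_depth < kMaxInlineDepth) {
>         ++g_callback_depth;  // run inline: no queue round-trip
>         cb();
>         --g_callback_depth;
>       } else {
>         submit(std::move(cb));  // too deep: re-submit as a fresh task
>       }
>     }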
>
> #5 sounds pretty straightforward but I think you'd want a wide variety
> of test cases to make sure you're improving things overall.  You could
> saturate a thread pool with just a single workload.  The CSV reader, for
> example, will grow to occupy as many threads as are available
> (assuming there are enough columns).  There is a lot to balance here:
> cache locality, I/O vs. CPU workload, and fairness.  It may not be
> obvious what exactly to aim for.
>
>
> On Mon, Sep 28, 2020 at 2:32 AM Antoine Pitrou <anto...@python.org> wrote:
> >
> > On 28/09/2020 at 11:38, Antoine Pitrou wrote:
> > >
> > > Hi Weston,
> > >
> > > On 25/09/2020 at 23:21, Weston Pace wrote:
> > >>
> > >> * The current thread pool implementation deadlocks when used in a
> > >> "nested" case; an asynchronous solution can work around this
> > >
> > > If required, it may be possible to hack around this.  For example, AFAIR
> > > TBB has a simple heuristic to allow reentrant calls into the thread
> > > pool up to a hardcoded recursion level.
> >
> > Closely related: "TaskGroup::Finish should execute tasks"
> > https://issues.apache.org/jira/browse/ARROW-10014
> >
> > Regards
> >
> > Antoine.
>
