Weston Pace created ARROW-14354:
-----------------------------------
Summary: [C++] Investigate reducing I/O thread pool size to avoid
CPU wastage.
Key: ARROW-14354
URL: https://issues.apache.org/jira/browse/ARROW-14354
Project: Apache Arrow
Issue Type: Improvement
Components: C++
Reporter: Weston Pace

If we are reading over HTTP (e.g. S3), we generally want high parallelism in the
I/O thread pool.

If we are reading from disk, then high parallelism is usually harmless but
ineffective. Most of the I/O threads will spend their time in a waiting state,
so the cores can be used for other work.

However, it appears that when we are reading locally and the data is already
cached in memory, some parallelism is beneficial but too much is harmful. Once
the DRAM <-> CPU bandwidth limit is hit, all reading threads will experience
high DRAM latency. Unlike an I/O bottleneck, where a waiting thread gives up
its core, a RAM bottleneck wastes cycles on the physical core, because a thread
stalled on memory still occupies it.
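
One way to start the investigation is a small sweep over thread counts against
a file that is already in the page cache. The sketch below is a stdlib-only
Python stand-in, not Arrow code: `read_cached_file` and `sweep` are hypothetical
helpers, the 1 MiB chunk size and the thread counts are arbitrary assumptions,
and it relies on `os.pread`, which is POSIX only. In Arrow itself the
corresponding knob would be the I/O thread pool capacity
(`arrow::io::SetIOThreadPoolCapacity` in C++, `pyarrow.set_io_thread_count` in
Python).

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def read_cached_file(path, num_threads, chunk_size=1 << 20):
    """Read the whole file with num_threads workers.

    Returns (bytes_read, elapsed_seconds). Hypothetical helper, not Arrow API.
    """
    size = os.path.getsize(path)
    offsets = range(0, size, chunk_size)
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=num_threads) as pool:
            # Each task reads one chunk at its own offset; pread does not
            # move a shared file position, so the tasks are independent.
            total = sum(
                pool.map(lambda off: len(os.pread(fd, chunk_size, off)), offsets)
            )
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return total, elapsed


def sweep(path, thread_counts=(1, 2, 4, 8, 16)):
    """Return {thread_count: throughput in MB/s} for a page-cached file."""
    read_cached_file(path, 1)  # first pass pulls the file into the page cache
    results = {}
    for n in thread_counts:
        total, elapsed = read_cached_file(path, n)
        # Guard against a zero interval on very small files.
        results[n] = total / max(elapsed, 1e-9) / 1e6
    return results
```

If the hypothesis in this issue holds, throughput from `sweep` should rise for
the first few threads and then flatten or drop. Python's interpreter overhead
will blur the numbers, so this is only a rough indicator; a real measurement
should use the C++ reader and CPU counters.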
--
This message was sent by Atlassian Jira
(v8.3.4#803005)