[ https://issues.apache.org/jira/browse/ARROW-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17436178#comment-17436178 ]

Weston Pace commented on ARROW-14354:
-------------------------------------

From the [postgres docs|https://www.postgresql.org/docs/9.2/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-ASYNC-BEHAVIOR]:

{quote}
effective_io_concurrency (integer)

    Sets the number of concurrent disk I/O operations that PostgreSQL expects 
can be executed simultaneously. Raising this value will increase the number of 
I/O operations that any individual PostgreSQL session attempts to initiate in 
parallel. The allowed range is 1 to 1000, or zero to disable issuance of 
asynchronous I/O requests. Currently, this setting only affects bitmap heap 
scans.

    A good starting point for this setting is the number of separate drives 
comprising a RAID 0 stripe or RAID 1 mirror being used for the database. (For 
RAID 5 the parity drive should not be counted.) However, if the database is 
often busy with multiple queries issued in concurrent sessions, lower values 
may be sufficient to keep the disk array busy. A value higher than needed to 
keep the disks busy will only result in extra CPU overhead.

    For more exotic systems, such as memory-based storage or a RAID array that 
is limited by bus bandwidth, the correct value might be the number of I/O paths 
available. Some experimentation may be needed to find the best value.

    Asynchronous I/O depends on an effective posix_fadvise function, which some 
operating systems lack. If the function is not present then setting this 
parameter to anything but zero will result in an error. On some operating 
systems (e.g., Solaris), the function is present but does not actually do 
anything.
{quote}
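As a concrete illustration of the guidance above (the drive counts are hypothetical, not a recommendation), the setting might be tuned like this in postgresql.conf:

```
# postgresql.conf -- hypothetical box whose database lives on a
# 4-drive RAID 0 stripe: one concurrent I/O per data-bearing drive.
effective_io_concurrency = 4

# Hypothetical 5-drive RAID 5 array: the parity drive is not counted,
# so the starting point would again be 4.
# effective_io_concurrency = 4

# Set to 0 to disable issuance of asynchronous I/O requests entirely
# (required on OSes without a working posix_fadvise).
# effective_io_concurrency = 0
```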

SQL Server seems to have a much more complex design.  I did find some 
[interesting 
information|https://techcommunity.microsoft.com/t5/sql-server-support/how-it-works-bob-dorr-s-sql-server-i-o-presentation/ba-p/316031].
 One thing to note is that SQL Server maintains its own page cache (the buffer 
pool), so in-memory reads aren't something it needs to worry about at the I/O 
layer.  For actual disk I/O it appears to do something rather sophisticated:

{quote}
 A statement was published many years ago that sustained disk queue length 
greater than 2 is an indication of an I/O bottleneck.  This statement is still 
true if the application is not designed to handle the situation.   SQL Server 
is designed to push disk queue lengths above 2 when it is appropriate.

SQL Server uses async I/O to help maximize resource usage.   SQL Server 
understands that it can hand off an I/O request to the I/O subsystem and 
continue with other activity.   Let’s look an example of this.

SQL Server checkpoint posts up to 100 I/O requests and monitors the I/O 
response time in order to properly throttle checkpoint impact.  When the I/O 
response time exceeds the target the number of I/Os is throttled.    The disk 
queue length can easily exceed 2 and not be an indication of a subsystem 
problem.  SQL Server is attempting to maximize the I/O channel. 
{quote}

If I read these sources correctly, both systems essentially keep adding 
concurrent requests until response times start to degrade.
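That feedback loop could be sketched roughly as follows. This is not Arrow or SQL Server code; it is a minimal additive-increase / multiplicative-decrease controller over the number of in-flight I/Os, with all names hypothetical. The 100-request cap mirrors the checkpoint limit mentioned in the quote above.

```python
class IoThrottle:
    """Latency-feedback throttle for in-flight I/O requests (illustrative).

    Additive increase while response times stay within target, multiplicative
    decrease once they exceed it -- the pattern both databases describe.
    """

    def __init__(self, target_latency_ms, max_in_flight=100):
        self.target = target_latency_ms     # latency target for throttling
        self.max_in_flight = max_in_flight  # cap, like checkpoint's ~100 I/Os
        self.limit = 1                      # start with one outstanding I/O

    def record(self, latency_ms):
        """Feed back one completed request's latency; return the new limit."""
        if latency_ms <= self.target:
            # Channel not yet saturated: probe for more parallelism.
            self.limit = min(self.limit + 1, self.max_in_flight)
        else:
            # Response time exceeded the target: back off sharply.
            self.limit = max(self.limit // 2, 1)
        return self.limit
```

A caller would issue at most `limit` concurrent requests and call `record()` as each one completes; the limit ratchets up on a healthy channel and halves as soon as latency indicates saturation.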

> [C++] Investigate reducing I/O thread pool size to avoid CPU wastage.
> ---------------------------------------------------------------------
>
>                 Key: ARROW-14354
>                 URL: https://issues.apache.org/jira/browse/ARROW-14354
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Weston Pace
>            Priority: Major
>
> If we are reading over HTTP (e.g. S3) we generally want high parallelism in 
> the I/O thread pool.
> If we are reading from disk then high parallelism is usually harmless but 
> ineffective.  Most of the I/O threads will spend their time in a waiting 
> state and the cores can be used for other work.
> However, it appears that when we are reading locally, and the data is cached 
> in memory, then having too much parallelism will be harmful, but some 
> parallelism is beneficial.  Once the DRAM <-> CPU bandwidth limit is hit then 
> all reading threads will experience high DRAM latency.  Unlike an I/O 
> bottleneck a RAM bottleneck will waste cycles on the physical core.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
