Hi,

On 2025-03-10 19:45:38 -0400, Melanie Plageman wrote:
> From 7b35b1144bddf202fb4d56a9b783751a0945ba0e Mon Sep 17 00:00:00 2001
> From: Melanie Plageman <melanieplage...@gmail.com>
> Date: Mon, 10 Mar 2025 17:17:38 -0400
> Subject: [PATCH v35 1/5] Increase default effective_io_concurrency to 16
>
> The default effective_io_concurrency has been 1 since it was introduced
> in b7b8f0b6096d2ab6e. Referencing the associated discussion [1], it
> seems 1 was chosen as a conservative value that seemed unlikely to cause
> regressions.

16 years...


> Experimentation on high latency cloud storage as well as fast, local
> nvme storage (see Discussion link) shows that even slightly higher values
> improve query timings substantially. 1 actually performs worse than 0.
> With effective_io_concurrency 1, we are not prefetching enough to avoid
> I/O stalls, but we are issuing extra syscalls.

Makes sense.


> Moreover, when bitmap heap scan is converted to using the read stream
> API, a prefetch distance of 1 will prevent read combining which is quite
> detrimental to performance.

Hm? This one surprises me. Doesn't the read stream code take some pains to
still perform IO combining when effective_io_concurrency=1? It does work for
seqscans, for example?


> The new default is 16, which should be more appropriate in the general
> case while still avoiding flooding low IOPs devices with I/O requests.

Maybe s/in the general case/for common hardware/?


> [1] https://www.postgresql.org/message-id/flat/FDDBA24E-FF4D-4654-BA75-692B3BA71B97%40enterprisedb.com
>
> Discussion: https://postgr.es/m/CAAKRu_Z%2BJa-mwXebOoOERMMUMvJeRhzTjad4dSThxG0JLXESxw%40mail.gmail.com
> ---
>  doc/src/sgml/config.sgml                      | 38 +++++++++----------
>  src/backend/utils/misc/postgresql.conf.sample |  2 +-
>  src/include/storage/bufmgr.h                  |  2 +-
>  3 files changed, 19 insertions(+), 23 deletions(-)
>
> diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
> index d2fa5f7d1a9..8c4409fc8bf 100644
> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml
> @@ -2577,36 +2577,32 @@ include_dir 'conf.d'
>           Sets the number of concurrent disk I/O operations that
>           <productname>PostgreSQL</productname> expects can be executed
>           simultaneously. Raising this value will increase the number of I/O
> -         operations that any individual <productname>PostgreSQL</productname> session
> -         attempts to initiate in parallel. The allowed range is 1 to 1000,
> -         or zero to disable issuance of asynchronous I/O requests. Currently,
> -         this setting only affects bitmap heap scans.
> +         operations that any individual <productname>PostgreSQL</productname>
> +         session attempts to initiate in parallel. The allowed range is
> +         <literal>1</literal> to <literal>1000</literal>, or
> +         <literal>0</literal> to disable issuance of asynchronous I/O requests.
> +         The default is <literal>16</literal> on supported systems, otherwise
> +         <literal>0</literal>. Currently, this setting only affects bitmap heap
> +         scans.
>          </para>

I'd probably use this as an occasion to remove "Currently, this setting only
affects bitmap heap" sentence - afaict it's been wrong for a while and got
more wrong since vacuum started to use read streams...


>          <para>
> -         For magnetic drives, a good starting point for this setting is the
> -         number of separate
> -         drives comprising a RAID 0 stripe or RAID 1 mirror being used for the
> -         database. (For RAID 5 the parity drive should not be counted.)
> -         However, if the database is often busy with multiple queries issued in
> -         concurrent sessions, lower values may be sufficient to keep the disk
> -         array busy. A value higher than needed to keep the disks busy will
> -         only result in extra CPU overhead.
> -         SSDs and other memory-based storage can often process many
> -         concurrent requests, so the best value might be in the hundreds.

Afaict this whole paragraph was *never* correct... Obviously that's not
criticism of your removing it ;)


> +         Higher values will have the most impact on higher latency storage
> +         where queries otherwise experience noticeable I/O stalls and on
> +         devices with high IOPs. Higher values than needed to satisfy the query
> +         or keep the device busy can be expected to only introduce extra CPU
> +         overhead.
>          </para>

I'd say unnecessarily high values also can increase IO latency.

Greetings,

Andres Freund