On 06/19/2018 04:50 PM, Konstantin Knizhnik wrote:
On 19.06.2018 16:57, Ants Aasma wrote:
On Tue, Jun 19, 2018 at 4:04 PM Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:
Right. My point is that while spawning bgworkers probably helps, I don't
expect it to be enough to fill the I/O queues on modern storage systems.
Even if you start say 16 prefetch bgworkers, that's not going to be
enough for large arrays or SSDs. Those typically need way more than 16
requests in the queue.
Consider for example [1] from 2014 where Merlin reported how S3500
(Intel SATA SSD) behaves with different effective_io_concurrency
values:
[1] https://www.postgresql.org/message-id/CAHyXU0yiVvfQAnR9cyH=HWh1WbLRsioe=mzRJTHwtr=2azs...@mail.gmail.com
Clearly, you need to prefetch 32/64 blocks or so. Consider you may have
multiple such devices in a single RAID array, and that this device is
from 2014 (and newer flash devices likely need even deeper queues).
For reference, a typical datacenter SSD needs a queue depth of 128 to
saturate a single device. [1] Multiply that appropriately for RAID
arrays.
How it is related with results for S3500 where this is almost now
performance improvement for effective_io_concurrency >8?
Starting 128 or more workers for performing prefetch is definitely not
acceptable...
I'm not sure what you mean by "almost now performance improvement", but
I guess you meant "almost no performance improvement" instead?
If that's the case, it's not quite true - increasing the queue depth
above 8 further improved the throughput by about 10-20% (both by
duration and peak throughput measured by iotop).
But more importantly, this is just a single device - you typically have
multiple of them in larger arrays, to get better capacity, performance
and/or reliability. So if you have 16 such drives, and you want to send
at least 8 requests to each, suddenly you need at least 128 requests.
And as pointed out before, the S3500 is a roughly 5-year-old device (it was
introduced in Q2/2013). On newer devices the difference is usually way
more significant / the required queue depth is much higher.
Obviously, this is a somewhat simplified view, ignoring various details
(e.g. that there may be multiple concurrent queries, each sending I/O
requests - what matters is the combined number of requests, of course).
But I don't think this makes a huge difference.
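To make the arithmetic above concrete, here is a minimal sketch in Python. It assumes requests are spread evenly across the drives and that each prefetch bgworker keeps only one request in flight at a time. Both are simplifications of mine, and the function names are purely illustrative, not anything from PostgreSQL.

```python
def required_requests(drives: int, per_drive_depth: int) -> int:
    """Combined number of in-flight requests needed to keep every drive
    at the given queue depth (assumes requests are spread evenly)."""
    return drives * per_drive_depth


def shortfall(required: int, workers: int, requests_per_worker: int = 1) -> int:
    """How many in-flight requests are still missing, assuming each
    worker keeps `requests_per_worker` requests outstanding
    (hypothetical default of 1, i.e. synchronous reads)."""
    return max(0, required - workers * requests_per_worker)


if __name__ == "__main__":
    # 16 drives, at least 8 requests queued on each
    needed = required_requests(drives=16, per_drive_depth=8)
    print(needed)                          # 128
    # 16 prefetch bgworkers, one request in flight each
    print(shortfall(needed, workers=16))   # 112
```

With these assumptions, 16 workers cover only 16 of the 128 requests, which is why I don't expect a modest number of bgworkers to fill the queues on such an array.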
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services