On 06/19/2018 04:50 PM, Konstantin Knizhnik wrote:


On 19.06.2018 16:57, Ants Aasma wrote:
On Tue, Jun 19, 2018 at 4:04 PM Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:

    Right. My point is that while spawning bgworkers probably helps, I don't
    expect it to be enough to fill the I/O queues on modern storage systems.
    Even if you start say 16 prefetch bgworkers, that's not going to be
    enough for large arrays or SSDs. Those typically need way more than 16
    requests in the queue.

    Consider for example [1] from 2014 where Merlin reported how S3500
    (Intel SATA SSD) behaves with different effective_io_concurrency
    values:

    [1] https://www.postgresql.org/message-id/CAHyXU0yiVvfQAnR9cyH=HWh1WbLRsioe=mzRJTHwtr=2azs...@mail.gmail.com

    Clearly, you need to prefetch 32/64 blocks or so. Consider you may have
    multiple such devices in a single RAID array, and that this device is
    from 2014 (and newer flash devices likely need even deeper queues).


For reference, a typical datacenter SSD needs a queue depth of 128 to saturate a single device. [1] Multiply that appropriately for RAID arrays.

How it is related with results for S3500  where this is almost now performance improvement for effective_io_concurrency >8? Starting 128 or more workers for performing prefetch is definitely not acceptable...


I'm not sure what you mean by "almost now performance improvement", but I guess you meant "almost no performance improvement" instead?

If that's the case, it's not quite true - increasing the queue depth above 8 further improved the throughput by about 10-20% (both by duration and peak throughput measured by iotop).

But more importantly, this is just a single device - you typically have multiple of them in larger arrays, to get better capacity, performance and/or reliability. So if you have 16 such drives, and you want to send at least 8 requests to each, suddenly you need at least 128 requests.
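
To make the arithmetic above concrete, here is a minimal sketch in Python. The function names are hypothetical, and two simplifications are assumed for illustration only: requests are spread evenly across the devices, and each prefetch bgworker keeps exactly one request in flight at a time. Neither assumption is measured or confirmed in this thread.

```python
def required_requests(devices: int, per_device_depth: int) -> int:
    """In-flight requests needed to keep every device at the given queue depth.

    Assumes (for illustration) that requests are spread evenly across devices.
    """
    if devices < 1 or per_device_depth < 1:
        raise ValueError("devices and per_device_depth must be positive")
    return devices * per_device_depth


def shortfall(needed: int, workers: int, requests_per_worker: int = 1) -> int:
    """How many in-flight requests are still missing with the given workers.

    Assumes (for illustration) one outstanding request per prefetch worker
    unless requests_per_worker says otherwise.
    """
    return max(0, needed - workers * requests_per_worker)


# 16 drives, at least 8 requests queued on each, as in the example above.
needed = required_requests(devices=16, per_device_depth=8)
print(needed)                          # 128

# 16 prefetch bgworkers, one request each, leave most of the queue empty.
print(shortfall(needed, workers=16))   # 112
```

Under these assumptions, 16 workers cover only an eighth of the 128 requests, which is why the number of workers alone is a poor lever for filling the queues.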

And as pointed out before, the S3500 is a roughly 5-year-old device (it was introduced in Q2/2013). On newer devices the difference is usually way more significant, and the required queue depth is much higher.

Obviously, this is a somewhat simplified view, ignoring various details (e.g. that there may be multiple concurrent queries, each sending I/O requests - what matters is the combined number of requests, of course). But I don't think this makes a huge difference.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
