@Justin @Merlin @Jeff,

Thanks so much for your time and insights. They improved our understanding of the underpinnings of PostgreSQL and allowed us to deal with the issues we were facing.

Using parallel query on our PG 9.6 improved query performance a lot. It turns out that many of our real-world queries benefit from parallel query: we saw about 4x improvements after turning it on, and we now see much higher storage IOPS thanks to the multiple workers.

In our tests effective_io_concurrency did not show as large an effect as in the link you sent. I'll have another look at it; maybe we are doing something wrong, or the fact that the SSDs are on the SAN and not local affects the results.

In the process we also learned that changing the default Linux I/O scheduler from CFQ to Deadline worked wonders for our Dell SC2020 SAN storage setup. We used to see latency peaks of 6,000 milliseconds during busy periods (yes, 6 seconds); we now see 80 milliseconds, a roughly 75-fold improvement.
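For anyone who wants to check which I/O scheduler a block device is currently using before trying the same change, here is a minimal sketch. It relies on the fact that Linux lists the available schedulers in /sys/block/<device>/queue/scheduler and marks the active one in square brackets (for example "noop deadline [cfq]"). The function names and the example device name "sda" are only illustrative assumptions, not something taken from our setup.

```python
import re
from pathlib import Path


def parse_active_scheduler(text: str) -> str:
    """Return the scheduler marked active, i.e. the name inside [brackets]."""
    match = re.search(r"\[([^\]]+)\]", text)
    if match is None:
        # Some devices list no bracketed entry; treat that as "unknown" here.
        raise ValueError(f"no active scheduler marked in: {text!r}")
    return match.group(1)


def active_scheduler(device: str, sys_root: str = "/sys") -> str:
    """Read the active scheduler for one block device.

    `device` is a kernel block device name such as "sda" (hypothetical
    example); `sys_root` is a parameter only so the function can be
    pointed at a fake tree when testing.
    """
    path = Path(sys_root) / "block" / device / "queue" / "scheduler"
    return parse_active_scheduler(path.read_text())
```

Calling active_scheduler("sda") on a host that still uses the old default would return "cfq". The scheduler itself is switched by writing the new name to the same sysfs file as root, and that setting does not survive a reboot unless it is made persistent by other means.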
Best regards,
Haroldo Kerry

On Wed, Jan 9, 2019 at 5:14 PM Merlin Moncure <mmonc...@gmail.com> wrote:

> On Thu, Dec 27, 2018 at 7:29 PM Justin Pryzby <pry...@telsasoft.com> wrote:
> >
> > On Thu, Dec 27, 2018 at 08:20:23PM -0500, Jeff Janes wrote:
> > > Also, you would want to use the newest version of PostgreSQL, as 9.6
> > > doesn't have parallel query, which is much more generally applicable than
> > > effective_io_concurrency is.
>
> effective_io_concurrency only applies to certain queries.  When it
> does apply it can work wonders.  See:
> https://www.postgresql.org/message-id/CAHyXU0yiVvfQAnR9cyH=HWh1WbLRsioe=mzRJTHwtr=2azs...@mail.gmail.com
> for an example of how it can benefit.
>
> parallel query is not going to help single threaded pg_bench results.
> you are going to be entirely latency bound (network from pgbench to
> postgres, then postgres to storage).  On my dell crapbox I was getting
> 2200tps so you have some point of slowness relative to me, probably
> not the disk itself.
>
> Getting faster performance is an age-old problem; you need to
> aggregate specific requests into more general ones, move the
> controlling logic into the database itself, or use various other
> strategies.  Lowering latency is a hardware problem and can force
> trade-offs (like, don't use a SAN) and has specific boundaries that
> are not easy to bust through.
>
> merlin

--
Haroldo Kerry
CTO/COO
Rua do Rócio, 220, 7° andar, conjunto 72
São Paulo – SP / CEP 04552-000
hke...@callix.com.br
www.callix.com.br