Re: PostgreSQL Read IOPS limit per connection
Hello, I am happy to hear that you have received all the help you needed. Please feel free to contact us for professional assistance any time you may need it in the future. You are most welcome!

Regards,
Mark Avinash Hogg
Director of Business Development, 2ndQuadrant
+1 (647) 770 9821 (cell)
www.2ndquadrant.com
mark.h...@2ndquadrant.com

On Wed, 9 Jan 2019 at 19:20, Merlin Moncure (via Accelo) wrote:
> [quoted thread trimmed; the full exchange appears in the messages below]
Re: PostgreSQL Read IOPS limit per connection
On Wed, Jan 9, 2019 at 3:52 PM Haroldo Kerry wrote:
> @Justin @Merlin @Jeff,
> Thanks so much for your time and insights [...]
> [quoted text trimmed; Haroldo's full message appears below]

The link I sent used a contrived query to force the type of scan that benefits from that setting; it's a very situational benefit. It would be interesting if you couldn't reproduce it using the same mechanic.

merlin
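[Editor's note: the contrived test Merlin refers to typically forces a bitmap heap scan, the one plan type that effective_io_concurrency affected in 9.6. A sketch of such a test follows; it assumes a running PostgreSQL server, and the table and column names are made up for illustration.]

```shell
# Hypothetical reproduction sketch: build a table whose matching rows are
# scattered across the heap, then force a bitmap heap scan and compare
# prefetch settings. Requires a running PostgreSQL server.
psql <<'SQL'
CREATE TABLE scatter AS
  SELECT (random() * 1000000)::int AS k, repeat('x', 500) AS pad
  FROM generate_series(1, 1000000);
CREATE INDEX ON scatter (k);
ANALYZE scatter;

SET enable_seqscan = off;          -- push the planner toward a bitmap scan
SET effective_io_concurrency = 1;
EXPLAIN (ANALYZE, BUFFERS) SELECT sum(k) FROM scatter WHERE k BETWEEN 0 AND 100000;

SET effective_io_concurrency = 30; -- deeper prefetch queue
EXPLAIN (ANALYZE, BUFFERS) SELECT sum(k) FROM scatter WHERE k BETWEEN 0 AND 100000;
SQL
```

On a cold cache with genuinely random IO, the second run can be dramatically faster; on a SAN fronted by a large battery-backed cache, as in Haroldo's setup, the difference may be muted.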
Re: PostgreSQL Read IOPS limit per connection
@Justin @Merlin @Jeff,

Thanks so much for your time and insights; they improved our understanding of the underpinnings of PostgreSQL and allowed us to deal with the issues we were facing.

Using parallel query on our PG 9.6 improved query performance a lot. It turns out that many of our real-world queries could benefit from parallel query: we saw roughly 4x improvements after turning it on, and we now see much higher storage IOPS thanks to the multiple workers.

In our tests effective_io_concurrency did not show as large an effect as the link you sent; I'll have another look at it. Maybe we are doing something wrong, or the fact that the SSDs are on the SAN rather than local affects the results.

In the process we also learned that changing the default Linux I/O scheduler from CFQ to Deadline worked wonders for our Dell SC2020 SAN storage setup: we used to see latency peaks of 6,000 milliseconds during busy periods (yes, 6 seconds), and we now see 80 milliseconds, an almost 100-fold improvement.

Best regards,
Haroldo Kerry

On Wed, Jan 9, 2019 at 5:14 PM Merlin Moncure wrote:
> On Thu, Dec 27, 2018 at 7:29 PM Justin Pryzby wrote:
> > On Thu, Dec 27, 2018 at 08:20:23PM -0500, Jeff Janes wrote:
> > > Also, you would want to use the newest version of PostgreSQL, as 9.6
> > > doesn't have parallel query, which is much more generally applicable
> > > than effective_io_concurrency is.
>
> effective_io_concurrency only applies to certain queries. When it does
> apply it can work wonders. See:
> https://www.postgresql.org/message-id/CAHyXU0yiVvfQAnR9cyH=HWh1WbLRsioe=mzRJTHwtr=2azs...@mail.gmail.com
> for an example of how it can benefit.
>
> Parallel query is not going to help single-threaded pgbench results;
> you are going to be entirely latency bound (network from pgbench to
> postgres, then postgres to storage). On my Dell crapbox I was getting
> 2200 tps, so you have some point of slowness relative to me, probably
> not the disk itself.
>
> Getting faster performance is an age-old problem; you need to
> aggregate specific requests into more general ones, move the
> controlling logic into the database itself, or use various other
> strategies. Lowering latency is a hardware problem that can force
> trade-offs (like: don't use a SAN) and has specific boundaries that
> are not easy to bust through.
>
> merlin

--
Haroldo Kerry
CTO/COO
Rua do Rócio, 220, 7° andar, conjunto 72
São Paulo – SP / CEP 04552-000
hke...@callix.com.br
www.callix.com.br
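[Editor's note: the CFQ-to-Deadline change Haroldo describes is made per block device via sysfs. A minimal sketch; the device name "sdb" is an assumption, and on multipath/iSCSI setups the change must be applied to each underlying device.]

```shell
# Show the available schedulers for a device; the active one is bracketed,
# e.g. "noop deadline [cfq]". "sdb" is illustrative; use your SAN LUN's device.
cat /sys/block/sdb/queue/scheduler

# Switch to deadline at runtime (root required; not persistent across reboots)
echo deadline > /sys/block/sdb/queue/scheduler

# To persist the change, add "elevator=deadline" to the kernel command line,
# or install a udev rule that sets the scheduler when the device appears.
```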
Re: PostgreSQL Read IOPS limit per connection
> *Performance issue:*
>
> I'm trying to figure out if PostgreSQL (PG) has some inherent limit on
> IOPS per connection.
>
> Running pgbench with multiple clients (-c 30) we are able to see 20K+
> IOPS, which is what we expect. But if we use just one client, we get
> 1,200 IOPS, with an average disk queue size around 1.

The default transaction done by pgbench simply has no opportunity to dispatch multiple IO requests per connection. It is just a series of single-row lookups and single-row updates or inserts. You will have to use a different benchmark if you want to exercise this area; probably something analytics-heavy.

Also, you would want to use the newest version of PostgreSQL, as 9.6 doesn't have parallel query, which is much more generally applicable than effective_io_concurrency is.

> One of the issues I'm trying to solve is related to extracting data from
> a large table, which uses a full table scan. We see the same 1,200 IOPS
> limit of pgbench when we SELECT on this table using just one connection.
> If there is a limitation per connection, I might set up the application
> to have several connections, then issue SELECTs for different sections
> of the table and join the data later, but it looks cumbersome,
> especially if the DB can extract the data using more IOPS.

The kernel should detect a sequential read in progress and invoke readahead. That should be able to keep the CPU quite busy with data on any decent IO system. Are you sure IO is even the bottleneck for your query? Perhaps your kernel readahead settings need to be tuned. Also, you may benefit from the parallel query features implemented in newer versions of PostgreSQL. In any event, the default transactions of pgbench are not going to be useful for benchmarking what you care about.

Cheers,
Jeff
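[Editor's note: Jeff's readahead suggestion can be checked and adjusted with blockdev. Values are in 512-byte sectors, and the device name "sdb" is again an assumption.]

```shell
# Inspect the current readahead window (in 512-byte sectors; 256 = 128 KiB)
blockdev --getra /dev/sdb      # "sdb" is illustrative; use your data volume

# Try a larger window for sequential-scan-heavy workloads
# (4096 sectors = 2 MiB; benchmark before making it permanent)
blockdev --setra 4096 /dev/sdb
```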
Re: PostgreSQL Read IOPS limit per connection
Justin,

Thanks for the quick response, I'll check it out.

Happy holidays,
Haroldo Kerry

On Thu, Dec 27, 2018 at 2:55 PM Justin Pryzby wrote:
> [quoted reply trimmed; Justin's full message appears below]

--
Haroldo Kerry
CTO/COO
Rua do Rócio, 220, 7° andar, conjunto 72
São Paulo – SP / CEP 04552-000
hke...@callix.com.br
www.callix.com.br
Re: PostgreSQL Read IOPS limit per connection
On Thu, Dec 27, 2018 at 02:44:55PM -0200, Haroldo Kerry wrote:
> PostgreSQL 9.6.10 on x86_64-pc-linux-gnu (Debian 9.6.10-1.pgdg80+1),
> connected to SAN: Dell Compellent SC2020, with 7 x Samsung PM1633 SSDs
> (https://www.samsung.com/us/labs/pdfs/collateral/pm1633-prodoverview-2015.pdf),
> RAID10+RAID5 configuration, 8GB cache, read-write battery-backed cache
> enabled, connected via dedicated iSCSI switches and dedicated Ethernet
> ports in link aggregation mode (2x1Gbps max bandwidth).
>
> I'm trying to figure out if PostgreSQL (PG) has some inherent limit on
> IOPS per connection.

postgres uses one server backend per client.

> We tried to increase effective_io_concurrency from 1 to 30, to no
> effect on multiple tests.

https://www.postgresql.org/docs/current/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-ASYNC-BEHAVIOR
=> "Currently, this setting only affects bitmap heap scans."

> Is this some inherent limitation of PG or am I misunderstanding
> something?

It is a historic limitation, but nowadays there's parallel query, which uses secondary "backend worker" processes. It's supported in v9.6 but much more versatile in v10 and v11.

Justin
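[Editor's note: parallel query is disabled by default in 9.6 (max_parallel_workers_per_gather defaults to 0), so trying it means raising that setting. A sketch, assuming a running server and a pgbench-initialized database; the worker count of 4 is an arbitrary illustration.]

```shell
# Requires a running PostgreSQL 9.6+ server with pgbench tables loaded.
psql <<'SQL'
-- Allow up to 4 parallel workers per Gather node (9.6 default is 0 = off)
SET max_parallel_workers_per_gather = 4;

-- Verify the planner now considers a parallel plan for a large scan
EXPLAIN SELECT count(*) FROM pgbench_accounts;
SQL
```

If the plan shows a Gather node with Parallel Seq Scan workers underneath, parallel query is in effect; the setting can also be made permanent in postgresql.conf.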