Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
From: Fujii Masao masao.fu...@gmail.com
> Though I cannot share the details for various reasons, when I measured the performance overhead of sync rep using pgbench, the throughput overhead was less than 10%. When measuring sync rep, I used two sets of physical machines and storage for the master and standby, and a 1Gbps network between them.

Fujii-san, thanks a million. That's valuable information. An overhead of less than 10%, presumably under a high-concurrency, write-heavy workload, is better than I expected. Great! Though I couldn't contact the testers today, I'll pass this on to them next week.

Regards
MauMau

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
From: Claudio Freire klaussfre...@gmail.com
> On Wed, May 9, 2012 at 7:34 PM, MauMau maumau...@gmail.com wrote:
>> Yes, I understand it is natural for the response time of each transaction to double or more. But I think the throughput drop would be amortized among multiple simultaneous transactions. So, a 50% throughput decrease seems unreasonable.
>
> I'm pretty sure it depends a lot on the workload. Knowing the methodology that produced those figures is critical. Was the throughput decrease measured against no replication, or asynchronous replication? How many clients were used? What was the workload like? Was it CPU bound? I/O bound? Read-mostly?
>
> We have asynchronous replication in production and throughput has not changed relative to no replication. I cannot see how making it synchronous would change throughput, as it only induces waiting time on the clients, but no extra work. I can only assume the test didn't use enough clients to saturate the hardware under high-latency situations, or clients were somehow experiencing application-specific contention.

Thank you for your experience and opinion. The workload is a TPC-C-like, write-heavy one: DBT-2. They compared the throughput of the synchronous replication case against that of the no-replication case.

Today, they told me that they ran the test on two virtual machines on a single physical machine. They also used pgpool-II in both cases. In addition, they may have run the applications and pgpool-II on the same virtual machine as the database server.

It sounded to me as if resources were so scarce that concurrency was low, or your assumption may be correct. I'll hear more about their environment from them.

BTW it's a pity that I cannot find any case study of the performance of the flagship feature of PostgreSQL 9.0/9.1, streaming replication...

Regards
MauMau
Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
On 10 May 2012, 13:34, MauMau wrote:
> The workload is a TPC-C-like, write-heavy one: DBT-2. They compared the throughput of the synchronous replication case against that of the no-replication case. Today, they told me that they ran the test on two virtual machines on a single physical machine. They also used pgpool-II in both cases. In addition, they may have run the applications and pgpool-II on the same virtual machine as the database server.

So they've run a test that is usually I/O bound on a single machine? If they've used the same I/O devices, I'm surprised the degradation was just 50%. If you have a system that can handle X IOPS, and you run two instances there, each will get ~X/2 IOPS. No magic can help here.

Even if they used separate I/O devices, there are probably many things that are shared and can become a bottleneck in a virtualized environment. The setup is definitely very suspicious.

> It sounded to me as if resources were so scarce that concurrency was low, or your assumption may be correct. I'll hear more about their environment from them. BTW it's a pity that I cannot find any case study of the performance of the flagship feature of PostgreSQL 9.0/9.1, streaming replication...

There were some nice talks about the performance impact of sync rep, for example this one:
http://www.2ndquadrant.com/static/2quad/media/pdfs/talks/SyncRepDurability.pdf

There's also a video:
http://www.youtube.com/watch?v=XL7j8hTd6R8

Tomas
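Editorial sketch of the X/2 argument above. It assumes an I/O-bound workload whose throughput is proportional to the IOPS it gets, and a device whose capacity is split evenly between co-located instances. The function names and the 1000 IOPS figure are illustrative and do not come from the thread.

```python
def per_instance_iops(total_iops: float, instances: int) -> float:
    """Even split of a shared device's IOPS budget (simplifying assumption)."""
    if instances < 1:
        raise ValueError("instances must be >= 1")
    return total_iops / instances


def throughput_drop(instances_before: int, instances_after: int) -> float:
    """Fractional throughput loss seen by one instance, assuming tps ~ IOPS."""
    return 1.0 - instances_before / instances_after


if __name__ == "__main__":
    total = 1000.0  # hypothetical device capacity in IOPS
    print("primary alone:        ", per_instance_iops(total, 1))
    print("primary + standby:    ", per_instance_iops(total, 2))
    print("expected loss fraction:", throughput_drop(1, 2))
```

Under these assumptions a 50% drop is what sharing the device alone would produce, before any replication wait is counted.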
Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
On Wed, May 9, 2012 at 5:34 PM, MauMau maumau...@gmail.com wrote:
> Yes, I understand it is natural for the response time of each transaction to double or more. But I think the throughput drop would be amortized among multiple simultaneous transactions. So, a 50% throughput decrease seems unreasonable.
>
> If this thinking is correct, and someone could kindly share his/her past performance evaluation results (ideally of DBT-2), I want to say to my acquaintance, "hey, community people experience better performance, so you may need to review your configuration."

It seems theoretically possible to interleave the processing on both sides, but a 50% reduction in throughput for latency-bound transactions seems to be widely cited as what you can reasonably expect from sync rep with 9.1. The 9.2 beta is arriving shortly, and when it does I suggest experimenting with the new remote_write feature of sync rep on non-production workloads.

merlin
Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
From: Tomas Vondra t...@fuzzy.cz
> There were some nice talks about the performance impact of sync rep, for example this one:
> http://www.2ndquadrant.com/static/2quad/media/pdfs/talks/SyncRepDurability.pdf
>
> There's also a video:
> http://www.youtube.com/watch?v=XL7j8hTd6R8

Thanks. The video is especially interesting. I'll tell my acquaintance to check it, too.

Regards
MauMau
Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
On Thu, May 10, 2012 at 8:34 PM, MauMau maumau...@gmail.com wrote:
> Today, they told me that they ran the test on two virtual machines on a single physical machine. They also used pgpool-II in both cases. In addition, they may have run the applications and pgpool-II on the same virtual machine as the database server.

So they compared the throughput of one server running on a single machine (the non-replication case) with that of two servers (i.e., master and standby) running on the same single machine (the sync rep case)? The amount of CPU/memory/IO resources available per server is not the same in those two cases. So ISTM the comparison is very unfair to the sync rep case. In this situation, I would not be surprised to see a 50% performance degradation in the sync rep case.

> It sounded to me as if resources were so scarce that concurrency was low, or your assumption may be correct. I'll hear more about their environment from them. BTW it's a pity that I cannot find any case study of the performance of the flagship feature of PostgreSQL 9.0/9.1, streaming replication...

Though I cannot share the details for various reasons, when I measured the performance overhead of sync rep using pgbench, the throughput overhead was less than 10%. When measuring sync rep, I used two sets of physical machines and storage for the master and standby, and a 1Gbps network between them.

Regards,

--
Fujii Masao
[PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
Hello,

I've heard from some people that synchronous streaming replication has a severe performance impact on the primary. They said that the transaction throughput of a TPC-C-like benchmark (perhaps DBT-2) decreased by 50%. I'm sorry I haven't asked them about their testing environment, because they just told me about their experience. They think that this result is much worse than some commercial databases.

I'm surprised. I know that PostgreSQL generates more transaction log than other databases, because it logs the entire row for each update operation instead of just the changed columns, and because of full page writes. But I can't (and don't want to) believe that those have such a big negative impact.

Does anyone have any experience of benchmarking synchronous streaming replication under TPC-C or a similar write-heavy workload? Could anybody share performance evaluation results, if you don't mind?

Regards
MauMau
Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
On Wed, May 9, 2012 at 8:06 AM, MauMau maumau...@gmail.com wrote:
> Hello, I've heard from some people that synchronous streaming replication has a severe performance impact on the primary. They said that the transaction throughput of a TPC-C-like benchmark (perhaps DBT-2) decreased by 50%. I'm sorry I haven't asked them about their testing environment, because they just told me about their experience. They think that this result is much worse than some commercial databases.

I can't speak for other databases, but it's only natural to assume that tps must drop. At minimum, you have to add the latency of communication and the remote sync operation to your transaction time. For very short transactions this adds up to a lot of extra work relative to the transaction itself.

merlin
Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
On Wed, May 9, 2012 at 3:58 PM, Merlin Moncure mmonc...@gmail.com wrote:
> On Wed, May 9, 2012 at 8:06 AM, MauMau maumau...@gmail.com wrote:
>> I've heard from some people that synchronous streaming replication has a severe performance impact on the primary. They said that the transaction throughput of a TPC-C-like benchmark (perhaps DBT-2) decreased by 50%. I'm sorry I haven't asked them about their testing environment, because they just told me about their experience. They think that this result is much worse than some commercial databases.
>
> I can't speak for other databases, but it's only natural to assume that tps must drop. At minimum, you have to add the latency of communication and the remote sync operation to your transaction time. For very short transactions this adds up to a lot of extra work relative to the transaction itself.

Actually I would expect 50% degradation if both databases run on identical hardware: the second instance needs to do the same work (i.e. write WAL AND ensure it has reached the disk) before it can acknowledge.

"When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received that the commit has been written to the transaction log on disk of both the primary and standby server."
http://www.postgresql.org/docs/9.1/static/warm-standby.html#SYNCHRONOUS-REPLICATION

I am not sure whether the replica can be triggered to commit to disk before the commit to disk on the master has succeeded; if it cannot, there would be true serialization, hence 50%. This sounds like it could actually be the case (note the "after it commits"):

"When synchronous replication is requested the transaction will wait after it commits until it receives confirmation that the transfer has been successful."
http://wiki.postgresql.org/wiki/Synchronous_replication

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
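Editorial sketch of the serialization argument above. It compares two hypothetical models of commit wait for a single client whose transactions are dominated by the WAL flush: one where the standby flush starts only after the primary flush has finished, and one where both proceed in parallel. Which model matches PostgreSQL is exactly what this message leaves open; all names and timings are assumptions made for illustration.

```python
def serialized_commit_ms(local_flush_ms: float, network_rtt_ms: float,
                         standby_flush_ms: float) -> float:
    """Standby work starts only after the local flush: the waits add up."""
    return local_flush_ms + network_rtt_ms + standby_flush_ms


def overlapped_commit_ms(local_flush_ms: float, network_rtt_ms: float,
                         standby_flush_ms: float) -> float:
    """Local and remote flushes run in parallel: the slower path dominates."""
    return max(local_flush_ms, network_rtt_ms + standby_flush_ms)


def single_client_tps(commit_ms: float) -> float:
    """Transactions per second for one client that does nothing but commit."""
    return 1000.0 / commit_ms


if __name__ == "__main__":
    # Hypothetical figures: identical disks, idealised zero-latency network.
    local, rtt, standby = 5.0, 0.0, 5.0
    print("no replication:", single_client_tps(local))
    print("serialized:    ", single_client_tps(serialized_commit_ms(local, rtt, standby)))
    print("overlapped:    ", single_client_tps(overlapped_commit_ms(local, rtt, standby)))
```

With identical hardware and a negligible network, the serialized model halves single-client throughput, which is the 50% figure; the overlapped model leaves it unchanged.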
Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
On Wed, May 9, 2012 at 12:41 PM, Robert Klemme shortcut...@googlemail.com wrote:
> I am not sure whether the replica can be triggered to commit to disk before the commit to disk on the master has succeeded; if it cannot, there would be true serialization, hence 50%. This sounds like it could actually be the case (note the "after it commits"):
>
> "When synchronous replication is requested the transaction will wait after it commits until it receives confirmation that the transfer has been successful."
> http://wiki.postgresql.org/wiki/Synchronous_replication

That should only happen for very short transactions. IIRC, WAL records can be sent to the slaves before the transaction on the master commits, so bigger transactions would see higher parallelism.
Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
On Wed, May 9, 2012 at 5:45 PM, Claudio Freire klaussfre...@gmail.com wrote:
> On Wed, May 9, 2012 at 12:41 PM, Robert Klemme shortcut...@googlemail.com wrote:
>> I am not sure whether the replica can be triggered to commit to disk before the commit to disk on the master has succeeded; if it cannot, there would be true serialization, hence 50%. This sounds like it could actually be the case (note the "after it commits"):
>>
>> "When synchronous replication is requested the transaction will wait after it commits until it receives confirmation that the transfer has been successful."
>> http://wiki.postgresql.org/wiki/Synchronous_replication
>
> That should only happen for very short transactions. IIRC, WAL records can be sent to the slaves before the transaction on the master commits, so bigger transactions would see higher parallelism.

I considered that as well. But the question is: when are they written to disk on the slave? If they stay in the buffer cache until the data is synced to disk, then you gain only a small advantage from sending earlier (i.e. the network latency). Assuming a high-bandwidth, low-latency network (which you want to have in this case anyway), that gain is probably not big compared to the time it takes to ensure WAL is written to disk.

I do not know the implementation details, but *if* the server triggers the remote sync only after its own sync has succeeded, *then* you basically have serialization and you need to wait twice as long.

For small transactions, OTOH, network overhead might be so large relative to WAL I/O (for example, with a battery-backed cache in the controller) that it shows.

Since we do not know the test cases that led to the 50% statement, we can probably only speculate. Ultimately each individual setup and workload has to be tested.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
On Wed, May 9, 2012 at 12:03 PM, Robert Klemme shortcut...@googlemail.com wrote:
> On Wed, May 9, 2012 at 5:45 PM, Claudio Freire klaussfre...@gmail.com wrote:
>> On Wed, May 9, 2012 at 12:41 PM, Robert Klemme shortcut...@googlemail.com wrote:
>>> I am not sure whether the replica can be triggered to commit to disk before the commit to disk on the master has succeeded; if it cannot, there would be true serialization, hence 50%. This sounds like it could actually be the case (note the "after it commits"):
>>>
>>> "When synchronous replication is requested the transaction will wait after it commits until it receives confirmation that the transfer has been successful."
>>> http://wiki.postgresql.org/wiki/Synchronous_replication
>>
>> That should only happen for very short transactions. IIRC, WAL records can be sent to the slaves before the transaction on the master commits, so bigger transactions would see higher parallelism.
>
> I considered that as well. But the question is: when are they written to disk on the slave? If they stay in the buffer cache until the data is synced to disk, then you gain only a small advantage from sending earlier (i.e. the network latency). Assuming a high-bandwidth, low-latency network (which you want to have in this case anyway), that gain is probably not big compared to the time it takes to ensure WAL is written to disk.
>
> I do not know the implementation details, but *if* the server triggers the remote sync only after its own sync has succeeded, *then* you basically have serialization and you need to wait twice as long.
>
> For small transactions, OTOH, network overhead might be so large relative to WAL I/O (for example, with a battery-backed cache in the controller) that it shows.
>
> Since we do not know the test cases that led to the 50% statement, we can probably only speculate. Ultimately each individual setup and workload has to be tested.

yeah. note that the upcoming 9.2 synchronous_commit=remote_write setting is intended to improve this situation by releasing the transaction a bit earlier: the slave basically only has to acknowledge receipt of the data.

merlin
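Editorial sketch of what the remote_write setting changes. The level names (off, local, remote_write, on) are the synchronous_commit values available in 9.2; everything else is an assumption made for illustration: the timings are invented, and the model simply adds the steps up in sequence, which is the serialized case discussed earlier in the thread and not a statement about the actual implementation.

```python
def commit_wait_ms(level: str, local_flush_ms: float, rtt_ms: float,
                   standby_write_ms: float, standby_flush_ms: float) -> float:
    """Time a committing client waits under each level (simplified serial model)."""
    if level == "off":
        # Commit returns without waiting for the local WAL flush.
        return 0.0
    if level == "local":
        # Waits for the local WAL flush only, not for the standby.
        return local_flush_ms
    if level == "remote_write":
        # Also waits until the standby has received and written the WAL,
        # but not until the standby has flushed it to disk.
        return local_flush_ms + rtt_ms + standby_write_ms
    if level == "on":
        # Also waits for the standby's flush to disk
        # (standby_flush_ms is taken to include the write).
        return local_flush_ms + rtt_ms + standby_flush_ms
    raise ValueError(f"unknown synchronous_commit level: {level}")


if __name__ == "__main__":
    # Hypothetical timings in milliseconds.
    timings = dict(local_flush_ms=5.0, rtt_ms=0.5,
                   standby_write_ms=0.25, standby_flush_ms=5.0)
    for level in ("off", "local", "remote_write", "on"):
        print(f"{level:13s} {commit_wait_ms(level, **timings):6.2f} ms")
```

With these made-up numbers, remote_write waits 5.75 ms against 10.5 ms for on, because the standby's disk flush drops out of the commit path.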
Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
From: Merlin Moncure mmonc...@gmail.com
> On Wed, May 9, 2012 at 8:06 AM, MauMau maumau...@gmail.com wrote:
>> Hello, I've heard from some people that synchronous streaming replication has a severe performance impact on the primary. They said that the transaction throughput of a TPC-C-like benchmark (perhaps DBT-2) decreased by 50%. I'm sorry I haven't asked them about their testing environment, because they just told me about their experience. They think that this result is much worse than some commercial databases.
>
> I can't speak for other databases, but it's only natural to assume that tps must drop. At minimum, you have to add the latency of communication and the remote sync operation to your transaction time. For very short transactions this adds up to a lot of extra work relative to the transaction itself.

Yes, I understand it is natural for the response time of each transaction to double or more. But I think the throughput drop would be amortized among multiple simultaneous transactions. So, a 50% throughput decrease seems unreasonable.

If this thinking is correct, and someone could kindly share his/her past performance evaluation results (ideally of DBT-2), I want to say to my acquaintance, "hey, community people experience better performance, so you may need to review your configuration."

Regards
MauMau
Re: [PERFORM] Could synchronous streaming replication really degrade the performance of the primary?
On Wed, May 9, 2012 at 7:34 PM, MauMau maumau...@gmail.com wrote:
>> I can't speak for other databases, but it's only natural to assume that tps must drop. At minimum, you have to add the latency of communication and the remote sync operation to your transaction time. For very short transactions this adds up to a lot of extra work relative to the transaction itself.
>
> Yes, I understand it is natural for the response time of each transaction to double or more. But I think the throughput drop would be amortized among multiple simultaneous transactions. So, a 50% throughput decrease seems unreasonable.

I'm pretty sure it depends a lot on the workload. Knowing the methodology that produced those figures is critical. Was the throughput decrease measured against no replication, or asynchronous replication? How many clients were used? What was the workload like? Was it CPU bound? I/O bound? Read-mostly?

We have asynchronous replication in production and throughput has not changed relative to no replication. I cannot see how making it synchronous would change throughput, as it only induces waiting time on the clients, but no extra work. I can only assume the test didn't use enough clients to saturate the hardware under high-latency situations, or clients were somehow experiencing application-specific contention.

I don't know the code, but knowing how synchronous replication works, I would say any such drop under high concurrency would be a bug (contention among waiting processes or something like that) that needs to be fixed.
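Editorial sketch of the "enough clients to saturate the hardware" point above. It uses a closed-loop model in the spirit of Little's law: each client issues its next transaction only after the previous one completes, so offered load is clients divided by latency, capped by what the hardware can sustain. The model assumes that synchronous replication adds waiting time but no extra work on the primary, which is the claim made in this message; the capacity and latency figures are invented.

```python
def throughput_tps(clients: int, latency_ms: float, capacity_tps: float) -> float:
    """Closed-loop throughput: clients / latency, capped by hardware capacity."""
    offered = clients * 1000.0 / latency_ms
    return min(offered, capacity_tps)


def relative_drop(clients: int, base_latency_ms: float,
                  sync_latency_ms: float, capacity_tps: float) -> float:
    """Fractional throughput loss when per-transaction latency rises."""
    before = throughput_tps(clients, base_latency_ms, capacity_tps)
    after = throughput_tps(clients, sync_latency_ms, capacity_tps)
    return 1.0 - after / before


if __name__ == "__main__":
    capacity = 2000.0       # hypothetical tps the primary can sustain
    base, sync = 5.0, 10.0  # hypothetical latency without / with sync rep
    for clients in (4, 40):
        drop = relative_drop(clients, base, sync, capacity)
        print(f"{clients:3d} clients: throughput drop {drop:.0%}")
```

In this model, doubling latency halves throughput with 4 clients, while with 40 clients both configurations hit the capacity limit and the drop disappears.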