Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
Hi, On Friday, June 08, 2012 01:42:22 AM Simon Riggs wrote: > On 7 June 2012 21:08, Andres Freund wrote: > >> Moved the wakeup to a logical place outside a critical section. > > > > Hm. I don't really like the way you implemented that. While it reduces > > the likelihood quite a bit it will still miss wakeups if an XLogInsert > > pushes out the data because of missing space or if any place does an > > XLogFlush(lsn). The knowledge is really only available in XLogWrite... > > Right, but the placement inside the critical section was objected to. And that objection was later somewhat eased by Tom. There currently still are several WalSndWakeup calls in critical sections and several other uses of latches in critial sections and several in signal handlers (which may be during a critical section). Thats why I added comments to SetLatch documenting that they need to be signal/concurrency safe. (Which they are atm) > This way, any caller of XLogFlush() will be swept up at least once per > wal_writer_delay, so missing a few calls doesn't mean we have spikes > in replication delay. Unfortunately it does. At least I measure it here ;) (obviously less than the prev. 7 seconds). Its not that surprising though: Especially during high or even more so during bursty wal activity a backend might decide to write out wal itself. In that case the background writer doesn't necessarily have anything to write out so it won't wake the walsender. Under high load the number of wakeups is twice or thrice as high *before* my patch than afterwards (with synchronous_commit=on obviously). As you most definitely know (group commit work et al) in a concurrent situation many of the wakeups from the current location (RecordTransactionCommit) are useless because the xlog was already flushed by another backend/background writer before we get to do it. > Doing it more frequently was also an objection from Fujii, to which we > must listen. I had hoped that I argued successfully against that, but obviously not ;) Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On 7 June 2012 21:08, Andres Freund wrote: >> Moved the wakeup to a logical place outside a critical section. > Hm. I don't really like the way you implemented that. While it reduces the > likelihood quite a bit it will still miss wakeups if an XLogInsert pushes out > the data because of missing space or if any place does an XLogFlush(lsn). > The knowledge is really only available in XLogWrite... Right, but the placement inside the critical section was objected to. This way, any caller of XLogFlush() will be swept up at least once per wal_writer_delay, so missing a few calls doesn't mean we have spikes in replication delay. Doing it more frequently was also an objection from Fujii, to which we must listen. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On Thursday, June 07, 2012 08:41:23 PM Simon Riggs wrote: > On 6 June 2012 20:11, Andres Freund wrote: > > On Tuesday, May 29, 2012 08:42:43 PM Andres Freund wrote: > >> Hi, > >> > >> On Monday, May 28, 2012 07:11:53 PM Tom Lane wrote: > >> > Andres Freund writes: > >> > > Does anybody have a better idea than to either call WalSndWakeup() > >> > > at essentially the wrong places or calling it inside a critical > >> > > section? > >> > > > >> > > Tom, what danger do you see from calling it in a critical section? > >> > > >> > My concern was basically that it might throw an error. Looking at the > >> > current implementation of SetLatch, it seems that's not possible, but > >> > I wonder whether we want to lock ourselves into that assumption. > >> > >> The assumption is already made at several other places I think. > >> XLogSetAsyncXactLSN does a SetLatch and is called from critical > >> sections; several signal handlers call it without any attention to the > >> context. > >> > >> Requiring it to be called outside would make its usage considerably less > >> convenient and I don't really see what could change that would require > >> to throw non-panic errors. > >> > >> > Still, if the alternatives are worse, maybe that's the best answer. > >> > If we do that, though, let's add comments to WalSndWakeup and SetLatch > >> > mentioning that they mustn't throw error. > >> > >> Patch attached. > > > > I would like to invite some more review (+commit...) here ;). Imo this is > > an annoying bug which should be fixed before next point release or > > beta/rc comes out... > > Moved the wakeup to a logical place outside a critical section. Hm. I don't really like the way you implemented that. While it reduces the likelihood quite a bit it will still miss wakeups if an XLogInsert pushes out the data because of missing space or if any place does an XLogFlush(lsn). The knowledge is really only available in XLogWrite... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On 6 June 2012 20:11, Andres Freund wrote: > On Tuesday, May 29, 2012 08:42:43 PM Andres Freund wrote: >> Hi, >> >> On Monday, May 28, 2012 07:11:53 PM Tom Lane wrote: >> > Andres Freund writes: >> > > Does anybody have a better idea than to either call WalSndWakeup() at >> > > essentially the wrong places or calling it inside a critical section? >> > > >> > > Tom, what danger do you see from calling it in a critical section? >> > >> > My concern was basically that it might throw an error. Looking at the >> > current implementation of SetLatch, it seems that's not possible, but >> > I wonder whether we want to lock ourselves into that assumption. >> >> The assumption is already made at several other places I think. >> XLogSetAsyncXactLSN does a SetLatch and is called from critical sections; >> several signal handlers call it without any attention to the context. >> >> Requiring it to be called outside would make its usage considerably less >> convenient and I don't really see what could change that would require to >> throw non-panic errors. >> >> > Still, if the alternatives are worse, maybe that's the best answer. >> > If we do that, though, let's add comments to WalSndWakeup and SetLatch >> > mentioning that they mustn't throw error. >> >> Patch attached. > I would like to invite some more review (+commit...) here ;). Imo this is an > annoying bug which should be fixed before next point release or beta/rc comes > out... Moved the wakeup to a logical place outside a critical section. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On Tuesday, May 29, 2012 08:42:43 PM Andres Freund wrote: > Hi, > > On Monday, May 28, 2012 07:11:53 PM Tom Lane wrote: > > Andres Freund writes: > > > Does anybody have a better idea than to either call WalSndWakeup() at > > > essentially the wrong places or calling it inside a critical section? > > > > > > Tom, what danger do you see from calling it in a critical section? > > > > My concern was basically that it might throw an error. Looking at the > > current implementation of SetLatch, it seems that's not possible, but > > I wonder whether we want to lock ourselves into that assumption. > > The assumption is already made at several other places I think. > XLogSetAsyncXactLSN does a SetLatch and is called from critical sections; > several signal handlers call it without any attention to the context. > > Requiring it to be called outside would make its usage considerably less > convenient and I don't really see what could change that would require to > throw non-panic errors. > > > Still, if the alternatives are worse, maybe that's the best answer. > > If we do that, though, let's add comments to WalSndWakeup and SetLatch > > mentioning that they mustn't throw error. > > Patch attached. I would like to invite some more review (+commit...) here ;). Imo this is an annoying bug which should be fixed before next point release or beta/rc comes out... Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On Thursday, May 31, 2012 01:33:33 AM Fujii Masao wrote: > On Wed, May 30, 2012 at 9:46 PM, Andres Freund wrote: > > On Tuesday, May 29, 2012 08:42:43 PM Andres Freund wrote: > >> Patch attached. > > > > Imo this patch should be backported to 9.1, 9.0 doesn't use latches and > > does not do explicit wakeup of the sender so its not applicable there. > > > > I can prepare a patch for 9.1 if people agree, there has been some amount > > of change that won't make it apply cleanly. > > The patch wakes up walsender more frequently than now. Which leads > walsender to send smaller WAL data packet more frequently, and furthermore > which leads walreceiver to issue fsync more frequently. So I'm afraid that > the patch makes the disk more busy and slows down the read-only query > in the standby. I'm also afraid that frequent fsync calls degrade the > performance > of sync replication. So it's better to do benchmark to address the > concerns. I couldn't measure any significant difference in #fsyncs or replication speed in a busy pgbench workload. If anything there were less, but the difference was small. Both with synchronous_commit=on/off. I did *not* test sync repl. Why would you expect a change? Walsender is only signalled if XLogWrite actually flushed data to disk. Thats the point of the exercise. Thats normally only done if the wal buffers are full or a commit record (+some other stuff) requires the xlog to be flushed up to some point. Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On Wed, May 30, 2012 at 9:46 PM, Andres Freund wrote: > On Tuesday, May 29, 2012 08:42:43 PM Andres Freund wrote: >> Patch attached. > Imo this patch should be backported to 9.1, 9.0 doesn't use latches and does > not do explicit wakeup of the sender so its not applicable there. > > I can prepare a patch for 9.1 if people agree, there has been some amount of > change that won't make it apply cleanly. The patch wakes up walsender more frequently than now. Which leads walsender to send smaller WAL data packet more frequently, and furthermore which leads walreceiver to issue fsync more frequently. So I'm afraid that the patch makes the disk more busy and slows down the read-only query in the standby. I'm also afraid that frequent fsync calls degrade the performance of sync replication. So it's better to do benchmark to address the concerns. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On Tuesday, May 29, 2012 08:42:43 PM Andres Freund wrote: > Patch attached. Imo this patch should be backported to 9.1, 9.0 doesn't use latches and does not do explicit wakeup of the sender so its not applicable there. I can prepare a patch for 9.1 if people agree, there has been some amount of change that won't make it apply cleanly. Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
Hi, On Monday, May 28, 2012 07:11:53 PM Tom Lane wrote: > Andres Freund writes: > > Does anybody have a better idea than to either call WalSndWakeup() at > > essentially the wrong places or calling it inside a critical section? > > > > Tom, what danger do you see from calling it in a critical section? > > My concern was basically that it might throw an error. Looking at the > current implementation of SetLatch, it seems that's not possible, but > I wonder whether we want to lock ourselves into that assumption. The assumption is already made at several other places I think. XLogSetAsyncXactLSN does a SetLatch and is called from critical sections; several signal handlers call it without any attention to the context. Requiring it to be called outside would make its usage considerably less convenient and I don't really see what could change that would require to throw non-panic errors. > Still, if the alternatives are worse, maybe that's the best answer. > If we do that, though, let's add comments to WalSndWakeup and SetLatch > mentioning that they mustn't throw error. Patch attached. Greetings, Andres PS: Sorry for dropping the CC list before... -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services From c4d6badac0ec07a3d4b188eab2078cffdcf57716 Mon Sep 17 00:00:00 2001 From: Andres Freund Date: Tue, 29 May 2012 20:00:16 +0200 Subject: [PATCH] Fix walsender wakeup handling The previous coding could miss xlog writeouts at several places. E.g. when wal was written out by the background writer or even after a commit if synchronous_commit=off. This could lead to delays in sending data to the standby of up to 7 seconds. To fix this move the responsibility of notification to the layer where the neccessary information is actually present. We take some care not to do the notification while we hold conteded locks like WALInsertLock or WalWriteLock locks. Document the preexisting fact that we rely on SetLatch to be safe from within signal handlers and critical sections. --- src/backend/access/transam/twophase.c | 21 - src/backend/access/transam/xact.c |7 -- src/backend/access/transam/xlog.c | 18 ++ src/backend/port/unix_latch.c |3 +++ src/backend/port/win32_latch.c|4 src/backend/replication/walsender.c | 42 - src/include/replication/walsender.h |2 ++ 7 files changed, 68 insertions(+), 29 deletions(-) diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c index 6db46c0..edbbdf1 100644 --- a/src/backend/access/transam/twophase.c +++ b/src/backend/access/transam/twophase.c @@ -1041,13 +1041,6 @@ EndPrepare(GlobalTransaction gxact) /* If we crash now, we have prepared: WAL replay will fix things */ - /* - * Wake up all walsenders to send WAL up to the PREPARE record immediately - * if replication is enabled - */ - if (max_wal_senders > 0) - WalSndWakeup(); - /* write correct CRC and close file */ if ((write(fd, &statefile_crc, sizeof(pg_crc32))) != sizeof(pg_crc32)) { @@ -2048,13 +2041,6 @@ RecordTransactionCommitPrepared(TransactionId xid, /* Flush XLOG to disk */ XLogFlush(recptr); - /* - * Wake up all walsenders to send WAL up to the COMMIT PREPARED record - * immediately if replication is enabled - */ - if (max_wal_senders > 0) - WalSndWakeup(); - /* Mark the transaction committed in pg_clog */ TransactionIdCommitTree(xid, nchildren, children); @@ -2136,13 +2122,6 @@ RecordTransactionAbortPrepared(TransactionId xid, XLogFlush(recptr); /* - * Wake up all walsenders to send WAL up to the ABORT PREPARED record - * immediately if replication is enabled - */ - if (max_wal_senders > 0) - WalSndWakeup(); - - /* * Mark the transaction aborted in clog. This is not absolutely necessary * but we may as well do it while we are here. */ diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c index c71a10e..d697ab8 100644 --- a/src/backend/access/transam/xact.c +++ b/src/backend/access/transam/xact.c @@ -1137,13 +1137,6 @@ RecordTransactionCommit(void) XLogFlush(XactLastRecEnd); /* - * Wake up all walsenders to send WAL up to the COMMIT record - * immediately if replication is enabled - */ - if (max_wal_senders > 0) - WalSndWakeup(); - - /* * Now we may update the CLOG, if we wrote a COMMIT record above */ if (markXidCommitted) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index d3650bd..ef64f33 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -1014,6 +1014,8 @@ begin:; END_CRIT_SECTION(); + /* wakeup the WalSnd now that we released the WALWriteLock */ + WalSndWakeupProcess(); return RecPtr; } @@ -1215,6 +1217,9 @@ begin:; END_CRIT_SECTION(); + /* wakeup the WalSnd now that we outside conten
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
Andres Freund writes: > Does anybody have a better idea than to either call WalSndWakeup() at > essentially the wrong places or calling it inside a critical section? > Tom, what danger do you see from calling it in a critical section? My concern was basically that it might throw an error. Looking at the current implementation of SetLatch, it seems that's not possible, but I wonder whether we want to lock ourselves into that assumption. Still, if the alternatives are worse, maybe that's the best answer. If we do that, though, let's add comments to WalSndWakeup and SetLatch mentioning that they mustn't throw error. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On Tuesday, May 15, 2012 05:30:27 PM Andres Freund wrote: > On Monday, May 14, 2012 07:55:32 PM Fujii Masao wrote: > > On Mon, May 14, 2012 at 6:32 PM, Andres Freund > > wrote: > > > On Friday, May 11, 2012 08:45:23 PM Tom Lane wrote: > > >> Andres Freund writes: > > >> > Its the only place though which knows whether its actually sensible > > >> > to wakeup the walsender. We could make it return whether it wrote > > >> > anything and do the wakeup at the callers. I count 4 different > > >> > callsites which would be an annoying duplication but I don't really > > >> > see anything better right now. > > >> > > >> Another point here is that XLogWrite is not only normally called with > > >> the lock held, but inside a critical section. I see no reason to take > > >> the risk of doing signal sending inside critical sections. > > >> > > >> BTW, a depressingly large fraction of the existing calls to > > >> WalSndWakeup are also inside critical sections, generally for no good > > >> reason that I can see. For example, in EndPrepare(), why was the > > >> call placed where it is and not down beside SyncRepWaitForLSN? > > > > > > Hm. While I see no real problem moving it out of the lock I don't > > > really see a way to cleanly outside critical sections everywhere. The > > > impact of doing so seems to be rather big to me. The only externally > > > visible place where it actually is known whether we write out data and > > > thus do the wakeup is XLogInsert, XLogFlush and XLogBackgroundFlush. > > > The first two of those are routinely called inside a critical section. > > > > So what about moving the existing calls of WalSndWakeup() out of a > > critical section and adding new call of WalSndWakeup() into > > XLogBackgroundFlush()? Then all WalSndWakeup()s are called outside a > > critical section and after releasing WALWriteLock. I attached the patch. > > Imo its simply not sensible to call WalSndWakeup at *any* of the current > locations. They simply don't have the necessary information. They will > wakeup too often (because with concurrency commits often won't require > additional wal writes) and too seldom (because a flush caused by > XLogInsert wont cause a wakeup). Does anybody have a better idea than to either call WalSndWakeup() at essentially the wrong places or calling it inside a critical section? Tom, what danger do you see from calling it in a critical section? Greetings, Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On Monday, May 14, 2012 07:55:32 PM Fujii Masao wrote: > On Mon, May 14, 2012 at 6:32 PM, Andres Freund wrote: > > On Friday, May 11, 2012 08:45:23 PM Tom Lane wrote: > >> Andres Freund writes: > >> > Its the only place though which knows whether its actually sensible to > >> > wakeup the walsender. We could make it return whether it wrote > >> > anything and do the wakeup at the callers. I count 4 different > >> > callsites which would be an annoying duplication but I don't really > >> > see anything better right now. > >> > >> Another point here is that XLogWrite is not only normally called with > >> the lock held, but inside a critical section. I see no reason to take > >> the risk of doing signal sending inside critical sections. > >> > >> BTW, a depressingly large fraction of the existing calls to WalSndWakeup > >> are also inside critical sections, generally for no good reason that I > >> can see. For example, in EndPrepare(), why was the call placed where > >> it is and not down beside SyncRepWaitForLSN? > > > > Hm. While I see no real problem moving it out of the lock I don't really > > see a way to cleanly outside critical sections everywhere. The impact of > > doing so seems to be rather big to me. The only externally visible place > > where it actually is known whether we write out data and thus do the > > wakeup is XLogInsert, XLogFlush and XLogBackgroundFlush. The first two > > of those are routinely called inside a critical section. > > So what about moving the existing calls of WalSndWakeup() out of a critical > section and adding new call of WalSndWakeup() into XLogBackgroundFlush()? > Then all WalSndWakeup()s are called outside a critical section and after > releasing WALWriteLock. I attached the patch. Imo its simply not sensible to call WalSndWakeup at *any* of the current locations. They simply don't have the necessary information. They will wakeup too often (because with concurrency commits often won't require additional wal writes) and too seldom (because a flush caused by XLogInsert wont cause a wakeup). Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On Mon, May 14, 2012 at 6:32 PM, Andres Freund wrote: > On Friday, May 11, 2012 08:45:23 PM Tom Lane wrote: >> Andres Freund writes: >> > Its the only place though which knows whether its actually sensible to >> > wakeup the walsender. We could make it return whether it wrote anything >> > and do the wakeup at the callers. I count 4 different callsites which >> > would be an annoying duplication but I don't really see anything better >> > right now. >> >> Another point here is that XLogWrite is not only normally called with >> the lock held, but inside a critical section. I see no reason to take >> the risk of doing signal sending inside critical sections. > >> BTW, a depressingly large fraction of the existing calls to WalSndWakeup >> are also inside critical sections, generally for no good reason that I >> can see. For example, in EndPrepare(), why was the call placed where >> it is and not down beside SyncRepWaitForLSN? > Hm. While I see no real problem moving it out of the lock I don't really see a > way to cleanly outside critical sections everywhere. The impact of doing so > seems to be rather big to me. The only externally visible place where it > actually is known whether we write out data and thus do the wakeup is > XLogInsert, XLogFlush and XLogBackgroundFlush. The first two of those are > routinely called inside a critical section. So what about moving the existing calls of WalSndWakeup() out of a critical section and adding new call of WalSndWakeup() into XLogBackgroundFlush()? Then all WalSndWakeup()s are called outside a critical section and after releasing WALWriteLock. I attached the patch. Regards, -- Fujii Masao move_walsndwakeup_v1.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On Friday, May 11, 2012 08:45:23 PM Tom Lane wrote: > Andres Freund writes: > > Its the only place though which knows whether its actually sensible to > > wakeup the walsender. We could make it return whether it wrote anything > > and do the wakeup at the callers. I count 4 different callsites which > > would be an annoying duplication but I don't really see anything better > > right now. > > Another point here is that XLogWrite is not only normally called with > the lock held, but inside a critical section. I see no reason to take > the risk of doing signal sending inside critical sections. > BTW, a depressingly large fraction of the existing calls to WalSndWakeup > are also inside critical sections, generally for no good reason that I > can see. For example, in EndPrepare(), why was the call placed where > it is and not down beside SyncRepWaitForLSN? Hm. While I see no real problem moving it out of the lock I don't really see a way to cleanly outside critical sections everywhere. The impact of doing so seems to be rather big to me. The only externally visible place where it actually is known whether we write out data and thus do the wakeup is XLogInsert, XLogFlush and XLogBackgroundFlush. The first two of those are routinely called inside a critical section. Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On 11 May 2012 19:45, Tom Lane wrote: > Andres Freund writes: >> Its the only place though which knows whether its actually sensible to wakeup >> the walsender. We could make it return whether it wrote anything and do the >> wakeup at the callers. I count 4 different callsites which would be an >> annoying duplication but I don't really see anything better right now. > > Another point here is that XLogWrite is not only normally called with > the lock held, but inside a critical section. I see no reason to take > the risk of doing signal sending inside critical sections. > > BTW, a depressingly large fraction of the existing calls to WalSndWakeup > are also inside critical sections, generally for no good reason that I > can see. For example, in EndPrepare(), why was the call placed where > it is and not down beside SyncRepWaitForLSN? I think because nobody thought of that. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
Andres Freund writes: > Its the only place though which knows whether its actually sensible to wakeup > the walsender. We could make it return whether it wrote anything and do the > wakeup at the callers. I count 4 different callsites which would be an > annoying duplication but I don't really see anything better right now. Another point here is that XLogWrite is not only normally called with the lock held, but inside a critical section. I see no reason to take the risk of doing signal sending inside critical sections. BTW, a depressingly large fraction of the existing calls to WalSndWakeup are also inside critical sections, generally for no good reason that I can see. For example, in EndPrepare(), why was the call placed where it is and not down beside SyncRepWaitForLSN? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On Friday, May 11, 2012 08:36:24 PM Tom Lane wrote: > Robert Haas writes: > > That definitely doesn't seem ideal - a lot of things can pile up > > behind WALWriteLock. I'm not sure how big a problem it would be in > > practice, but we generally make a practice of avoiding sending signals > > while holding LWLocks whenever possible... > > There's a good reason for that, which is that the scheduler might well > decide to go run the wakened process instead of you. Admittedly this > tends to not be a problem on machines with $bignum CPUs, but on > single-CPU machines I've seen it happen a lot. > > Refactoring so that the signal is sent only after lock release seems > like a good idea to me. Will send a patch lateron, duplication seems to be manageable. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
Robert Haas writes: > That definitely doesn't seem ideal - a lot of things can pile up > behind WALWriteLock. I'm not sure how big a problem it would be in > practice, but we generally make a practice of avoiding sending signals > while holding LWLocks whenever possible... There's a good reason for that, which is that the scheduler might well decide to go run the wakened process instead of you. Admittedly this tends to not be a problem on machines with $bignum CPUs, but on single-CPU machines I've seen it happen a lot. Refactoring so that the signal is sent only after lock release seems like a good idea to me. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On Friday, May 11, 2012 07:20:26 PM Robert Haas wrote: > On Fri, May 11, 2012 at 1:09 PM, Fujii Masao wrote: > > Calling WalSndWakeup() while WALWriteLock is being held might cause > > another performance degradation. No? > > That definitely doesn't seem ideal - a lot of things can pile up > behind WALWriteLock. I'm not sure how big a problem it would be in > practice, but we generally make a practice of avoiding sending signals > while holding LWLocks whenever possible... In my measurements on moderately powerful hardware I couldn't see any degradation on the primary - in fact the contrary, but the improvements were around 0.4% and I only tested 10min so its not exactly hard evidence. But I aggree its not ideal. Its the only place though which knows whether its actually sensible to wakeup the walsender. We could make it return whether it wrote anything and do the wakeup at the callers. I count 4 different callsites which would be an annoying duplication but I don't really see anything better right now. Better Ideas? Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On Fri, May 11, 2012 at 1:09 PM, Fujii Masao wrote: > Calling WalSndWakeup() while WALWriteLock is being held might cause another > performance degradation. No? That definitely doesn't seem ideal - a lot of things can pile up behind WALWriteLock. I'm not sure how big a problem it would be in practice, but we generally make a practice of avoiding sending signals while holding LWLocks whenever possible... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On Fri, May 11, 2012 at 4:51 AM, Andres Freund wrote: > diff --git a/src/backend/access/transam/xlog.c > b/src/backend/access/transam/xlog.c > index ecb71b6..7a3224b 100644 > --- a/src/backend/access/transam/xlog.c > +++ b/src/backend/access/transam/xlog.c > @@ -1906,6 +1906,10 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool > xlog_switch) > xlogctl->LogwrtRqst.Flush = LogwrtResult.Flush; > SpinLockRelease(&xlogctl->info_lck); > } > + > + /* the walsender wasn't woken up in xact.c */ > + if(max_wal_senders > 1 && synchronous_commit == SYNCHRONOUS_COMMIT_OFF) > + WalSndWakeup(); > } Calling WalSndWakeup() while WALWriteLock is being held might cause another performance degradation. No? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WalSndWakeup() and synchronous_commit=off
On 10 May 2012 20:51, Andres Freund wrote: > I noticed that when synchronous_commit=off were not waking up the wal sender > latch in xact.c:RecordTransactionCommit which leads to ugly delays of approx 7 > seconds (1 + replication_timeout/10) with default settings. > Given that were flushing the wal to disk much sooner this appears to be a bad > idea - especially as this may happen even under load if we ever reach the > 'coughtup' state. Sounds like a problem. I'll have a look. > I wonder why the WalSndWakeup isn't done like: > > diff --git a/src/backend/access/transam/xlog.c > b/src/backend/access/transam/xlog.c > index ecb71b6..7a3224b 100644 > --- a/src/backend/access/transam/xlog.c > +++ b/src/backend/access/transam/xlog.c > @@ -1906,6 +1906,10 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool > xlog_switch) > xlogctl->LogwrtRqst.Flush = LogwrtResult.Flush; > SpinLockRelease(&xlogctl->info_lck); > } > + > + /* the walsender wasn't woken up in xact.c */ > + if(max_wal_senders > 1 && synchronous_commit == SYNCHRONOUS_COMMIT_OFF) > + WalSndWakeup(); > } > > Doing that for the synchronous_commit=off case can imo be considered a bugfix, > but I wonder why we ever wake the senders somewhere else? > The only argument I can see for doing it at places like StartTransactionCommit > is that thats the place after which the data will be visible on the client. I > think thats a non-argument though because if wal is flushed to disk outside of > a commit there normally is enough data to make it worthwile. > > Doing the above results in a very noticeable reduction in lagginess and even a > noticeable reduction in cpu-usage spikes on a busy replication test setup. > > Greetings, > > Andres > > -- > Andres Freund http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] WalSndWakeup() and synchronous_commit=off
Hi all, I noticed that when synchronous_commit=off were not waking up the wal sender latch in xact.c:RecordTransactionCommit which leads to ugly delays of approx 7 seconds (1 + replication_timeout/10) with default settings. Given that were flushing the wal to disk much sooner this appears to be a bad idea - especially as this may happen even under load if we ever reach the 'coughtup' state. I wonder why the WalSndWakeup isn't done like: diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index ecb71b6..7a3224b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -1906,6 +1906,10 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool xlog_switch) xlogctl->LogwrtRqst.Flush = LogwrtResult.Flush; SpinLockRelease(&xlogctl->info_lck); } + + /* the walsender wasn't woken up in xact.c */ + if(max_wal_senders > 1 && synchronous_commit == SYNCHRONOUS_COMMIT_OFF) + WalSndWakeup(); } Doing that for the synchronous_commit=off case can imo be considered a bugfix, but I wonder why we ever wake the senders somewhere else? The only argument I can see for doing it at places like StartTransactionCommit is that thats the place after which the data will be visible on the client. I think thats a non-argument though because if wal is flushed to disk outside of a commit there normally is enough data to make it worthwile. Doing the above results in a very noticeable reduction in lagginess and even a noticeable reduction in cpu-usage spikes on a busy replication test setup. Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers