Re: [HACKERS] Race conditions with WAL sender PID lookups
On Tue, Jul 4, 2017 at 1:34 PM, Noah Misch wrote: > Bundling code cleanup into commits that also do something else is strictly > worse than bundling whitespace cleanup, which is itself bad: > https://postgr.es/m/flat/20160113144826.gb3379...@tornado.leadboat.com FWIW, I agree with that. I favor as well separate commits for cleanups and for fixes, so as each commit has its own goal and protects it. (The cleanups discussed on this thread have been partially done in commit 572d6ee where a bug has been fixed, not by me on the patches I submitted ;) ) -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Mon, Jul 03, 2017 at 12:14:55PM -0400, Alvaro Herrera wrote: > Michael Paquier wrote: > > > In passing, clean up some leftover braces which were used to create > > > unconditional blocks. Once upon a time these were used for > > > volatile-izing accesses to those shmem structs, which is no longer > > > required. Many other occurrences of this pattern remain. > > > > Here are the places where a cleanup can happen: > > - WalSndSetState > > - ProcessStandbyReplyMessage > > - XLogRead, 2 places > > - XLogSendLogical > > - WalRcvWaitForStartPosition > > - WalRcvDie > > - XLogWalRcvFlush > > - ProcessWalSndrMessage > > In most of the places of the WAL sender, braces could be removed to > > improve the style. For the WAL receiver, declarations are not > > necessary. As a matter of style, why not cleaning up just the WAL > > sender stuff? Changing the WAL receiver code just to remove some > > declarations would not improve readability, and would make back-patch > > more difficult. > > I think we should clean this up whenever we're modifying the surrounding > code, but otherwise we can leave well enough alone. It's not a high > priority item at any rate. Bundling code cleanup into commits that also do something else is strictly worse than bundling whitespace cleanup, which is itself bad: https://postgr.es/m/flat/20160113144826.gb3379...@tornado.leadboat.com Deferring cleanup and pushing cleanup-only commits are each good options. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
Michael Paquier wrote: > Thanks Álvaro for pushing the patch. I had a second look and the > result looks good to me. > > - SpinLockAcquire(&walsnd->mutex); > + } > + pid = walsnd->pid; > The WAL receiver code used a cast to (int) in > pg_stat_get_wal_receiver(). I don't recall adding it. I added it because it made me a bit nervous to pass a pid_t to DatumGetInt32. This one is assigning to a variable of type pid_t, so it doesn't need a cast. I'm not too clear on using pid_t variables as int32 Datum-packed variables. We don't do it a lot in the backend code (I found some occurrences in contrib, but these don't inspire me a lot of confidence.) > Why not being consistent for both by removing the one of the WAL > receiver code? I can't think of any reason to remove that cast. It serves as documentation if nothing else -- it alerts the reader that something is going on. > > In passing, clean up some leftover braces which were used to create > > unconditional blocks. Once upon a time these were used for > > volatile-izing accesses to those shmem structs, which is no longer > > required. Many other occurrences of this pattern remain. > > Here are the places where a cleanup can happen: > - WalSndSetState > - ProcessStandbyReplyMessage > - XLogRead, 2 places > - XLogSendLogical > - WalRcvWaitForStartPosition > - WalRcvDie > - XLogWalRcvFlush > - ProcessWalSndrMessage > In most of the places of the WAL sender, braces could be removed to > improve the style. For the WAL receiver, declarations are not > necessary. As a matter of style, why not cleaning up just the WAL > sender stuff? Changing the WAL receiver code just to remove some > declarations would not improve readability, and would make back-patch > more difficult. I think we should clean this up whenever we're modifying the surrounding code, but otherwise we can leave well enough alone. It's not a high priority item at any rate. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Sat, Jul 1, 2017 at 7:20 AM, Alvaro Herrera wrote: > Michael Paquier wrote: >> Considering how crazy the conditions to make the information fetched >> by users inconsistent are met, I agree with that. > > Pushed. Thanks Álvaro for pushing the patch. I had a second look and the result looks good to me. - SpinLockAcquire(&walsnd->mutex); + } + pid = walsnd->pid; The WAL receiver code used a cast to (int) in pg_stat_get_wal_receiver(). I don't recall adding it. Why not being consistent for both by removing the one of the WAL receiver code? > In passing, clean up some leftover braces which were used to create > unconditional blocks. Once upon a time these were used for > volatile-izing accesses to those shmem structs, which is no longer > required. Many other occurrences of this pattern remain. Here are the places where a cleanup can happen: - WalSndSetState - ProcessStandbyReplyMessage - XLogRead, 2 places - XLogSendLogical - WalRcvWaitForStartPosition - WalRcvDie - XLogWalRcvFlush - ProcessWalSndrMessage In most of the places of the WAL sender, braces could be removed to improve the style. For the WAL receiver, declarations are not necessary. As a matter of style, why not cleaning up just the WAL sender stuff? Changing the WAL receiver code just to remove some declarations would not improve readability, and would make back-patch more difficult. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
Michael Paquier wrote: > On Thu, Jun 29, 2017 at 7:52 AM, Alvaro Herrera > wrote: > > I think I'm done with the walsender half of this patch; I still need to > > review the walreceiver part. I will report back tomorrow 19:00 CLT. > > Thanks! > > > Currently, I'm considering not to backpatch any of this. > > Considering how crazy the conditions to make the information fetched > by users inconsistent are met, I agree with that. Pushed. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
Noah Misch wrote: > On Wed, Jun 28, 2017 at 06:52:14PM -0400, Alvaro Herrera wrote: > > I think I'm done with the walsender half of this patch; I still need to > > review the walreceiver part. I will report back tomorrow 19:00 CLT. > > This PostgreSQL 10 open item is past due for your status update. Kindly send > a status update within 24 hours, and include a date for your subsequent status > update. Refer to the policy on open item ownership: > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com Real life took over. I'll report before tomorrow same time. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Wed, Jun 28, 2017 at 06:52:14PM -0400, Alvaro Herrera wrote: > I think I'm done with the walsender half of this patch; I still need to > review the walreceiver part. I will report back tomorrow 19:00 CLT. This PostgreSQL 10 open item is past due for your status update. Kindly send a status update within 24 hours, and include a date for your subsequent status update. Refer to the policy on open item ownership: https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Thu, Jun 29, 2017 at 7:52 AM, Alvaro Herrera wrote: > Michael Paquier wrote: >> On Thu, Jun 8, 2017 at 3:31 AM, Robert Haas wrote: >> I take that you are referring to the two lookups in >> WalSndWaitForWal(), one in exec_replication_command(), one in >> WalSndLoop() and one in WalSndDone(). First let's remember that all >> the fields protected by the spinlock are only updated in a WAL sender >> process, so reading them directly is safe in this context: we know >> that no other processes will write them. And all those functions are >> called only in the context of a WAL sender process. So the copies of >> MyWalSnd that you are referring to here don't need a spinlock. Is >> there something I am missing? > > I assume you mean "reading them unlocked" when you write "reading them > directly". Both expressions sound the same to me. So yes I meant that. > If so, I agree with that rationale; I verified that it > applies to members state, sentPtr, write, sent, flush, writeLag, > sentLag, flushLag. I wonder if there's a maintainability danger: as > soon as we add a write to those members in a process other than the > walsender itself, we have a bug. I don't yet have an opinion on the > severity of that problem. > > Other struct members such as needreload are written by other processes, > so they should be always accessed with mutex held. Regarding pid, it > seems easiest if we use the rule that it must also be always accessed > with spinlock held. I propose updating the comment on WalSnd to this: > > /* > * Each walsender has a WalSnd struct in shared memory. > * > * This struct is protected by 'mutex', with two exceptions; one is > * sync_standby_priority as noted below. The other exception is that some > * members are only written by the walsender process itself, and thus that > * process is free to read those members without holding spinlock. pid and > * needreload always require the spinlock to be held for all accesses. > */ Agreed, a comment seems appropriate to me. That's in line with what Horiguchi-san has mentioned upthread. > I think I'm done with the walsender half of this patch; I still need to > review the walreceiver part. I will report back tomorrow 19:00 CLT. Thanks! > Currently, I'm considering not to backpatch any of this. Considering how crazy the conditions to make the information fetched by users inconsistent are met, I agree with that. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
Michael Paquier wrote: > On Thu, Jun 8, 2017 at 3:31 AM, Robert Haas wrote: > > I think if you're going to fix it so that we take spinlocks on > > MyWalSnd in a bunch of places that we didn't take them before, it > > would make sense to fix all the places where we're accessing those > > fields without a spinlock instead of only some of them, unless there > > are good reasons why we only need it in some cases and not others, in > > which case I think the patch should add comments to each place where > > the lock was not taken explaining why it's OK. It didn't take me and > > my copy of vi very long to find instances that you didn't change. > > I take that you are referring to the two lookups in > WalSndWaitForWal(), one in exec_replication_command(), one in > WalSndLoop() and one in WalSndDone(). First let's remember that all > the fields protected by the spinlock are only updated in a WAL sender > process, so reading them directly is safe in this context: we know > that no other processes will write them. And all those functions are > called only in the context of a WAL sender process. So the copies of > MyWalSnd that you are referring to here don't need a spinlock. Is > there something I am missing? I assume you mean "reading them unlocked" when you write "reading them directly". If so, I agree with that rationale; I verified that it applies to members state, sentPtr, write, sent, flush, writeLag, sentLag, flushLag. I wonder if there's a maintainability danger: as soon as we add a write to those members in a process other than the walsender itself, we have a bug. I don't yet have an opinion on the severity of that problem. Other struct members such as needreload are written by other processes, so they should be always accessed with mutex held. Regarding pid, it seems easiest if we use the rule that it must also be always accessed with spinlock held. I propose updating the comment on WalSnd to this: /* * Each walsender has a WalSnd struct in shared memory. * * This struct is protected by 'mutex', with two exceptions; one is * sync_standby_priority as noted below. The other exception is that some * members are only written by the walsender process itself, and thus that * process is free to read those members without holding spinlock. pid and * needreload always require the spinlock to be held for all accesses. */ > Actually, perhaps it would make sense to add some Assert(am_walsender) > in this code? I don't think that's needed, since MyWalSnd will only be valid in a walsender. I think I'm done with the walsender half of this patch; I still need to review the walreceiver part. I will report back tomorrow 19:00 CLT. Currently, I'm considering not to backpatch any of this. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Sun, Jun 25, 2017 at 2:11 PM, Alvaro Herrera wrote: > Noah Misch wrote: > >> IMMEDIATE ATTENTION REQUIRED. This PostgreSQL 10 open item is long past due >> for your status update. Please reacquaint yourself with the policy on open >> item ownership[1] and then reply immediately. If I do not hear from you by >> 2017-06-25 06:00 UTC, I will transfer this item to release management team >> ownership without further notice. > > I volunteer to own this item. Next update before Wednesday 28th 19:00 CLT. Thanks Álvaro. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
Noah Misch wrote: > IMMEDIATE ATTENTION REQUIRED. This PostgreSQL 10 open item is long past due > for your status update. Please reacquaint yourself with the policy on open > item ownership[1] and then reply immediately. If I do not hear from you by > 2017-06-25 06:00 UTC, I will transfer this item to release management team > ownership without further notice. I volunteer to own this item. Next update before Wednesday 28th 19:00 CLT. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Wed, Jun 21, 2017 at 10:52:50PM -0700, Noah Misch wrote: > On Thu, Jun 15, 2017 at 09:04:44AM +0100, Simon Riggs wrote: > > On 15 June 2017 at 02:59, Noah Misch wrote: > > > > > Formally, the causative commit is the one that removed the superuser() > > > test, > > > namely 25fff40. > > > > > > [Action required within three days. This is a generic notification.] > > > > > > The above-described topic is currently a PostgreSQL 10 open item. Simon, > > > since you committed the patch believed to have created it, you own this > > > open > > > item. If some other commit is more relevant or if this does not belong > > > as a > > > v10 open item, please let us know. Otherwise, please observe the policy > > > on > > > open item ownership[1] and send a status update within three calendar > > > days of > > > this message. Include a date for your subsequent status update. Testers > > > may > > > discover new open items at any time, and I want to plan to get them all > > > fixed > > > well in advance of shipping v10. Consequently, I will appreciate your > > > efforts > > > toward speedy resolution. Thanks. > > > > > > [1] > > > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com > > > > Thanks for letting me know. I'm on leave at present, so won't be able > > to get to this until June 20, though will make it a priority for then. > > This PostgreSQL 10 open item is past due for your status update. Kindly send > a status update within 24 hours, and include a date for your subsequent status > update. Refer to the policy on open item ownership: > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com IMMEDIATE ATTENTION REQUIRED. This PostgreSQL 10 open item is long past due for your status update. Please reacquaint yourself with the policy on open item ownership[1] and then reply immediately. If I do not hear from you by 2017-06-25 06:00 UTC, I will transfer this item to release management team ownership without further notice. [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Thu, Jun 15, 2017 at 09:04:44AM +0100, Simon Riggs wrote: > On 15 June 2017 at 02:59, Noah Misch wrote: > > > Formally, the causative commit is the one that removed the superuser() test, > > namely 25fff40. > > > > [Action required within three days. This is a generic notification.] > > > > The above-described topic is currently a PostgreSQL 10 open item. Simon, > > since you committed the patch believed to have created it, you own this open > > item. If some other commit is more relevant or if this does not belong as a > > v10 open item, please let us know. Otherwise, please observe the policy on > > open item ownership[1] and send a status update within three calendar days > > of > > this message. Include a date for your subsequent status update. Testers > > may > > discover new open items at any time, and I want to plan to get them all > > fixed > > well in advance of shipping v10. Consequently, I will appreciate your > > efforts > > toward speedy resolution. Thanks. > > > > [1] > > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com > > Thanks for letting me know. I'm on leave at present, so won't be able > to get to this until June 20, though will make it a priority for then. This PostgreSQL 10 open item is past due for your status update. Kindly send a status update within 24 hours, and include a date for your subsequent status update. Refer to the policy on open item ownership: https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On 15 June 2017 at 02:59, Noah Misch wrote: > Formally, the causative commit is the one that removed the superuser() test, > namely 25fff40. > > [Action required within three days. This is a generic notification.] > > The above-described topic is currently a PostgreSQL 10 open item. Simon, > since you committed the patch believed to have created it, you own this open > item. If some other commit is more relevant or if this does not belong as a > v10 open item, please let us know. Otherwise, please observe the policy on > open item ownership[1] and send a status update within three calendar days of > this message. Include a date for your subsequent status update. Testers may > discover new open items at any time, and I want to plan to get them all fixed > well in advance of shipping v10. Consequently, I will appreciate your efforts > toward speedy resolution. Thanks. > > [1] > https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com Thanks for letting me know. I'm on leave at present, so won't be able to get to this until June 20, though will make it a priority for then. -- Simon Riggshttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
Hi, At Thu, 8 Jun 2017 13:15:02 +0900, Michael Paquier wrote in > On Thu, Jun 8, 2017 at 3:31 AM, Robert Haas wrote: > > I think if you're going to fix it so that we take spinlocks on > > MyWalSnd in a bunch of places that we didn't take them before, it > > would make sense to fix all the places where we're accessing those > > fields without a spinlock instead of only some of them, unless there > > are good reasons why we only need it in some cases and not others, in > > which case I think the patch should add comments to each place where > > the lock was not taken explaining why it's OK. It didn't take me and > > my copy of vi very long to find instances that you didn't change. > > I take that you are referring to the two lookups in > WalSndWaitForWal(), one in exec_replication_command(), one in > WalSndLoop() and one in WalSndDone(). First let's remember that all > the fields protected by the spinlock are only updated in a WAL sender > process, so reading them directly is safe in this context: we know > that no other processes will write them. And all those functions are > called only in the context of a WAL sender process. So the copies of > MyWalSnd that you are referring to here don't need a spinlock. Is > there something I am missing? > > Actually, perhaps it would make sense to add some Assert(am_walsender) > in this code? It is not obvious that reading MyWalSnd without spinlock is safe without knowing that it is modified only by itself. Based on that I see two ways to go. 1. Put spinlock there (WalSndWaitForWal and..) even though it is not required. It ensures safety even if it is modified by other processes for possible additional conflict with other readers. (Unless someone careless forgets lock for new code). 2. Put a comment there instead (maybe assertion is not verbose enough), containing something like the mention above. By the way, this this discussion make me think of SyncRepInitConfig. Taking SyncRepLock seems unnecessary there to me. > > I also think that should provide some analysis regarding the question > > Thomas asked - are these actually bugs, or are they OK for some > > reason? We shouldn't just plaster the code with spinlock acquisitions > > unless it's really necessary. It seems at least possible that these > > changes could cause performance regressions, and it doesn't look like > > you've done any testing or provided any analysis indicating whether > > that's likely to happen or not. > > Now taking a step back on the patch, v3 that I sent 3 weeks ago. > Additional spinlocks are taken in: > 1) SyncRepReleaseWaiters(), which is called in a WAL sender. So indeed > no spinlocks are needed here. The last patch is incorrect here. Agreed that it is not necessary, but we might should put one depending on the conclusion of the discussion. > 2) SyncRepGetSyncStandbysPriority() and SyncRepGetSyncStandbysQuorum() > as of HEAD via SyncRepGetSyncStandbys(), something that's not good if > done lock-less as this gets used for pg_stat_replication for > pg_stat_get_wal_senders(). This can return a garbage list of sync > standbys to the client as WAL senders update their flush, write and > apply positions in parallel to that, and don't care about SyncRepLock > as this gets calculated with the standby feedback messages. The > problem here is the gap between walsnd->sync_standby_priority and > walsnd->flush: > -- With a primary, and more or more sync standbys, take a checkpoint > in SyncRepGetSyncStandbysPriority() after the first check is passed. > -- Shutdown the standby and restart it. > -- The standby reconnects, initializes the WAL sender slot with > InitWalSenderSlot(). > -- Let the process of SyncRepGetSyncStandbysPriority() continue, the > sync standby does not show up. > That would be really unlikely to happen, but the code is written in > such a way that it could happen. One could also say that this gives a > closer idea that the standby disconnected but it looks wrong to me to > do this value lookup without a required lock. It doesn't seem to be a problem for pg_stat_get_wal_senders(). For SyncRepReleaseWaiters, also it doesn't seem to be a problem because next reply message runs the process again. > 3) In walreceiver.c's pg_stat_get_wal_receiver's: > - Launch pg_stat_get_wal_receiver and take a checkpoint on it. > - Pass the lookups of pid and ready_to_display. > - Make the WAL receiver stop. > - The view reports inconsistent data. If the WAL receiver data was > just initialized, the caller would get back PID values or similar 0. > Particularly a WAL receiver marked with !ready_to_display could show > data to the caller. That's not cool. Totally agreed. > 4) KeepFileRestoredFromArchive() needs a lock, as this is called from > the startup process. That's harmless, the worst that can happen is > that needreload is switched to true for a process that has been marked > with a PID of 0 because of a WAL sender restart (slot taken again and > initialized). B
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Tue, Jun 13, 2017 at 11:16:54AM +0900, Michael Paquier wrote: > On Thu, Jun 8, 2017 at 1:15 PM, Michael Paquier > wrote: > > 3) In walreceiver.c's pg_stat_get_wal_receiver's: > > - Launch pg_stat_get_wal_receiver and take a checkpoint on it. > > - Pass the lookups of pid and ready_to_display. > > - Make the WAL receiver stop. > > - The view reports inconsistent data. If the WAL receiver data was > > just initialized, the caller would get back PID values or similar 0. > > Particularly a WAL receiver marked with !ready_to_display could show > > data to the caller. That's not cool. > > This can actually leak password information to any user who is a > member of the group DEFAULT_ROLE_READ_ALL_STATS, though the window to > put hands on this password information is very small, it can be > possible if the WAL receiver gets down and restarted for a reason or > another during a maintenance window for example: > 1) The user queries pg_stat_wal_receiver, for example take a > breakpoint just after the check on walrcv->ready_to_display. > 2) Restart the primary once, forcing the standby to spawn a new WAL receiver. > 3) Breakpoint on the WAL receiver process, with something like that to help: > --- a/src/backend/replication/walreceiver.c > +++ b/src/backend/replication/walreceiver.c > @@ -243,6 +243,7 @@ WalReceiverMain(void) > > /* Fetch information required to start streaming */ > walrcv->ready_to_display = false; > + pg_usleep(1000L); /* 10s */ > strlcpy(conninfo, (char *) walrcv->conninfo, MAXCONNINFO); > strlcpy(slotname, (char *) walrcv->slotname, NAMEDATALEN); > startpoint = walrcv->receiveStart; > 4) continue the session querying pg_stat_wal_receiver, at this stage > it has information with the WAL receiver data set as > !ready_to_display. If the connection string includes a password, this > becomes visible as well. > > If queried at high frequency, pg_stat_wal_receiver may show up some > information. Postgres 9.6 includes this leak as well, but there is no real > leak non-superusers will just see NULL fields for this view. In Postgres > 10 though, any member of this group for statistics can see leaking > information. Based on that, this is an open item, so I have added it back > now to the list, moved from the section for older bugs. Formally, the causative commit is the one that removed the superuser() test, namely 25fff40. [Action required within three days. This is a generic notification.] The above-described topic is currently a PostgreSQL 10 open item. Simon, since you committed the patch believed to have created it, you own this open item. If some other commit is more relevant or if this does not belong as a v10 open item, please let us know. Otherwise, please observe the policy on open item ownership[1] and send a status update within three calendar days of this message. Include a date for your subsequent status update. Testers may discover new open items at any time, and I want to plan to get them all fixed well in advance of shipping v10. Consequently, I will appreciate your efforts toward speedy resolution. Thanks. [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Thu, Jun 8, 2017 at 1:15 PM, Michael Paquier wrote: > 3) In walreceiver.c's pg_stat_get_wal_receiver's: > - Launch pg_stat_get_wal_receiver and take a checkpoint on it. > - Pass the lookups of pid and ready_to_display. > - Make the WAL receiver stop. > - The view reports inconsistent data. If the WAL receiver data was > just initialized, the caller would get back PID values or similar 0. > Particularly a WAL receiver marked with !ready_to_display could show > data to the caller. That's not cool. This can actually leak password information to any user who is a member of the group DEFAULT_ROLE_READ_ALL_STATS, though the window to put hands on this password information is very small, it can be possible if the WAL receiver gets down and restarted for a reason or another during a maintenance window for example: 1) The user queries pg_stat_wal_receiver, for example take a breakpoint just after the check on walrcv->ready_to_display. 2) Restart the primary once, forcing the standby to spawn a new WAL receiver. 3) Breakpoint on the WAL receiver process, with something like that to help: --- a/src/backend/replication/walreceiver.c +++ b/src/backend/replication/walreceiver.c @@ -243,6 +243,7 @@ WalReceiverMain(void) /* Fetch information required to start streaming */ walrcv->ready_to_display = false; + pg_usleep(1000L); /* 10s */ strlcpy(conninfo, (char *) walrcv->conninfo, MAXCONNINFO); strlcpy(slotname, (char *) walrcv->slotname, NAMEDATALEN); startpoint = walrcv->receiveStart; 4) continue the session querying pg_stat_wal_receiver, at this stage it has information with the WAL receiver data set as !ready_to_display. If the connection string includes a password, this becomes visible as well. If queried at high frequency, pg_stat_wal_receiver may show up some information. Postgres 9.6 includes this leak as well, but there is no real leak non-superusers will just see NULL fields for this view. In Postgres 10 though, any member of this group for statistics can see leaking information. Based on that, this is an open item, so I have added it back now to the list, moved from the section for older bugs. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
v On Thu, Jun 8, 2017 at 3:31 AM, Robert Haas wrote: > I think if you're going to fix it so that we take spinlocks on > MyWalSnd in a bunch of places that we didn't take them before, it > would make sense to fix all the places where we're accessing those > fields without a spinlock instead of only some of them, unless there > are good reasons why we only need it in some cases and not others, in > which case I think the patch should add comments to each place where > the lock was not taken explaining why it's OK. It didn't take me and > my copy of vi very long to find instances that you didn't change. I take that you are referring to the two lookups in WalSndWaitForWal(), one in exec_replication_command(), one in WalSndLoop() and one in WalSndDone(). First let's remember that all the fields protected by the spinlock are only updated in a WAL sender process, so reading them directly is safe in this context: we know that no other processes will write them. And all those functions are called only in the context of a WAL sender process. So the copies of MyWalSnd that you are referring to here don't need a spinlock. Is there something I am missing? Actually, perhaps it would make sense to add some Assert(am_walsender) in this code? > I also think that should provide some analysis regarding the question > Thomas asked - are these actually bugs, or are they OK for some > reason? We shouldn't just plaster the code with spinlock acquisitions > unless it's really necessary. It seems at least possible that these > changes could cause performance regressions, and it doesn't look like > you've done any testing or provided any analysis indicating whether > that's likely to happen or not. Now taking a step back on the patch, v3 that I sent 3 weeks ago. Additional spinlocks are taken in: 1) SyncRepReleaseWaiters(), which is called in a WAL sender. So indeed no spinlocks are needed here. The last patch is incorrect here. 2) SyncRepGetSyncStandbysPriority() and SyncRepGetSyncStandbysQuorum() as of HEAD via SyncRepGetSyncStandbys(), something that's not good if done lock-less as this gets used for pg_stat_replication for pg_stat_get_wal_senders(). This can return a garbage list of sync standbys to the client as WAL senders update their flush, write and apply positions in parallel to that, and don't care about SyncRepLock as this gets calculated with the standby feedback messages. The problem here is the gap between walsnd->sync_standby_priority and walsnd->flush: -- With a primary, and more or more sync standbys, take a checkpoint in SyncRepGetSyncStandbysPriority() after the first check is passed. -- Shutdown the standby and restart it. -- The standby reconnects, initializes the WAL sender slot with InitWalSenderSlot(). -- Let the process of SyncRepGetSyncStandbysPriority() continue, the sync standby does not show up. That would be really unlikely to happen, but the code is written in such a way that it could happen. One could also say that this gives a closer idea that the standby disconnected but it looks wrong to me to do this value lookup without a required lock. 3) In walreceiver.c's pg_stat_get_wal_receiver's: - Launch pg_stat_get_wal_receiver and take a checkpoint on it. - Pass the lookups of pid and ready_to_display. - Make the WAL receiver stop. - The view reports inconsistent data. If the WAL receiver data was just initialized, the caller would get back PID values or similar 0. Particularly a WAL receiver marked with !ready_to_display could show data to the caller. That's not cool. 4) KeepFileRestoredFromArchive() needs a lock, as this is called from the startup process. That's harmless, the worst that can happen is that needreload is switched to true for a process that has been marked with a PID of 0 because of a WAL sender restart (slot taken again and initialized). But that will just process an extra reload in a worst case. > Basically, I don't think this patch is ready to commit. We have > neither evidence that it is necessary nor evidence that it is > complete. I don't want to be unfair, here, but it seems to me you > ought to do a little more legwork before throwing this over the wall > to the pool of committers. Fair enough. Is the analysis written above more convincing? -- Michael walsnd-pid-races-v4.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On 2017-06-07 20:31, Robert Haas wrote: [...] [ Side note: Erik's report on this thread initially seemed to suggest that we needed this patch to make logical decoding stable. But my impression is that this is belied by subsequent developments on other threads, so my theory is that this patch was never really related to the problem, but rather than by the time Erik got around to testing this patch, other fixes had made the problems relatively rare, and the apparently-improved results with this patch were just chance. If that theory is wrong, it would be good to hear about it. ] Yes, agreed; I was probably mistaken. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Sat, May 20, 2017 at 8:40 AM, Michael Paquier wrote: > On Fri, May 19, 2017 at 3:01 PM, Masahiko Sawada > wrote: >> Also, as Horiguchi-san pointed out earlier, walreceiver seems need the >> similar fix. > > Actually, now that I look at it, ready_to_display should as well be > protected by the lock of the WAL receiver, so it is incorrectly placed > in walreceiver.h. As you are pointing out, pg_stat_get_wal_receiver() > is lazy as well, and that's new in 10, so we have an open item here > for both of them. And I am the author for both things. No issues > spotted in walreceiverfuncs.c after review. > > I am adding an open item so as both issues are fixed in PG10. With the > WAL sender part, I think that this should be a group shot. > > So what do you think about the attached? I think if you're going to fix it so that we take spinlocks on MyWalSnd in a bunch of places that we didn't take them before, it would make sense to fix all the places where we're accessing those fields without a spinlock instead of only some of them, unless there are good reasons why we only need it in some cases and not others, in which case I think the patch should add comments to each place where the lock was not taken explaining why it's OK. It didn't take me and my copy of vi very long to find instances that you didn't change. I also think that should provide some analysis regarding the question Thomas asked - are these actually bugs, or are they OK for some reason? We shouldn't just plaster the code with spinlock acquisitions unless it's really necessary. It seems at least possible that these changes could cause performance regressions, and it doesn't look like you've done any testing or provided any analysis indicating whether that's likely to happen or not. Basically, I don't think this patch is ready to commit. We have neither evidence that it is necessary nor evidence that it is complete. I don't want to be unfair, here, but it seems to me you ought to do a little more legwork before throwing this over the wall to the pool of committers. [ Side note: Erik's report on this thread initially seemed to suggest that we needed this patch to make logical decoding stable. But my impression is that this is belied by subsequent developments on other threads, so my theory is that this patch was never really related to the problem, but rather than by the time Erik got around to testing this patch, other fixes had made the problems relatively rare, and the apparently-improved results with this patch were just chance. If that theory is wrong, it would be good to hear about it. ] -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Fri, Jun 2, 2017 at 2:21 PM, Noah Misch wrote: > On Tue, May 30, 2017 at 11:13:37AM -0700, Michael Paquier wrote: >> On Tue, May 30, 2017 at 11:09 AM, Peter Eisentraut >> wrote: >> > I don't think this is my item. Most of the behavior is old, and >> > pg_stat_get_wal_receiver() is from commit >> > b1a9bad9e744857291c7d5516080527da8219854. >> > >> > I would appreciate if another committer can take the lead on this. >> >> Those things are on Alvaro's plate for the WAL receiver portion, and I >> was the author of those patches. The WAL sender portion is older >> though, but it seems crazy to me to not fix both things at the same >> time per their similarities. > > As a 9.6 commit, b1a9bad cannot be the cause of a v10 open item. If a v10 > commit expanded the consequences of a pre-existing bug, the committer of that > v10 work owns this open item. If the bug's consequences are the same in v9.6 > and v10, this is ineligible to be an open item. Which applies? You are right. Even 1bdae16f is from 9.6. Mea culpa. I have moved that into the section of older bugs. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Sun, May 21, 2017 at 08:19:34AM +0200, Erik Rijkers wrote: > On 2017-05-21 06:37, Erik Rijkers wrote: > >With this patch on current master my logical replication tests > >(pgbench-over-logical-replication) run without errors for the first > >time in many days (even weeks). > > Unfortunately, just now another logical-replication failure occurred. The > same as I have seen all along: This creates a rebuttable presumption of logical replication being the cause of this open item. (I am not stating an opinion on whether this rebuttable presumption is true or is false.) As long as that stands and Alvaro has not explicitly requested ownership of this open item, Peter owns it. On Tue, May 30, 2017 at 11:13:37AM -0700, Michael Paquier wrote: > On Tue, May 30, 2017 at 11:09 AM, Peter Eisentraut > wrote: > > I don't think this is my item. Most of the behavior is old, and > > pg_stat_get_wal_receiver() is from commit > > b1a9bad9e744857291c7d5516080527da8219854. > > > > I would appreciate if another committer can take the lead on this. > > Those things are on Alvaro's plate for the WAL receiver portion, and I > was the author of those patches. The WAL sender portion is older > though, but it seems crazy to me to not fix both things at the same > time per their similarities. As a 9.6 commit, b1a9bad cannot be the cause of a v10 open item. If a v10 commit expanded the consequences of a pre-existing bug, the committer of that v10 work owns this open item. If the bug's consequences are the same in v9.6 and v10, this is ineligible to be an open item. Which applies? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Fri, May 19, 2017 at 2:05 PM, Michael Paquier wrote: > On Thu, May 18, 2017 at 1:43 PM, Thomas Munro > wrote: >> On Thu, May 11, 2017 at 1:48 PM, Michael Paquier >> wrote: >>> I had my eyes on the WAL sender code this morning, and I have noticed >>> that walsender.c is not completely consistent with the PID lookups it >>> does in walsender.c. In two code paths, the PID value is checked >>> without holding the WAL sender spin lock (WalSndRqstFileReload and >>> pg_stat_get_wal_senders), which looks like a very bad idea contrary to >>> what the new WalSndWaitStopping() does and what InitWalSenderSlot() is >>> doing for ages. >> >> There is also code that accesses shared walsender state without >> spinlocks over in syncrep.c. I think that file could use a few words >> of explanation for why it's OK to access pid, state and flush without >> synchronisation. > > Yes, that is read during the quorum and priority sync evaluation. > Except sync_standby_priority, all the other variables should be > protected using the spin lock of the WAL sender. walsender_private.h > is clear regarding that. So the current coding is inconsistent even > there. Attached is an updated patch. I don't claim that that stuff is wrong, just that it would be good to hear an analysis. I think the question is: is there a way for syncrep to hang because of a perfectly timed walsender transition, however vanishingly unlikely? I'm thinking of something like: syncrep skips a walsender slot because it's looking at arbitrarily old 'pid' from before a walsender connected, or gets a torn read of 'flush' that comes out as 0 but was actually non-0. Incidentally, I suspect that a couple of places where 'volatile' is used it's superfluous (accessing things protected by an LWLock that is held). I don't see any of this as 'open item' material, it's interesting to look into but it's preexisting code. As for unlocked reads used for pg_stat_X views, it seems well established that we're OK with that (at least for things that the project has decided can be read atomically, to wit aligned 32 bit values). -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Tue, May 30, 2017 at 11:09 AM, Peter Eisentraut wrote: > I don't think this is my item. Most of the behavior is old, and > pg_stat_get_wal_receiver() is from commit > b1a9bad9e744857291c7d5516080527da8219854. > > I would appreciate if another committer can take the lead on this. Those things are on Alvaro's plate for the WAL receiver portion, and I was the author of those patches. The WAL sender portion is older though, but it seems crazy to me to not fix both things at the same time per their similarities. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On 5/29/17 21:38, Noah Misch wrote: >> Actually, now that I look at it, ready_to_display should as well be >> protected by the lock of the WAL receiver, so it is incorrectly placed >> in walreceiver.h. As you are pointing out, pg_stat_get_wal_receiver() >> is lazy as well, and that's new in 10, so we have an open item here >> for both of them. And I am the author for both things. No issues >> spotted in walreceiverfuncs.c after review. >> >> I am adding an open item so as both issues are fixed in PG10. With the >> WAL sender part, I think that this should be a group shot. >> >> So what do you think about the attached? > > [Action required within three days. This is a generic notification.] > > The above-described topic is currently a PostgreSQL 10 open item. Peter, > since you committed the patch believed to have created it, you own this open > item. I don't think this is my item. Most of the behavior is old, and pg_stat_get_wal_receiver() is from commit b1a9bad9e744857291c7d5516080527da8219854. (The logical replication equivalent of that is pg_stat_get_subscription(), which does appear to use appropriate locking.) I would appreciate if another committer can take the lead on this. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Sat, May 20, 2017 at 09:40:57PM +0900, Michael Paquier wrote: > On Fri, May 19, 2017 at 3:01 PM, Masahiko Sawada > wrote: > > Also, as Horiguchi-san pointed out earlier, walreceiver seems need the > > similar fix. > > Actually, now that I look at it, ready_to_display should as well be > protected by the lock of the WAL receiver, so it is incorrectly placed > in walreceiver.h. As you are pointing out, pg_stat_get_wal_receiver() > is lazy as well, and that's new in 10, so we have an open item here > for both of them. And I am the author for both things. No issues > spotted in walreceiverfuncs.c after review. > > I am adding an open item so as both issues are fixed in PG10. With the > WAL sender part, I think that this should be a group shot. > > So what do you think about the attached? [Action required within three days. This is a generic notification.] The above-described topic is currently a PostgreSQL 10 open item. Peter, since you committed the patch believed to have created it, you own this open item. If some other commit is more relevant or if this does not belong as a v10 open item, please let us know. Otherwise, please observe the policy on open item ownership[1] and send a status update within three calendar days of this message. Include a date for your subsequent status update. Testers may discover new open items at any time, and I want to plan to get them all fixed well in advance of shipping v10. Consequently, I will appreciate your efforts toward speedy resolution. Thanks. [1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On 2017-05-21 06:37, Erik Rijkers wrote: On 2017-05-20 14:40, Michael Paquier wrote: On Fri, May 19, 2017 at 3:01 PM, Masahiko Sawada wrote: Also, as Horiguchi-san pointed out earlier, walreceiver seems need the similar fix. Actually, now that I look at it, ready_to_display should as well be protected by the lock of the WAL receiver, so it is incorrectly placed in walreceiver.h. As you are pointing out, pg_stat_get_wal_receiver() is lazy as well, and that's new in 10, so we have an open item here for both of them. And I am the author for both things. No issues spotted in walreceiverfuncs.c after review. I am adding an open item so as both issues are fixed in PG10. With the WAL sender part, I think that this should be a group shot. So what do you think about the attached? [walsnd-pid-races-v3.patch] With this patch on current master my logical replication tests (pgbench-over-logical-replication) run without errors for the first time in many days (even weeks). Unfortunately, just now another logical-replication failure occurred. The same as I have seen all along: The symptom: after starting logical replication, there are no rows in pg_stat_replication and in the replica-log logical replication complains about max_replication_slots being too low. (from previous experience I know that making max_replication_slots higher does indeed 'help', but only until the next (same) error occurs, with renewed (same) complaint). Also from previous experience of this failed state I know that it can be 'cleaned up' by manually emptying these tables: delete from pg_subscription_rel; delete from pg_subscription; delete from pg_replication_origin; Then it becomes possible to start a new subscription without the above symptoms. I'll do some more testing and hopefully get some information that's less vague... Erik Rijkers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On 2017-05-20 14:40, Michael Paquier wrote: On Fri, May 19, 2017 at 3:01 PM, Masahiko Sawada wrote: Also, as Horiguchi-san pointed out earlier, walreceiver seems need the similar fix. Actually, now that I look at it, ready_to_display should as well be protected by the lock of the WAL receiver, so it is incorrectly placed in walreceiver.h. As you are pointing out, pg_stat_get_wal_receiver() is lazy as well, and that's new in 10, so we have an open item here for both of them. And I am the author for both things. No issues spotted in walreceiverfuncs.c after review. I am adding an open item so as both issues are fixed in PG10. With the WAL sender part, I think that this should be a group shot. So what do you think about the attached? [walsnd-pid-races-v3.patch] With this patch on current master my logical replication tests (pgbench-over-logical-replication) run without errors for the first time in many days (even weeks). I'll do still more and longer tests but I have gathered already a long streak of successful runs since you posted the patch so I am getting convinced this patch is solved the problem that I was experiencing. Pity it didn't make the beta. thanks, Erik Rijkers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Fri, May 19, 2017 at 3:01 PM, Masahiko Sawada wrote: > Also, as Horiguchi-san pointed out earlier, walreceiver seems need the > similar fix. Actually, now that I look at it, ready_to_display should as well be protected by the lock of the WAL receiver, so it is incorrectly placed in walreceiver.h. As you are pointing out, pg_stat_get_wal_receiver() is lazy as well, and that's new in 10, so we have an open item here for both of them. And I am the author for both things. No issues spotted in walreceiverfuncs.c after review. I am adding an open item so as both issues are fixed in PG10. With the WAL sender part, I think that this should be a group shot. So what do you think about the attached? -- Michael diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c index ad213fc454..5ab5e95fa9 100644 --- a/src/backend/replication/syncrep.c +++ b/src/backend/replication/syncrep.c @@ -409,15 +409,23 @@ void SyncRepReleaseWaiters(void) { volatile WalSndCtlData *walsndctl = WalSndCtl; + WalSnd *walsnd = MyWalSnd; XLogRecPtr writePtr; XLogRecPtr flushPtr; XLogRecPtr applyPtr; + XLogRecPtr flush; + WalSndState state; bool got_recptr; bool am_sync; int numwrite = 0; int numflush = 0; int numapply = 0; + SpinLockAcquire(&walsnd->mutex); + flush = walsnd->flush; + state = walsnd->state; + SpinLockRelease(&walsnd->mutex); + /* * If this WALSender is serving a standby that is not on the list of * potential sync standbys then we have nothing to do. If we are still @@ -425,8 +433,8 @@ SyncRepReleaseWaiters(void) * still invalid, then leave quickly also. */ if (MyWalSnd->sync_standby_priority == 0 || - MyWalSnd->state < WALSNDSTATE_STREAMING || - XLogRecPtrIsInvalid(MyWalSnd->flush)) + state < WALSNDSTATE_STREAMING || + XLogRecPtrIsInvalid(flush)) { announce_next_takeover = true; return; @@ -711,14 +719,24 @@ SyncRepGetSyncStandbysQuorum(bool *am_sync) for (i = 0; i < max_wal_senders; i++) { + XLogRecPtr flush; + WalSndState state; + int pid; + walsnd = &WalSndCtl->walsnds[i]; + SpinLockAcquire(&walsnd->mutex); + pid = walsnd->pid; + flush = walsnd->flush; + state = walsnd->state; + SpinLockRelease(&walsnd->mutex); + /* Must be active */ - if (walsnd->pid == 0) + if (pid == 0) continue; /* Must be streaming */ - if (walsnd->state != WALSNDSTATE_STREAMING) + if (state != WALSNDSTATE_STREAMING) continue; /* Must be synchronous */ @@ -726,7 +744,7 @@ SyncRepGetSyncStandbysQuorum(bool *am_sync) continue; /* Must have a valid flush position */ - if (XLogRecPtrIsInvalid(walsnd->flush)) + if (XLogRecPtrIsInvalid(flush)) continue; /* @@ -780,14 +798,24 @@ SyncRepGetSyncStandbysPriority(bool *am_sync) */ for (i = 0; i < max_wal_senders; i++) { + XLogRecPtr flush; + WalSndState state; + int pid; + walsnd = &WalSndCtl->walsnds[i]; + SpinLockAcquire(&walsnd->mutex); + pid = walsnd->pid; + flush = walsnd->flush; + state = walsnd->state; + SpinLockRelease(&walsnd->mutex); + /* Must be active */ - if (walsnd->pid == 0) + if (pid == 0) continue; /* Must be streaming */ - if (walsnd->state != WALSNDSTATE_STREAMING) + if (state != WALSNDSTATE_STREAMING) continue; /* Must be synchronous */ @@ -796,7 +824,7 @@ SyncRepGetSyncStandbysPriority(bool *am_sync) continue; /* Must have a valid flush position */ - if (XLogRecPtrIsInvalid(walsnd->flush)) + if (XLogRecPtrIsInvalid(flush)) continue; /* diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c index 2723612718..25f12e0706 100644 --- a/src/backend/replication/walreceiver.c +++ b/src/backend/replication/walreceiver.c @@ -1392,23 +1392,13 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS) TimestampTz latest_end_time; char *slotname; char *conninfo; - - /* - * No WAL receiver (or not ready yet), just return a tuple with NULL - * values - */ - if (walrcv->pid == 0 || !walrcv->ready_to_display) - PG_RETURN_NULL(); - - /* determine result type */ - if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE) - elog(ERROR, "return type must be a row type"); - - values = palloc0(sizeof(Datum) * tupdesc->natts); - nulls = palloc0(sizeof(bool) * tupdesc->natts); + int pid; + bool ready_to_display; /* Take a lock to ensure value consistency */ SpinLockAcquire(&walrcv->mutex); + pid = walrcv->pid; + ready_to_display = walrcv->ready_to_display; state = walrcv->walRcvState; receive_start_lsn = walrcv->receiveStart; receive_start_tli = walrcv->receiveStartTLI; @@ -1422,8 +1412,22 @@ pg_stat_get_wal_receiver(PG_FUNCTION_ARGS) conninfo = pstrdup(walrcv->conninfo); SpinLockRelease(&walrcv->mutex); + /* + * No WAL receiver (or not ready yet), just return a tuple with NULL + * values + */ + if (pid == 0 || !ready_to_display) + PG_RETURN_NULL(); + + /* determine result type */ + if (get_call_result_type(fcinfo, NU
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Fri, May 19, 2017 at 11:05 AM, Michael Paquier wrote: > On Thu, May 18, 2017 at 1:43 PM, Thomas Munro > wrote: >> On Thu, May 11, 2017 at 1:48 PM, Michael Paquier >> wrote: >>> I had my eyes on the WAL sender code this morning, and I have noticed >>> that walsender.c is not completely consistent with the PID lookups it >>> does in walsender.c. In two code paths, the PID value is checked >>> without holding the WAL sender spin lock (WalSndRqstFileReload and >>> pg_stat_get_wal_senders), which looks like a very bad idea contrary to >>> what the new WalSndWaitStopping() does and what InitWalSenderSlot() is >>> doing for ages. >> >> There is also code that accesses shared walsender state without >> spinlocks over in syncrep.c. I think that file could use a few words >> of explanation for why it's OK to access pid, state and flush without >> synchronisation. > > Yes, that is read during the quorum and priority sync evaluation. > Except sync_standby_priority, all the other variables should be > protected using the spin lock of the WAL sender. walsender_private.h > is clear regarding that. So the current coding is inconsistent even > there. Attached is an updated patch. Also, as Horiguchi-san pointed out earlier, walreceiver seems need the similar fix. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Thu, May 18, 2017 at 1:43 PM, Thomas Munro wrote: > On Thu, May 11, 2017 at 1:48 PM, Michael Paquier > wrote: >> I had my eyes on the WAL sender code this morning, and I have noticed >> that walsender.c is not completely consistent with the PID lookups it >> does in walsender.c. In two code paths, the PID value is checked >> without holding the WAL sender spin lock (WalSndRqstFileReload and >> pg_stat_get_wal_senders), which looks like a very bad idea contrary to >> what the new WalSndWaitStopping() does and what InitWalSenderSlot() is >> doing for ages. > > There is also code that accesses shared walsender state without > spinlocks over in syncrep.c. I think that file could use a few words > of explanation for why it's OK to access pid, state and flush without > synchronisation. Yes, that is read during the quorum and priority sync evaluation. Except sync_standby_priority, all the other variables should be protected using the spin lock of the WAL sender. walsender_private.h is clear regarding that. So the current coding is inconsistent even there. Attached is an updated patch. -- Michael walsnd-pid-races-v2.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Thu, May 11, 2017 at 1:48 PM, Michael Paquier wrote: > I had my eyes on the WAL sender code this morning, and I have noticed > that walsender.c is not completely consistent with the PID lookups it > does in walsender.c. In two code paths, the PID value is checked > without holding the WAL sender spin lock (WalSndRqstFileReload and > pg_stat_get_wal_senders), which looks like a very bad idea contrary to > what the new WalSndWaitStopping() does and what InitWalSenderSlot() is > doing for ages. There is also code that accesses shared walsender state without spinlocks over in syncrep.c. I think that file could use a few words of explanation for why it's OK to access pid, state and flush without synchronisation. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Wed, May 17, 2017 at 12:14 AM, Robert Haas wrote: > Well, it's probably worth changing for consistency, but I'm not sure > that it rises to the level of "a very bad idea". It actually seems > almost entirely harmless. Spuriously setting the needreload flag on a > just-deceased WAL sender will just result in some future WAL sender > doing a bit of unnecessary work, but I don't think it breaks anything > and the probability is vanishingly low. The other change could result > a bogus 0 PID in pg_stat_get_wal_senders output, but I bet you > couldn't reproduce that more than once in a blue moon even with a test > rig designed to provoke it, and if it does happen it isn't really > anything more than a trivial annoyance. Well, the window is very low, so only tests with precisely taken breakpoints would show problems. > So I'm in favor of committing this and maybe even back-patching it, > but I also don't think it's a big deal. Thanks. I would not mind if this is seen as a HEAD-only improvement. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Wed, May 10, 2017 at 9:48 PM, Michael Paquier wrote: > I had my eyes on the WAL sender code this morning, and I have noticed > that walsender.c is not completely consistent with the PID lookups it > does in walsender.c. In two code paths, the PID value is checked > without holding the WAL sender spin lock (WalSndRqstFileReload and > pg_stat_get_wal_senders), which looks like a very bad idea contrary to > what the new WalSndWaitStopping() does and what InitWalSenderSlot() is > doing for ages. Well, it's probably worth changing for consistency, but I'm not sure that it rises to the level of "a very bad idea". It actually seems almost entirely harmless. Spuriously setting the needreload flag on a just-deceased WAL sender will just result in some future WAL sender doing a bit of unnecessary work, but I don't think it breaks anything and the probability is vanishingly low. The other change could result a bogus 0 PID in pg_stat_get_wal_senders output, but I bet you couldn't reproduce that more than once in a blue moon even with a test rig designed to provoke it, and if it does happen it isn't really anything more than a trivial annoyance. So I'm in favor of committing this and maybe even back-patching it, but I also don't think it's a big deal. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
At Fri, 12 May 2017 11:44:19 +0900, Masahiko Sawada wrote in > On Thu, May 11, 2017 at 10:48 AM, Michael Paquier > wrote: > > Hi all, > > > > I had my eyes on the WAL sender code this morning, and I have noticed > > that walsender.c is not completely consistent with the PID lookups it > > does in walsender.c. In two code paths, the PID value is checked > > without holding the WAL sender spin lock (WalSndRqstFileReload and > > pg_stat_get_wal_senders), which looks like a very bad idea contrary to > > what the new WalSndWaitStopping() does and what InitWalSenderSlot() is > > doing for ages. > > > > Any thoughts about the patch attached to make things more consistent? > > It seems to me that having some safeguards would be nice for > > robustness. > > +1. I think this is a sensible change. It intends to avoid exccesive locking during looking up stats values. But we don't have so much vacant WanSnd slots in a reasonable setup. Thus it seems reasonable to read the pid value within the lock section since it adds practically no additional cost. pg_stat_get_wal_receiver seems to need the same amendment since the code is a parallel to that of wal receiver. Or, if we were too sensitive to such locks for nothing, we could use double-checked locking but I don't think we are so here. In short, +1 too and walreceiver needs the same amendment. regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race conditions with WAL sender PID lookups
On Thu, May 11, 2017 at 10:48 AM, Michael Paquier wrote: > Hi all, > > I had my eyes on the WAL sender code this morning, and I have noticed > that walsender.c is not completely consistent with the PID lookups it > does in walsender.c. In two code paths, the PID value is checked > without holding the WAL sender spin lock (WalSndRqstFileReload and > pg_stat_get_wal_senders), which looks like a very bad idea contrary to > what the new WalSndWaitStopping() does and what InitWalSenderSlot() is > doing for ages. > > Any thoughts about the patch attached to make things more consistent? > It seems to me that having some safeguards would be nice for > robustness. +1. I think this is a sensible change. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Race conditions with WAL sender PID lookups
Hi all, I had my eyes on the WAL sender code this morning, and I have noticed that walsender.c is not completely consistent with the PID lookups it does in walsender.c. In two code paths, the PID value is checked without holding the WAL sender spin lock (WalSndRqstFileReload and pg_stat_get_wal_senders), which looks like a very bad idea contrary to what the new WalSndWaitStopping() does and what InitWalSenderSlot() is doing for ages. Any thoughts about the patch attached to make things more consistent? It seems to me that having some safeguards would be nice for robustness. That's an old bug, so I am adding that to the next CF. Thanks, -- Michael walsnd-pid-races.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers