On Fri, Mar 10, 2017 at 1:49 AM, Ivan Kartyshov
<i.kartys...@postgrespro.ru> wrote:
> Here I attached rebased patch waitlsn_10dev_v3 (core feature)
> I will leave the choice of implementation (core/contrib) to the discretion
> of the community.
> Will be glad to hear your suggestion about syntax, code and patch.

Hi Ivan,

Here is some feedback based on a first read-through of the v4 patch.
I haven't tested it yet.

First, I'll restate my view of the syntax-vs-function question:  I
think an fmgr function is the wrong place to do this, because it
doesn't work for our two higher isolation levels, as mentioned
earlier.  Most people probably use READ COMMITTED most of the time,
so the extension version you've proposed is certainly useful for
many people and I like it, but I will vote against inclusion in core
of any feature that doesn't consider all three of our isolation
levels, especially if there is no way to extend support to other
levels later.  I don't want PostgreSQL to keep adding features that
eventually force everyone to use READ COMMITTED because they want to
use feature X, Y or Z.

Maybe someone can think of a clever way for an extension to insert a
wait for a user-supplied LSN *before* acquiring a snapshot so it can
work for the higher levels, or maybe the hooks should go into core
PostgreSQL so that the extension can exist as an external project not
requiring a patched PostgreSQL installation, or maybe this should be
done with new core syntax that extends transaction commands.  Do other
people have views on this?

+ * Portions Copyright (c) 2012-2017, PostgresPro Global Development Group

This should say PostgreSQL.

+    uint i;
+    /*
+     * After update lastReplayedEndRecPtr set Latches in SHMEM array
+     */
+    if (counter_waitlsn % count_waitlsn == 0
+        || TimestampDifferenceExceeds(time_waitlsn, GetCurrentTimestamp(),
+                                      interval_waitlsn))
+    {

Doesn't this mean that if you are waiting for LSN 1234, and the
primary sends that LSN and then doesn't send anything for another
hour, a backend waiting in pg_waitlsn on the standby is quite likely
to miss that update (either because of count_waitlsn or
interval_waitlsn), and then finish up waiting for a very long time?

I'm not sure if this is a good idea, but it's an idea:  You could keep
your update skipping logic, but make sure you always catch the
important case where recovery hits the end of WAL, by invoking your
callback from WaitForWALToBecomeAvailable.  It could have a boolean
parameter that means 'don't skip this one!'.  In other words, it's OK
to skip updates, but not if you know there is no more WAL available to
apply (since you have no idea how long it will be for more to arrive).
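
For example, something like this (a sketch only; the function name
and force flag are invented):

    /* Wake waiters, optionally bypassing the skip heuristics. */
    static void
    WaitLSNWakeup(bool force)
    {
        if (!force && ++counter_waitlsn % count_waitlsn != 0)
            return;             /* skip: more WAL should arrive soon */

        /* ... scan the shmem array and set latches, as you do now ... */
    }

    /* ... and in WaitForWALToBecomeAvailable, once we know that no
     * more WAL is currently available: */
    WaitLSNWakeup(true);        /* don't skip this one! */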

Calling GetCurrentTimestamp() at high frequency (after every WAL
record is replayed) may be a bad idea.  It's a slow system call on
some operating systems.  Can we use an existing timestamp for free,
like recoveryLastXTime?  Remembering that the motivation for using
this feature is to wait for *whole transactions* to be replayed and
become visible, there is no sensible reason to wait for random WAL
positions inside a transaction, so if you used that then you would
effectively skip all non-COMMIT records and also skip some COMMIT
records that are coming down the pipe too fast.
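
For instance, reusing your GUCs (sketch; assumes interval_waitlsn is
in milliseconds, which is what TimestampDifferenceExceeds expects):

    /*
     * Reuse the last-replayed commit timestamp that recovery already
     * tracks (recoveryLastXTime, via GetLatestXTime()) instead of
     * calling GetCurrentTimestamp() for every record.
     */
    if (counter_waitlsn % count_waitlsn == 0 ||
        TimestampDifferenceExceeds(time_waitlsn, GetLatestXTime(),
                                   interval_waitlsn))
    {
        /* wake waiters */
    }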

+static void
+WLOwnLatch(void)
+{
+    SpinLockAcquire(&state->l_arr[MyBackendId].slock);
+    OwnLatch(&state->l_arr[MyBackendId].latch);
+    is_latch_owned = true;
+    if (state->backend_maxid < MyBackendId)
+        state->backend_maxid = MyBackendId;
+    state->l_arr[MyBackendId].pid = MyProcPid;
+    SpinLockRelease(&state->l_arr[MyBackendId].slock);

What is the point of using extra latches for this?  Why not just use
the standard latch?  Syncrep and many other things do that.  I'm not
actually sure if there is ever a reason to create more latches in
regular backends.  SIGUSR1 will be delivered and set the main latch
anyway.

There are cases of specialised latches in the system, like the WAL
receiver latch, and I'm reliably informed that that solves problems
like delivering a wakeup message without having to know which backend
is currently the WAL receiver (a problem we might consider solving
today with a condition variable?).  I don't think anything like that
applies here.
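
With the standard latch the wait loop could look much like syncrep's.
A rough sketch (untested; target_lsn stands for the user-supplied
LSN):

    int rc;

    for (;;)
    {
        /* Has the target LSN been replayed yet? */
        if (GetXLogReplayRecPtr(NULL) >= target_lsn)
            break;

        /* SIGUSR1 sets MyLatch; no extra latch needed. */
        rc = WaitLatch(MyLatch, WL_LATCH_SET | WL_POSTMASTER_DEATH, -1L,
                       PG_WAIT_EXTENSION);
        if (rc & WL_POSTMASTER_DEATH)
            proc_exit(1);
        ResetLatch(MyLatch);
        CHECK_FOR_INTERRUPTS();
    }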

+        for (i = 0; i <= state->backend_maxid; i++)
+        {
+            SpinLockAcquire(&state->l_arr[i].slock);
+            if (state->l_arr[i].pid != 0)
+                SetLatch(&state->l_arr[i].latch);
+            SpinLockRelease(&state->l_arr[i].slock);
+        }

Once we get through the update-skipping logic above, we hit this loop,
which acquires spinlocks for every backend one after another and sets
the latches of every backend, whether or not it is waiting for an LSN
that has already been applied.  Assuming we go with this
scan-the-whole-array approach, why not include the LSN waited for in
the array slots, so that we can avoid disturbing processes that are
waiting for a later LSN?
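
Sketch of what I mean (waited_lsn is an invented field, and cur_lsn
stands for the LSN that has just been applied):

    typedef struct
    {
        slock_t     slock;
        Latch       latch;
        pid_t       pid;
        XLogRecPtr  waited_lsn;     /* InvalidXLogRecPtr = not waiting */
    } WaitSlot;

    ...

    for (i = 0; i <= state->backend_maxid; i++)
    {
        SpinLockAcquire(&state->l_arr[i].slock);
        if (state->l_arr[i].pid != 0 &&
            state->l_arr[i].waited_lsn != InvalidXLogRecPtr &&
            state->l_arr[i].waited_lsn <= cur_lsn)
            SetLatch(&state->l_arr[i].latch);
        SpinLockRelease(&state->l_arr[i].slock);
    }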

Could you talk a bit about the trade-off between this approach and a
queue-based approach?  I would like to understand whether this really
is the best way to do it.

One way to use a queue would be
ConditionVariableBroadcast(&state->lsn_moved), and then waiters would
use a condition variable wait loop.  That would make your code much
simpler (you wouldn't even need your array of spinlocks + latches) but
would still have the problem of processes waking up even though
they're actually waiting for a later LSN.  Another choice would be to
create a custom wait list which actually holds the LSNs waited for in
sorted order, so that we wake up exactly the right processes, cheaply,
or in arbitrary order which makes insertion cheaper but search more

Thomas Munro
