Re: [HACKERS] make async slave to wait for lsn to be replayed
On Mon, Oct 30, 2017 at 7:25 PM, Ivan Kartyshovwrote: > It sounds reasonable. I can offer the following version. > > WAIT LSN lsn_number; > WAIT LSN lsn_number TIMEOUT delay; > WAIT LSN lsn_number INFINITE; > WAIT LSN lsn_number NOWAIT; > > > WAIT [token] wait_value [option]; > > token - [LSN, TIME | TIMESTAMP | CSN | XID] > option - [TIMEOUT | INFINITE | NOWAIT] > > Ready to listen to your suggestions. Making the interface more specific about the mechanism is not what I had in mind, quite the opposite. I would like to see the interface describe the desired effect of the wait. Stepping back for a while, from what I understand the reason we want to waiting is to prevent observation of database state going backwards. To limit the amount of waiting needed we tell the database what we have observed. For example "I just observed my transaction commit", or "the last time I observed state was X", and then have the database provide us with a snapshot that is causally dependent on those states. This does not give us linearizability, for that we still need at the very least serializable transactions on standby. But it seems to provide a form of sequential consistency, which (if it can be proved to hold) makes reasoning about concurrency a lot nicer. For lack of a better proposal I would like something along the lines of: WAIT FOR state_id[, state_id] [ OPTIONS (..)] And to get the tokens maybe a function pg_last_commit_state(). Optionally, to provide read-to-read causality, pg_snapshot_state() could return for example replay_lsn at the start of the current transaction. This makes sure that things don't just appear and disappear when load balancing across many standby servers. WAIT FOR semantics is to ensure that next snapshot is causally dependent (happens after) each of the specified observed states. The state_id could simply be a LSN, or to allow for future expansion something like 'lsn:/1234'. Example extension would be to allow for waiting on xids. On master that would be just a ShareLock on the transactionid. On standby it would wait for the commit or rollback record for that transaction to be replayed. Robert made a good point that people will still rely on the token being an LSN, but perhaps they will be slightly less angry when we explicitly tell them that this might change in the future. Regards, Ants Aasma [1] https://www.postgresql.org/docs/devel/static/functions-admin.html#functions-snapshot-synchronization -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
Ants Aasma писал 2017-10-26 17:29: On Mon, Oct 23, 2017 at 12:29 PM, Ivan Kartyshovwrote: Ants Aasma писал 2017-09-26 13:00: Exposing this interface as WAITLSN will encode that visibility order matches LSN order. This removes any chance of fixing for example visibility order of async/vs/sync transactions. Just renaming it so the token is an opaque commit visibility token that just happens to be a LSN would still allow for progress in transaction management. For example, making PostgreSQL distributed will likely want timestamp and/or vector clock based visibility rules. I'm sorry I did not understand exactly what you meant. Please explain this in more detail. Currently transactions on the master become visible when xid is removed from proc array. This depends on the order of acquiring ProcArrayLock, which can happen in a different order from inserting the commit record to wal. Whereas on the standby the transactions will become visible in the same order that commit records appear in wal. The difference can be quite large when transactions are using different values for synchronous_commit. Fixing this is not easy, but we may want to do it someday. IIRC CSN threads contained more discussion on this topic. If we do fix it, it seems likely that what we need to wait on is not LSN, but some other kind of value. If the UI were something like "WAITVISIBILITY token", then we can later change the token to something other than LSN. Regards, Ants Aasma It sounds reasonable. I can offer the following version. WAIT LSN lsn_number; WAIT LSN lsn_number TIMEOUT delay; WAIT LSN lsn_number INFINITE; WAIT LSN lsn_number NOWAIT; WAIT [token] wait_value [option]; token - [LSN, TIME | TIMESTAMP | CSN | XID] option - [TIMEOUT | INFINITE | NOWAIT] Ready to listen to your suggestions. -- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Thu, Oct 26, 2017 at 4:29 PM, Ants Aasmawrote: > If the UI > were something like "WAITVISIBILITY token", then we can later change > the token to something other than LSN. That assumes, probably optimistically, that nobody will develop a dependency on it being, precisely, an LSN. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Mon, Oct 23, 2017 at 12:29 PM, Ivan Kartyshovwrote: > Ants Aasma писал 2017-09-26 13:00: >> >> Exposing this interface as WAITLSN will encode that visibility order >> matches LSN order. This removes any chance of fixing for example >> visibility order of async/vs/sync transactions. Just renaming it so >> the token is an opaque commit visibility token that just happens to be >> a LSN would still allow for progress in transaction management. For >> example, making PostgreSQL distributed will likely want timestamp >> and/or vector clock based visibility rules. > > > I'm sorry I did not understand exactly what you meant. > Please explain this in more detail. Currently transactions on the master become visible when xid is removed from proc array. This depends on the order of acquiring ProcArrayLock, which can happen in a different order from inserting the commit record to wal. Whereas on the standby the transactions will become visible in the same order that commit records appear in wal. The difference can be quite large when transactions are using different values for synchronous_commit. Fixing this is not easy, but we may want to do it someday. IIRC CSN threads contained more discussion on this topic. If we do fix it, it seems likely that what we need to wait on is not LSN, but some other kind of value. If the UI were something like "WAITVISIBILITY token", then we can later change the token to something other than LSN. Regards, Ants Aasma -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Tue, Sep 26, 2017 at 12:00 PM, Ants Aasmawrote: > Exposing this interface as WAITLSN will encode that visibility order > matches LSN order. That would be a bad thing to encode because it doesn't. Well... actually on the standby it does, and that's the only thing that matters in this case I guess. But I agree with you that's it's not a wonderful thing to bake into the UI, because we might want to change it some day. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
Alexander Korotkov писал 2017-10-23 13:19: Despite code cleanup, you still have some random empty lines removals in your patch. I reconfigured my IDE to avoid this in the future. -- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Companydiff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 01acc2e..6792eb0 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -181,6 +181,7 @@ Complete list of usable sgml source files in this directory. + diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml new file mode 100644 index 000..6f389ca --- /dev/null +++ b/doc/src/sgml/ref/waitlsn.sgml @@ -0,0 +1,144 @@ + + + + + WAITLSN + + + + WAITLSN + 7 + SQL - Language Statements + + + + WAITLSN + wait for the target LSN to be replayed + + + + +WAITLSN 'LSN' [ INFINITELY ] +WAITLSN 'LSN' TIMEOUT wait_time +WAITLSN 'LSN' NOWAIT + + + + + Description + + + interprocess communication mechanism to wait for the target log sequence + number (LSN) on standby in + databases with master-standby asynchronous replication. When run with the + LSN option, the WAITLSN command + waits for the specified LSN to be replayed. By default, wait + time is unlimited. Waiting can be interrupted using Ctrl+C, or + by shutting down the postgres server. You can also limit the wait + time using the TIMEOUT option, or check the target LSN + status immediately using the NOWAIT option. + + + + + + Parameters + + + +LSN + + + Specify the target log sequence number to wait for. + + + + + INFINITELY + + + Wait until the target LSN is replayed on standby. + This is an optional parameter reinforcing the default behavior. + + + + + + + + TIMEOUT wait_time + + + Limit the time to wait for the LSN to be replayed. + The specified wait_time must be an integer + and is measured in milliseconds. + + + + + + NOWAIT + + + Report whether the target LSN has been replayed already, + without any waiting. + + + + + + + Examples + + + Run WAITLSN from psql, + limiting wait time to 1 milliseconds: + + +WAITLSN '0/3F07A6B1' TIMEOUT 1; +NOTICE: LSN is not reached. Try to increase wait time. +LSN reached +- + f +(1 row) + + + + + Wait until the specified LSN is replayed: + +WAITLSN '0/3F07A611'; +LSN reached +- + t +(1 row) + + + + + Limit LSN wait time to 50 milliseconds, and then cancel the command: + +WAITLSN '0/3F0FF791' TIMEOUT 50; +^CCancel request sent +NOTICE: LSN is not reached. Try to increase wait time. +ERROR: canceling statement due to user request + LSN reached +- + f +(1 row) + + + + + + Compatibility + + + There is no WAITLSN statement in the SQL + standard. + + + diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index 9000b3a..0c5951a 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -209,6 +209,7 @@ + diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index dd028a1..117cc9b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -40,6 +40,7 @@ #include "catalog/pg_control.h" #include "catalog/pg_database.h" #include "commands/tablespace.h" +#include "commands/waitlsn.h" #include "miscadmin.h" #include "pgstat.h" #include "port/atomics.h" @@ -7312,6 +7312,15 @@ StartupXLOG(void) break; } +/* + * After update lastReplayedEndRecPtr set Latches in SHMEM array + */ +if (XLogCtl->lastReplayedEndRecPtr >= GetMinWaitLSN()) +{ + + WaitLSNSetLatch(XLogCtl->lastReplayedEndRecPtr); +} + /* Else, try to fetch the next WAL record */ record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false); } while (record != NULL); diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile index 4a6c99e..0d10117 100644 --- a/src/backend/commands/Makefile +++ b/src/backend/commands/Makefile @@ -20,6 +20,6 @@ OBJS = amcmds.o aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \ policy.o portalcmds.o prepare.o proclang.o publicationcmds.o \ schemacmds.o seclabel.o sequence.o statscmds.o subscriptioncmds.o \ tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o user.o \ - vacuum.o vacuumlazy.o variable.o view.o + vacuum.o vacuumlazy.o variable.o view.o waitlsn.o include $(top_srcdir)/src/backend/common.mk diff --git a/src/backend/commands/waitlsn.c b/src/backend/commands/waitlsn.c new file mode 100644 index 000..db2f549 --- /dev/null +++ b/src/backend/commands/waitlsn.c @@ -0,0 +1,273 @@ +/*- + * + * waitlsn.c + * WaitLSN statment: WAITLSN + * + * Portions Copyright (c) 1996-2017, PostgreSQL Global Development
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Mon, Oct 23, 2017 at 12:42 PM, Ivan Kartyshovwrote: > New little cleanup code changes > Despite code cleanup, you still have some random empty lines removals in your patch. @@ -149,7 +150,6 @@ const struct config_enum_entry sync_method_options[] = { > {NULL, 0, false} > }; > > - > /* > * Although only "on", "off", and "always" are documented, > * we accept all the likely variants of "on" and "off". @@ -141,7 +141,6 @@ > #include "utils/timestamp.h" > #include "utils/tqual.h" > > - > /* > * Maximum size of a NOTIFY payload, including terminating NULL. This > * must be kept small enough so that a notification message fits on one -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] make async slave to wait for lsn to be replayed
New little cleanup code changes -- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Companycommit 217f842726531edb1b0056a5c5727ab01bab7f9b Author: i.kartyshovDate: Mon Oct 23 12:08:59 2017 +0300 Cherry picked and ported 11dev diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 01acc2e..6792eb0 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -181,6 +181,7 @@ Complete list of usable sgml source files in this directory. + diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml new file mode 100644 index 000..6f389ca --- /dev/null +++ b/doc/src/sgml/ref/waitlsn.sgml @@ -0,0 +1,144 @@ + + + + + WAITLSN + + + + WAITLSN + 7 + SQL - Language Statements + + + + WAITLSN + wait for the target LSN to be replayed + + + + +WAITLSN 'LSN' [ INFINITELY ] +WAITLSN 'LSN' TIMEOUT wait_time +WAITLSN 'LSN' NOWAIT + + + + + Description + + + interprocess communication mechanism to wait for the target log sequence + number (LSN) on standby in + databases with master-standby asynchronous replication. When run with the + LSN option, the WAITLSN command + waits for the specified LSN to be replayed. By default, wait + time is unlimited. Waiting can be interrupted using Ctrl+C, or + by shutting down the postgres server. You can also limit the wait + time using the TIMEOUT option, or check the target LSN + status immediately using the NOWAIT option. + + + + + + Parameters + + + +LSN + + + Specify the target log sequence number to wait for. + + + + + INFINITELY + + + Wait until the target LSN is replayed on standby. + This is an optional parameter reinforcing the default behavior. + + + + + + + + TIMEOUT wait_time + + + Limit the time to wait for the LSN to be replayed. + The specified wait_time must be an integer + and is measured in milliseconds. + + + + + + NOWAIT + + + Report whether the target LSN has been replayed already, + without any waiting. + + + + + + + Examples + + + Run WAITLSN from psql, + limiting wait time to 1 milliseconds: + + +WAITLSN '0/3F07A6B1' TIMEOUT 1; +NOTICE: LSN is not reached. Try to increase wait time. +LSN reached +- + f +(1 row) + + + + + Wait until the specified LSN is replayed: + +WAITLSN '0/3F07A611'; +LSN reached +- + t +(1 row) + + + + + Limit LSN wait time to 50 milliseconds, and then cancel the command: + +WAITLSN '0/3F0FF791' TIMEOUT 50; +^CCancel request sent +NOTICE: LSN is not reached. Try to increase wait time. +ERROR: canceling statement due to user request + LSN reached +- + f +(1 row) + + + + + + Compatibility + + + There is no WAITLSN statement in the SQL + standard. + + + diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index 9000b3a..0c5951a 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -209,6 +209,7 @@ + diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index dd028a1..117cc9b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -40,6 +40,7 @@ #include "catalog/pg_control.h" #include "catalog/pg_database.h" #include "commands/tablespace.h" +#include "commands/waitlsn.h" #include "miscadmin.h" #include "pgstat.h" #include "port/atomics.h" @@ -149,7 +150,6 @@ const struct config_enum_entry sync_method_options[] = { {NULL, 0, false} }; - /* * Although only "on", "off", and "always" are documented, * we accept all the likely variants of "on" and "off". @@ -7312,6 +7312,15 @@ StartupXLOG(void) break; } +/* + * After update lastReplayedEndRecPtr set Latches in SHMEM array + */ +if (XLogCtl->lastReplayedEndRecPtr >= GetMinWaitLSN()) +{ + + WaitLSNSetLatch(XLogCtl->lastReplayedEndRecPtr); +} + /* Else, try to fetch the next WAL record */ record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false); } while (record != NULL); diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile index 4a6c99e..0d10117 100644 --- a/src/backend/commands/Makefile +++ b/src/backend/commands/Makefile @@ -20,6 +20,6 @@ OBJS = amcmds.o aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \ policy.o portalcmds.o prepare.o proclang.o publicationcmds.o \ schemacmds.o seclabel.o sequence.o statscmds.o subscriptioncmds.o \ tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o user.o \ - vacuum.o vacuumlazy.o variable.o view.o + vacuum.o vacuumlazy.o variable.o view.o waitlsn.o include $(top_srcdir)/src/backend/common.mk diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c index f7de742..cdeddfc 100644 --- a/src/backend/commands/async.c +++
Re: [HACKERS] make async slave to wait for lsn to be replayed
Alexander Korotkov писал 2017-09-26 12:07: I propose following syntax options. WAITLSN lsn; WAITLSN lsn TIMEOUT delay; WAITLSN lsn INFINITE; WAITLSN lsn NOWAIT; For me that looks rather better. What do you think? I agree with you, now syntax looks better. New patch attached to tha mail. Ants Aasma писал 2017-09-26 13:00: Exposing this interface as WAITLSN will encode that visibility order matches LSN order. This removes any chance of fixing for example visibility order of async/vs/sync transactions. Just renaming it so the token is an opaque commit visibility token that just happens to be a LSN would still allow for progress in transaction management. For example, making PostgreSQL distributed will likely want timestamp and/or vector clock based visibility rules. I'm sorry I did not understand exactly what you meant. Please explain this in more detail. -- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Companycommit 217f842726531edb1b0056a5c5727ab01bab7f9b Author: i.kartyshovDate: Mon Oct 23 12:08:59 2017 +0300 Cherry picked and ported 11dev diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 01acc2e..6792eb0 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -181,6 +181,7 @@ Complete list of usable sgml source files in this directory. + diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml new file mode 100644 index 000..6f389ca --- /dev/null +++ b/doc/src/sgml/ref/waitlsn.sgml @@ -0,0 +1,144 @@ + + + + + WAITLSN + + + + WAITLSN + 7 + SQL - Language Statements + + + + WAITLSN + wait for the target LSN to be replayed + + + + +WAITLSN 'LSN' [ INFINITELY ] +WAITLSN 'LSN' TIMEOUT wait_time +WAITLSN 'LSN' NOWAIT + + + + + Description + + + interprocess communication mechanism to wait for the target log sequence + number (LSN) on standby in + databases with master-standby asynchronous replication. When run with the + LSN option, the WAITLSN command + waits for the specified LSN to be replayed. By default, wait + time is unlimited. Waiting can be interrupted using Ctrl+C, or + by shutting down the postgres server. You can also limit the wait + time using the TIMEOUT option, or check the target LSN + status immediately using the NOWAIT option. + + + + + + Parameters + + + +LSN + + + Specify the target log sequence number to wait for. + + + + + INFINITELY + + + Wait until the target LSN is replayed on standby. + This is an optional parameter reinforcing the default behavior. + + + + + + + + TIMEOUT wait_time + + + Limit the time to wait for the LSN to be replayed. + The specified wait_time must be an integer + and is measured in milliseconds. + + + + + + NOWAIT + + + Report whether the target LSN has been replayed already, + without any waiting. + + + + + + + Examples + + + Run WAITLSN from psql, + limiting wait time to 1 milliseconds: + + +WAITLSN '0/3F07A6B1' TIMEOUT 1; +NOTICE: LSN is not reached. Try to increase wait time. +LSN reached +- + f +(1 row) + + + + + Wait until the specified LSN is replayed: + +WAITLSN '0/3F07A611'; +LSN reached +- + t +(1 row) + + + + + Limit LSN wait time to 50 milliseconds, and then cancel the command: + +WAITLSN '0/3F0FF791' TIMEOUT 50; +^CCancel request sent +NOTICE: LSN is not reached. Try to increase wait time. +ERROR: canceling statement due to user request + LSN reached +- + f +(1 row) + + + + + + Compatibility + + + There is no WAITLSN statement in the SQL + standard. + + + diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index 9000b3a..0c5951a 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -209,6 +209,7 @@ + diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index dd028a1..117cc9b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -40,6 +40,7 @@ #include "catalog/pg_control.h" #include "catalog/pg_database.h" #include "commands/tablespace.h" +#include "commands/waitlsn.h" #include "miscadmin.h" #include "pgstat.h" #include "port/atomics.h" @@ -149,7 +150,6 @@ const struct config_enum_entry sync_method_options[] = { {NULL, 0, false} }; - /* * Although only "on", "off", and "always" are documented, * we accept all the likely variants of "on" and "off". @@ -7312,6 +7312,15 @@ StartupXLOG(void) break; } +/* + * After update lastReplayedEndRecPtr set Latches in SHMEM array + */ +if (XLogCtl->lastReplayedEndRecPtr >= GetMinWaitLSN()) +{ + + WaitLSNSetLatch(XLogCtl->lastReplayedEndRecPtr); +} + /* Else, try to fetch the next WAL record */ record =
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Tue, Aug 15, 2017 at 5:00 AM, Craig Ringerwrote: > On 22 March 2017 at 01:17, Robert Haas wrote: >> >> On Sun, Mar 12, 2017 at 10:20 PM, Thomas Munro >> wrote: >> > Maybe someone can think of a clever way for an extension to insert a >> > wait for a user-supplied LSN *before* acquiring a snapshot so it can >> > work for the higher levels, or maybe the hooks should go into core >> > PostgreSQL so that the extension can exist as an external project not >> > requiring a patched PostgreSQL installation, or maybe this should be >> > done with new core syntax that extends transaction commands. Do other >> > people have views on this? >> >> IMHO, trying to do this using a function-based interface is a really >> bad idea for exactly the reasons you mention. I don't see why we'd >> resist the idea of core syntax here; transactions are a core part of >> PostgreSQL. >> >> There is, of course, the question of whether making LSNs such a >> user-visible thing is a good idea in the first place, but that's a >> separate question from issue of what syntax for such a thing is best. > > > (I know this is old, but): > > That ship sailed a long time ago unfortunately, they're all over > pg_stat_replication and pg_replication_slots and so on. They're already > routinely used for monitoring replication lag in bytes, waiting for a peer > to catch up, etc. (continuing the trend of resurrecting old topics) Exposing this interface as WAITLSN will encode that visibility order matches LSN order. This removes any chance of fixing for example visibility order of async/vs/sync transactions. Just renaming it so the token is an opaque commit visibility token that just happens to be a LSN would still allow for progress in transaction management. For example, making PostgreSQL distributed will likely want timestamp and/or vector clock based visibility rules. Regards, Ants Aasma -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Mon, Jan 23, 2017 at 2:56 PM, Ivan Kartyshovwrote: > How to use it > == > WAITLSN ‘LSN’ [, timeout in ms]; > WAITLSN_INFINITE ‘LSN’; > WAITLSN_NO_WAIT ‘LSN’; Adding suffix to the command name looks weird. We don't do so for any other command. I propose following syntax options. WAITLSN lsn; WAITLSN lsn TIMEOUT delay; WAITLSN lsn INFINITE; WAITLSN lsn NOWAIT; For me that looks rather better. What do you think? -- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] make async slave to wait for lsn to be replayed
I forget to include patch in last letter. Craig Ringer wrote 2017-08-15 05:00: That ship sailed a long time ago unfortunately, they're all over pg_stat_replication and pg_replication_slots and so on. They're already routinely used for monitoring replication lag in bytes, waiting for a peer to catch up, etc. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services Function pg_replication_slots is only master, and waitlsn is async hot standby replication function. It allows us to wait untill insert made on master is be replayed on replica. -- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com Russian Postgres Companydiff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 01acc2e..6792eb0 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -181,6 +181,7 @@ Complete list of usable sgml source files in this directory. + diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml new file mode 100644 index 000..077e869 --- /dev/null +++ b/doc/src/sgml/ref/waitlsn.sgml @@ -0,0 +1,119 @@ + + + + + WAITLSN + + + + WAITLSN + 7 + SQL - Language Statements + + + + WAITLSN + wait until target LSN has been replayed + + + + +WAITLSN 'LSN' [ , delay ] +WAITLSN_INFINITE 'LSN' +WAITLSN_NO_WAIT 'LSN' + + + + + Description + + + The WAITLSN wait till target LSN will + be replayed with an optional delay (milliseconds by default + infinity) to be wait for LSN to replayed. + + + + WAITLSN provides a simple + interprocess LSN wait mechanism for a backends on slave + in master-slave replication scheme on PostgreSQL database. + + + + The WAITLSN_INFINITE wait till target LSN will + be replayed on slave . + + + + The WAITLSN_NO_WAIT will tell if target LSN was replayed without any wait. + + + + + Parameters + + + +LSN + + + Target log sequence number to be wait for. + + + + +delay + + + Time in miliseconds to waiting for LSN to be replayed. + + + + + + + + Notes + + + Delay time for waiting till LSN to be replayed must be integer. For + default it is infinity. Waiting can be interupped using Ctl+C, or + by Postmaster death. + + + + + Examples + + + Configure and execute a waitlsn from + psql: + + +WAITLSN '0/3F07A6B1', 1; +NOTICE: LSN is not reached. Try to make bigger delay. +WAITLSN + +WAITLSN '0/3F07A611'; +WAITLSN + +WAITLSN '0/3F0FF791', 50; +^CCancel request sent +NOTICE: LSN is not reached. Try to make bigger delay. +ERROR: canceling statement due to user request + + + + + + Compatibility + + + There is no WAITLSN statement in the SQL + standard. + + + diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index 9000b3a..0c5951a 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -209,6 +209,7 @@ + diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index df4843f..9dfb59d 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -40,6 +40,7 @@ #include "catalog/pg_control.h" #include "catalog/pg_database.h" #include "commands/tablespace.h" +#include "commands/waitlsn.h" #include "miscadmin.h" #include "pgstat.h" #include "port/atomics.h" @@ -147,7 +148,6 @@ const struct config_enum_entry sync_method_options[] = { {NULL, 0, false} }; - /* * Although only "on", "off", and "always" are documented, * we accept all the likely variants of "on" and "off". @@ -7237,6 +7237,15 @@ StartupXLOG(void) break; } +/* + * After update lastReplayedEndRecPtr set Latches in SHMEM array + */ +if (XLogCtl->lastReplayedEndRecPtr >= GetMinWaitLSN()) +{ + + WaitLSNSetLatch(XLogCtl->lastReplayedEndRecPtr); +} + /* Else, try to fetch the next WAL record */ record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false); } while (record != NULL); diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile index 4a6c99e..0d10117 100644 --- a/src/backend/commands/Makefile +++ b/src/backend/commands/Makefile @@ -20,6 +20,6 @@ OBJS = amcmds.o aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \ policy.o portalcmds.o prepare.o proclang.o publicationcmds.o \ schemacmds.o seclabel.o sequence.o statscmds.o subscriptioncmds.o \ tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o user.o \ - vacuum.o vacuumlazy.o variable.o view.o + vacuum.o vacuumlazy.o variable.o view.o waitlsn.o include $(top_srcdir)/src/backend/common.mk diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c index bacc08e..7d6011f 100644 --- a/src/backend/commands/async.c +++ b/src/backend/commands/async.c @@ -139,7 +139,6 @@ #include "utils/ps_status.h" #include "utils/timestamp.h" - /*
Re: [HACKERS] make async slave to wait for lsn to be replayed
Hello, thank you for your comments over main idea and code. On 13.03.2017 05:20, Thomas Munro wrote: 1) First, I'll restate my view of the syntax-vs-function question: I think an fmgr function is the wrong place to do this, because it doesn't work for our 2 higher isolation levels as mentioned. Most people probably use READ COMMITTED most of the time so the extension version you've proposed is certainly useful for many people and I like it, but I will vote against inclusion in core of any feature that doesn't consider all three of our isolation levels, especially if there is no way to extend support to other levels later. I don't want PostgreSQL to keep adding features that eventually force everyone to use READ COMMITTED because they want to use feature X, Y or Z. Waiting for LSN is expected to be used on hot standby READ ONLY replication. Only READ COMMITTED and REPEATABLE READ, are allowed on hot standby. Maybe someone can think of a clever way for an extension to insert a wait for a user-supplied LSN *before* acquiring a snapshot so it can work for the higher levels, or maybe the hooks should go into core PostgreSQL so that the extension can exist as an external project not requiring a patched PostgreSQL installation, or maybe this should be done with new core syntax that extends transaction commands. Do other people have views on this? I think it is a good idea, but it is not clear for me, how to do it properly. 2) +wl_lsn_updated_hook(void) +{ +uint i; +/* + * After update lastReplayedEndRecPtr set Latches in SHMEM array + */ +if (counter_waitlsn % count_waitlsn == 0 +|| TimestampDifferenceExceeds(time_waitlsn,GetCurrentTimestamp(),interval_waitlsn)) +{ Doesn't this mean that if you are waiting for LSN 1234, and the primary sends that LSN and then doesn't send anything for another hour, a standby waiting in pg_waitlsn is quite likely to skip that update (either because of count_waitlsn or interval_waitlsn), and then finish up waiting for a very long time? I'm not sure if this is a good idea, but it's an idea: You could keep your update skipping logic, but make sure you always catch the important case where recovery hits the end of WAL, by invoking your callback from WaitForWALToBecomeAvailable. It could have a boolean parameter that means 'don't skip this one!'. In other words, it's OK to skip updates, but not if you know there is no more WAL available to apply (since you have no idea how long it will be for more to arrive). Calling GetCurrentTimestamp() at high frequency (after every WAL record is replayed) may be a bad idea. It's a slow system call on some operating systems. Can we use an existing timestamp for free, like recoveryLastXTime? Remembering that the motivation for using this feature is to wait for *whole transactions* to be replayed and become visible, there is no sensible reason to wait for random WAL positions inside a transaction, so if you used that then you would effectively skip all non-COMMIT records and also skip some COMMIT records that are coming down the pipe too fast. Yes, I applied this idea and prepared new patch. 3) +static void +wl_own_latch(void) +{ +SpinLockAcquire(>l_arr[MyBackendId].slock); +OwnLatch(>l_arr[MyBackendId].latch); +is_latch_owned = true; + +if (state->backend_maxid < MyBackendId) +state->backend_maxid = MyBackendId; + +state->l_arr[MyBackendId].pid = MyProcPid; +SpinLockRelease(>l_arr[MyBackendId].slock); +} What is the point of using extra latches for this? Why not just use the standard latch? Syncrep and many other things do that. I'm not actually sure if there is ever a reason to create more latches in regular backends. SIGUSR1 will be delivered and set the main latch anyway. There are cases of specialised latches in the system, like the wal receiver latch, and I'm reliably informed that that solves problems like delivering a wakeup message without having to know which backend is currently the wal receiver (a problem we might consider solving today with a condition variable?) I don't think anything like that applies here. In my case I create a bunch of shared latches for each backend, I`ll think how to use standard latches in an efficient way. And about specialized latches on standby they are already in busy with wal replaying functions. 4) +for (i = 0; i <= state->backend_maxid; i++) +{ +SpinLockAcquire(>l_arr[i].slock); +if (state->l_arr[i].pid != 0) +SetLatch(>l_arr[i].latch); +SpinLockRelease(>l_arr[i].slock); +} Once we get through the update-skipping logic above, we hit this loop which acquires spinlocks for every backend one after another and sets the latches of every backend, no matter whether they are waiting for the LSN that has been applied. Assuming we go with this scan-the-whole-array approach, why not include the LSN waited for in the array slots,
Re: [HACKERS] make async slave to wait for lsn to be replayed
On 22 March 2017 at 01:17, Robert Haaswrote: > On Sun, Mar 12, 2017 at 10:20 PM, Thomas Munro > wrote: > > Maybe someone can think of a clever way for an extension to insert a > > wait for a user-supplied LSN *before* acquiring a snapshot so it can > > work for the higher levels, or maybe the hooks should go into core > > PostgreSQL so that the extension can exist as an external project not > > requiring a patched PostgreSQL installation, or maybe this should be > > done with new core syntax that extends transaction commands. Do other > > people have views on this? > > IMHO, trying to do this using a function-based interface is a really > bad idea for exactly the reasons you mention. I don't see why we'd > resist the idea of core syntax here; transactions are a core part of > PostgreSQL. > > There is, of course, the question of whether making LSNs such a > user-visible thing is a good idea in the first place, but that's a > separate question from issue of what syntax for such a thing is best. (I know this is old, but): That ship sailed a long time ago unfortunately, they're all over pg_stat_replication and pg_replication_slots and so on. They're already routinely used for monitoring replication lag in bytes, waiting for a peer to catch up, etc. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] make async slave to wait for lsn to be replayed
On 3/9/17 07:49, Ivan Kartyshov wrote: > Here I attached rebased patch waitlsn_10dev_v3 (core feature) > I will leave the choice of implementation (core/contrib) to the > discretion of the community. This patch is registered in the upcoming commit fest, but it needs to be rebased. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
Hi Ivan, On 3/21/17 1:06 PM, David Steele wrote: Hi Ivan, On 3/12/17 10:20 PM, Thomas Munro wrote: On Fri, Mar 10, 2017 at 1:49 AM, Ivan Kartyshovwrote: Here I attached rebased patch waitlsn_10dev_v3 (core feature) I will leave the choice of implementation (core/contrib) to the discretion of the community. Will be glad to hear your suggestion about syntax, code and patch. Hi Ivan, Here is some feedback based on a first read-through of the v4 patch. I haven't tested it yet. This thread has been idle for over a week. Please respond and/or post a new patch by 2017-03-24 00:00 AoE (UTC-12) or this submission will be marked "Returned with Feedback". This submission has been marked "Returned with Feedback". Please feel free to resubmit to a future commitfest. Regards, -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Sun, Mar 12, 2017 at 10:20 PM, Thomas Munrowrote: > Maybe someone can think of a clever way for an extension to insert a > wait for a user-supplied LSN *before* acquiring a snapshot so it can > work for the higher levels, or maybe the hooks should go into core > PostgreSQL so that the extension can exist as an external project not > requiring a patched PostgreSQL installation, or maybe this should be > done with new core syntax that extends transaction commands. Do other > people have views on this? IMHO, trying to do this using a function-based interface is a really bad idea for exactly the reasons you mention. I don't see why we'd resist the idea of core syntax here; transactions are a core part of PostgreSQL. There is, of course, the question of whether making LSNs such a user-visible thing is a good idea in the first place, but that's a separate question from issue of what syntax for such a thing is best. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
Hi Ivan, On 3/12/17 10:20 PM, Thomas Munro wrote: On Fri, Mar 10, 2017 at 1:49 AM, Ivan Kartyshovwrote: Here I attached rebased patch waitlsn_10dev_v3 (core feature) I will leave the choice of implementation (core/contrib) to the discretion of the community. Will be glad to hear your suggestion about syntax, code and patch. Hi Ivan, Here is some feedback based on a first read-through of the v4 patch. I haven't tested it yet. This thread has been idle for over a week. Please respond and/or post a new patch by 2017-03-24 00:00 AoE (UTC-12) or this submission will be marked "Returned with Feedback". Thanks, -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Fri, Mar 10, 2017 at 1:49 AM, Ivan Kartyshovwrote: > Here I attached rebased patch waitlsn_10dev_v3 (core feature) > I will leave the choice of implementation (core/contrib) to the discretion > of the community. > > Will be glad to hear your suggestion about syntax, code and patch. Hi Ivan, Here is some feedback based on a first read-through of the v4 patch. I haven't tested it yet. First, I'll restate my view of the syntax-vs-function question: I think an fmgr function is the wrong place to do this, because it doesn't work for our 2 higher isolation levels as mentioned. Most people probably use READ COMMITTED most of the time so the extension version you've proposed is certainly useful for many people and I like it, but I will vote against inclusion in core of any feature that doesn't consider all three of our isolation levels, especially if there is no way to extend support to other levels later. I don't want PostgreSQL to keep adding features that eventually force everyone to use READ COMMITTED because they want to use feature X, Y or Z. Maybe someone can think of a clever way for an extension to insert a wait for a user-supplied LSN *before* acquiring a snapshot so it can work for the higher levels, or maybe the hooks should go into core PostgreSQL so that the extension can exist as an external project not requiring a patched PostgreSQL installation, or maybe this should be done with new core syntax that extends transaction commands. Do other people have views on this? + * Portions Copyright (c) 2012-2017, PostgresPro Global Development Group This should say PostgreSQL. +wl_lsn_updated_hook(void) +{ +uint i; +/* + * After update lastReplayedEndRecPtr set Latches in SHMEM array + */ +if (counter_waitlsn % count_waitlsn == 0 +|| TimestampDifferenceExceeds(time_waitlsn,GetCurrentTimestamp(),interval_waitlsn)) +{ Doesn't this mean that if you are waiting for LSN 1234, and the primary sends that LSN and then doesn't send anything for another hour, a standby waiting in pg_waitlsn is quite likely to skip that update (either because of count_waitlsn or interval_waitlsn), and then finish up waiting for a very long time? I'm not sure if this is a good idea, but it's an idea: You could keep your update skipping logic, but make sure you always catch the important case where recovery hits the end of WAL, by invoking your callback from WaitForWALToBecomeAvailable. It could have a boolean parameter that means 'don't skip this one!'. In other words, it's OK to skip updates, but not if you know there is no more WAL available to apply (since you have no idea how long it will be for more to arrive). Calling GetCurrentTimestamp() at high frequency (after every WAL record is replayed) may be a bad idea. It's a slow system call on some operating systems. Can we use an existing timestamp for free, like recoveryLastXTime? Remembering that the motivation for using this feature is to wait for *whole transactions* to be replayed and become visible, there is no sensible reason to wait for random WAL positions inside a transaction, so if you used that then you would effectively skip all non-COMMIT records and also skip some COMMIT records that are coming down the pipe too fast. +static void +wl_own_latch(void) +{ +SpinLockAcquire(>l_arr[MyBackendId].slock); +OwnLatch(>l_arr[MyBackendId].latch); +is_latch_owned = true; + +if (state->backend_maxid < MyBackendId) +state->backend_maxid = MyBackendId; + +state->l_arr[MyBackendId].pid = MyProcPid; +SpinLockRelease(>l_arr[MyBackendId].slock); +} What is the point of using extra latches for this? Why not just use the standard latch? Syncrep and many other things do that. I'm not actually sure if there is ever a reason to create more latches in regular backends. SIGUSR1 will be delivered and set the main latch anyway. There are cases of specialised latches in the system, like the wal receiver latch, and I'm reliably informed that that solves problems like delivering a wakeup message without having to know which backend is currently the wal receiver (a problem we might consider solving today with a condition variable?) I don't think anything like that applies here. +for (i = 0; i <= state->backend_maxid; i++) +{ +SpinLockAcquire(>l_arr[i].slock); +if (state->l_arr[i].pid != 0) +SetLatch(>l_arr[i].latch); +SpinLockRelease(>l_arr[i].slock); +} Once we get through the update-skipping logic above, we hit this loop which acquires spinlocks for every backend one after another and sets the latches of every backend, no matter whether they are waiting for the LSN that has been applied. Assuming we go with this scan-the-whole-array approach, why not include the LSN waited for in the array slots, so that we can avoid disturbing processes that are waiting for a later LSN? Could you talk a bit about
Re: [HACKERS] make async slave to wait for lsn to be replayed
Hello On 07.03.2017 15:58, Masahiko Sawada wrote: I've read the discussion so far but I didn't see the reason why you've changed it to as a contrib module. Could you tell me about that? I did it on the initiative of our customer, who preferred the implementation in the form of contrib. Contrib realization of WAITLSN adds to core the only hook. On 07.03.2017 15:58, Masahiko Sawada wrote: > I guess this feature would be more useful if provided as a core > feature and we need to discuss about syntax as Thomas mentioned. Here I attached rebased patch waitlsn_10dev_v3 (core feature) I will leave the choice of implementation (core/contrib) to the discretion of the community. Will be glad to hear your suggestion about syntax, code and patch. -- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 2bc4d9f..6d5a81e 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -178,6 +178,7 @@ Complete list of usable sgml source files in this directory. + diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml new file mode 100644 index 000..338187b --- /dev/null +++ b/doc/src/sgml/ref/waitlsn.sgml @@ -0,0 +1,134 @@ + + + + + WAITLSN + + + + WAITLSN + 7 + SQL - Language Statements + + + + WAITLSN + wait until target LSN has been replayed + + + + +WAITLSN 'LSN' [ , delay ] +WAITLSN_INFINITE 'LSN' +WAITLSN_NO_WAIT 'LSN' + + + + + Description + + + The WAITLSN wait till target LSN will + be replayed with an optional delay (milliseconds by default + infinity) to be wait for LSN to replayed. + + + + WAITLSN provides a simple + interprocess LSN wait mechanism for a backends on slave + in master-slave replication scheme on PostgreSQL database. + + + + The WAITLSN_INFINITE wait till target LSN will + be replayed on slave . + + + + The WAITLSN_NO_WAIT will tell if target LSN was replayed without any wait. + + + + + Parameters + + + +LSN + + + Target log sequence number to be wait for. + + + + +delay + + + Time in miliseconds to waiting for LSN to be replayed. + + + + + + + + Notes + + + Delay time for waiting till LSN to be replayed must be integer. For + default it is infinity. Waiting can be interupped using Ctl+C, or + by Postmaster death. + + + + + GUCs + + + Add two GUCs which help tuning influence on StartupXLOG: + count_waitlsn (denominator to check not each LSN) + int count_waitlsn= 10; + + + + interval_waitlsn (Interval in milliseconds to additional LSN check) + int interval_waitlsn = 100; + + + + + Examples + + + Configure and execute a waitlsn from + psql: + + +WAITLSN '0/3F07A6B1', 1; +NOTICE: LSN is not reached. Try to make bigger delay. +WAITLSN + +WAITLSN '0/3F07A611'; +WAITLSN + +WAITLSN '0/3F0FF791', 50; +^CCancel request sent +NOTICE: LSN is not reached. Try to make bigger delay. +ERROR: canceling statement due to user request + + + + + + Compatibility + + + There is no WAITLSN statement in the SQL + standard. + + + diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index c8191de..a5100a2 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -206,6 +206,7 @@ + diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 2dcff7f..9c2def1 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -39,6 +39,7 @@ #include "catalog/pg_control.h" #include "catalog/pg_database.h" #include "commands/tablespace.h" +#include "commands/waitlsn.h" #include "miscadmin.h" #include "pgstat.h" #include "port/atomics.h" @@ -145,6 +146,9 @@ const struct config_enum_entry sync_method_options[] = { {NULL, 0, false} }; +/* GUC variable */ +intcount_waitlsn = 10; +intinterval_waitlsn = 100; /* * Although only "on", "off", and "always" are documented, @@ -6948,6 +6952,8 @@ StartupXLOG(void) { ErrorContextCallback errcallback; TimestampTz xtime; + TimestampTz time_waitlsn = GetCurrentTimestamp(); + int counter_waitlsn = 0; InRedo = true; @@ -7174,6 +7180,17 @@ StartupXLOG(void) break; } +/* + * After update lastReplayedEndRecPtr set Latches in SHMEM array + */ +if (counter_waitlsn % count_waitlsn == 0 + || TimestampDifferenceExceeds(time_waitlsn,GetCurrentTimestamp(),interval_waitlsn)) +{ + WaitLSNSetLatch(); + time_waitlsn = GetCurrentTimestamp(); +} +counter_waitlsn++; + /* Else, try to fetch the next WAL record */ record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false); } while (record != NULL); diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile index e0fab38..8a7e2bd 100644 ---
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Wed, Mar 8, 2017 at 1:58 AM, Masahiko Sawadawrote: > On Tue, Mar 7, 2017 at 8:48 PM, Ivan Kartyshov > wrote: >> Rebase done. > > Thank you for updating the patch. > >> >> Meanwhile I made some more changes. >> >> Changes >> === >> 1) WAITLSN is now implemented as an extension called "pg_waitlsn" > > I've read the discussion so far but I didn't see the reason why you've > changed it to as a contrib module. Could you tell me about that? I > guess this feature would be more useful if provided as a core feature > and we need to discuss about syntax as Thomas mentioned. The problem with using functions like pg_waitlsn(‘LSN’ [, timeout in ms]) instead of new syntax for transaction starting commands like BEGIN TRANSACTION ... WAIT FOR ... is that it doesn't work for the higher isolation levels. In READ COMMITTED it's fine, because every statement runs with its own snapshot, so when SELECT pg_waitlsn(some_lsn) returns, the next statement will run with a snapshot that can see the effects of some_lsn being applied. But in REPEATABLE READ and SERIALIZABLE, even though pg_waitlsn(some_lsn) waits for the LSN to be applied, the next statement will run with the snapshot from before and never see the transaction you were waiting for. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Tue, Mar 7, 2017 at 8:48 PM, Ivan Kartyshovwrote: > Rebase done. Thank you for updating the patch. > > Meanwhile I made some more changes. > > Changes > === > 1) WAITLSN is now implemented as an extension called "pg_waitlsn" I've read the discussion so far but I didn't see the reason why you've changed it to as a contrib module. Could you tell me about that? I guess this feature would be more useful if provided as a core feature and we need to discuss about syntax as Thomas mentioned. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
Rebase done. Meanwhile I made some more changes. Changes === 1) WAITLSN is now implemented as an extension called "pg_waitlsn" 2) Call new hook "lsn_updated_hook" right after xact_redo_commit (xlog.c) 3) Corresponding functions: pg_waitlsn('0/693FF800', 1) - wait 10 seconds pg_waitlsn_infinite('0/693FF800') - for infinite wait pg_waitlsn_no_wait('0/693FF800') - once check if LSN was replayed or not. 4) Add two GUCs which help tuning influence on StartupXLOG: count_waitlsn (denominator to check not each LSN) int count_waitlsn= 10; interval_waitlsn (Interval in milliseconds to additional LSN check) int interval_waitlsn = 100; 5) Optimize loop that set latches. How to use it == Master: 1) Make "wal_level = replica" Slave: 2) Add shared_preload_libraries = 'pg_waitlsn' hot_standby = on (in postgresql.conf) 3) Create extension pg_waitlsn; 4) And in hot_standby you can wait for LSN (pgsleep), when LSN will replayed on slave pg_waitlsn will release select pg_waitlsn(‘LSN’ [, timeout in ms]); select pg_waitlsn_infinite(‘LSN’); select pg_waitlsn_no_wait(‘LSN’); #Wait until LSN 0/303EC60 will be replayed, or 10 second passed. select pg_waitlsn(‘0/303EC60’, 1); #Or same without timeout. select pg_waitlsn(‘0/303EC60’); select pg_waitlsn_infinite('0/693FF800'); #To check if LSN is replayed can be used. select pg_waitlsn_no_wait('0/693FF800'); Notice: select pg_waitlsn will release on PostmasterDeath or Interruption events if they come earlier then target LSN or timeout. -- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/contrib/pg_waitlsn/Makefile b/contrib/pg_waitlsn/Makefile new file mode 100644 index 000..49a326c --- /dev/null +++ b/contrib/pg_waitlsn/Makefile @@ -0,0 +1,21 @@ +# pg_waitlsn/Makefile + +MODULE_big = pg_waitlsn +OBJS = pg_waitlsn.o +EXTENSION = pg_waitlsn +DATA = pg_waitlsn--1.0.sql + + + +ifdef USE_PGXS + + PG_CONFIG = pg_config + PGXS := $( shell $( PG_CONFIG ) --pgxs ) + include $(PGXS) +else + + subdir = contrib/pg_waitlsn + top_builddir = ../.. + include $(top_builddir)/src/Makefile.global + include $(top_srcdir)/contrib/contrib-global.mk +endif diff --git a/contrib/pg_waitlsn/pg_waitlsn--1.0.sql b/contrib/pg_waitlsn/pg_waitlsn--1.0.sql new file mode 100644 index 000..8b251f3 --- /dev/null +++ b/contrib/pg_waitlsn/pg_waitlsn--1.0.sql @@ -0,0 +1,19 @@ +/* contrib/pg_waitlsn/pg_waitlsn--1.0.sql */ + +-- complain if script is sourced in psql, rather than via CREATE EXTENSION +\echo Use "CREATE EXTENSION pg_waitlsn" to wait target LSN to been replayed, delay for waiting in miliseconds (default infinity) \quit + +CREATE FUNCTION pg_waitlsn(lsn pg_lsn, delay int default 0) +RETURNS bool +AS 'MODULE_PATHNAME', 'pg_waitlsn' +LANGUAGE C IMMUTABLE STRICT ; + +CREATE FUNCTION pg_waitlsn_infinite(lsn pg_lsn) +RETURNS bool +AS 'MODULE_PATHNAME', 'pg_waitlsn_infinite' +LANGUAGE C IMMUTABLE STRICT ; + +CREATE FUNCTION pg_waitlsn_no_wait(lsn pg_lsn) +RETURNS bool +AS 'MODULE_PATHNAME', 'pg_waitlsn_no_wait' +LANGUAGE C IMMUTABLE STRICT ; diff --git a/contrib/pg_waitlsn/pg_waitlsn.c b/contrib/pg_waitlsn/pg_waitlsn.c new file mode 100644 index 000..d210678 --- /dev/null +++ b/contrib/pg_waitlsn/pg_waitlsn.c @@ -0,0 +1,299 @@ +/*- + * + * pg_waitlsn + * + * Portions Copyright (c) 2012-2017, PostgresPro Global Development Group + * + * IDENTIFICATION + * contrib/pg_waitlsn/pg_waitlsn.c + * + *- + */ + +#include "postgres.h" +#include "fmgr.h" +#include "pgstat.h" +#include "access/xlog.h" +#include "utils/pg_lsn.h" +#include "storage/latch.h" +#include "miscadmin.h" +#include "storage/spin.h" +#include "storage/backendid.h" +#include "access/xact.h" +#include "storage/shmem.h" +#include "storage/ipc.h" +#include "utils/timestamp.h" +#include "storage/pmsignal.h" +#include "storage/proc.h" +#include "access/transam.h" +#include "utils/guc.h" + +PG_MODULE_MAGIC; + +static bool pg_waitlsn_internal(XLogRecPtr lsn, uint64_t delay); + +/* Hooks values */ +static lsn_updated_hook_type prev_lsn_updated_hook = NULL; +static shmem_startup_hook_type prev_shmem_startup_hook = NULL; +static void wl_lsn_updated_hook(void); +static uint32 estimate_shmem_size(void); + +/* Latches Own-DisownLatch and AbortCаllBack */ +static void disown_latches_on_abort(XactEvent event, void *arg); +static void wl_own_latch(void); +static void wl_disown_latch(void); + +/* GUC variable */ +intcount_waitlsn = 10; +intinterval_waitlsn = 100; + +/* Globals */ +TimestampTz time_waitlsn = 0; +intcounter_waitlsn = 0; + +void _PG_init(void); + +/* Shared memory structures */ +typedef struct +{ + int pid; + volatile slock_t slock; + Latchlatch; +} BIDLatch; + +typedef struct +{ + char dummy; + int backend_maxid; + BIDLatch l_arr[FLEXIBLE_ARRAY_MEMBER]; +} GlobState;
Re: [HACKERS] make async slave to wait for lsn to be replayed
Hi Ivan, On 2/27/17 3:52 PM, Thomas Munro wrote: > On Thu, Feb 23, 2017 at 3:08 AM, Thom Brownwrote: >> On 23 January 2017 at 11:56, Ivan Kartyshov >> wrote: >>> >>> Thank you for reading, will be glad to get your feedback. >> >> Could you please rebase your patch as it no longer applies cleanly. > > +1 Please provide a rebased patch as soon as possible. Thanks, -- -David da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Thu, Feb 23, 2017 at 3:08 AM, Thom Brownwrote: > On 23 January 2017 at 11:56, Ivan Kartyshov > wrote: >> >> Thank you for reading, will be glad to get your feedback. > > Could you please rebase your patch as it no longer applies cleanly. +1 -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
On 23 January 2017 at 11:56, Ivan Kartyshovwrote: > Thank you for reviews and suggested improvements. > I rewrote patch to make it more stable. > > Changes > === > I've made a few changes: > 1) WAITLSN now doesn`t depend on snapshot > 2) Check current replayed LSN rather than in xact_redo_commit > 3) Add syntax WAITLSN_INFINITE '0/693FF800' - for infinite wait and > WAITLSN_NO_WAIT '0/693FF800' for check if LSN was replayed as you > advised. > 4) Reduce the count of loops with GUCs (WalRcvForceReply() which in 9.5 > doesn`t exist). > 5) Optimize loop that set latches. > 6) Add two GUCs that helps us to configure influence on StartupXLOG: > count_waitlsn (denominator to check not each LSN) > interval_waitlsn (Interval in milliseconds to additional LSN check) > > Feedback > > On 09/15/2016 05:41 AM, Thomas Munro wrote: >> >> You hold a spinlock in one arbitrary slot, but that >> doesn't seem sufficient: another backend may also read it, compute a >> new value and then write it, while holding a different spin lock. Or >> am I missing something? > > > We acquire an individual spinlock on each member of array, so you cannot > compute new value and write it concurrently. > > Tested > == > We have been tested it on different servers and OS`s, in different cases and > workloads. New version is nearly as fast as vanilla on primary and bring > tiny influence on standby performance. > > Hardware: > 144 Intel Cores with HT > 3TB RAM > all data on ramdisk > primary + hotstandby on the same node. > > A dataset was created with "pgbench -i -s 1000" command. For each round of > test we pause replay on standby, make 100 transaction on primary with > pgbench, start replay on standby and measure replication gap disappearing > time under different standby workload. The workload was "WAITLSN > ('Very/FarLSN', 1000ms timeout)" followed by "select abalance from > pgbench_accounts there aid = random_aid;" > For vanilla 1000ms timeout was enforced on pgbench side by -R option. > GUC waitlsn parameters was adopted for 1000ms timeout on standby with 35000 > tps rate on primary. > interval_waitlsn = 500 (ms) > count_waitlsn = 3 > > On 200 clients, slave caching up master as vanilla without significant > delay. > On 500 clients, slave caching up master 3% slower then vanilla. > On 1000 clients, 12% slower. > On 5000 clients, 3 time slower because it far above our hardware ability. > > How to use it > == > WAITLSN ‘LSN’ [, timeout in ms]; > WAITLSN_INFINITE ‘LSN’; > WAITLSN_NO_WAIT ‘LSN’; > > #Wait until LSN 0/303EC60 will be replayed, or 10 second passed. > WAITLSN ‘0/303EC60’, 1; > > #Or same without timeout. > WAITLSN ‘0/303EC60’; > orfile:///home/vis/Downloads/waitlsn_10dev_v2.patch > WAITLSN_INFINITE '0/693FF800'; > > #To check if LSN is replayed can be used. > WAITLSN_NO_WAIT '0/693FF800'; > > Notice: WAITLSN will release on PostmasterDeath or Interruption events > if they come earlier then target LSN or timeout. > > > Thank you for reading, will be glad to get your feedback. Could you please rebase your patch as it no longer applies cleanly. Thanks Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
Thank you for reviews and suggested improvements. I rewrote patch to make it more stable. Changes === I've made a few changes: 1) WAITLSN now doesn`t depend on snapshot 2) Check current replayed LSN rather than in xact_redo_commit 3) Add syntax WAITLSN_INFINITE '0/693FF800' - for infinite wait and WAITLSN_NO_WAIT '0/693FF800' for check if LSN was replayed as you advised. 4) Reduce the count of loops with GUCs (WalRcvForceReply() which in 9.5 doesn`t exist). 5) Optimize loop that set latches. 6) Add two GUCs that helps us to configure influence on StartupXLOG: count_waitlsn (denominator to check not each LSN) interval_waitlsn (Interval in milliseconds to additional LSN check) Feedback On 09/15/2016 05:41 AM, Thomas Munro wrote: You hold a spinlock in one arbitrary slot, but that doesn't seem sufficient: another backend may also read it, compute a new value and then write it, while holding a different spin lock. Or am I missing something? We acquire an individual spinlock on each member of array, so you cannot compute new value and write it concurrently. Tested == We have been tested it on different servers and OS`s, in different cases and workloads. New version is nearly as fast as vanilla on primary and bring tiny influence on standby performance. Hardware: 144 Intel Cores with HT 3TB RAM all data on ramdisk primary + hotstandby on the same node. A dataset was created with "pgbench -i -s 1000" command. For each round of test we pause replay on standby, make 100 transaction on primary with pgbench, start replay on standby and measure replication gap disappearing time under different standby workload. The workload was "WAITLSN ('Very/FarLSN', 1000ms timeout)" followed by "select abalance from pgbench_accounts there aid = random_aid;" For vanilla 1000ms timeout was enforced on pgbench side by -R option. GUC waitlsn parameters was adopted for 1000ms timeout on standby with 35000 tps rate on primary. interval_waitlsn = 500 (ms) count_waitlsn = 3 On 200 clients, slave caching up master as vanilla without significant delay. On 500 clients, slave caching up master 3% slower then vanilla. On 1000 clients, 12% slower. On 5000 clients, 3 time slower because it far above our hardware ability. How to use it == WAITLSN ‘LSN’ [, timeout in ms]; WAITLSN_INFINITE ‘LSN’; WAITLSN_NO_WAIT ‘LSN’; #Wait until LSN 0/303EC60 will be replayed, or 10 second passed. WAITLSN ‘0/303EC60’, 1; #Or same without timeout. WAITLSN ‘0/303EC60’; orfile:///home/vis/Downloads/waitlsn_10dev_v2.patch WAITLSN_INFINITE '0/693FF800'; #To check if LSN is replayed can be used. WAITLSN_NO_WAIT '0/693FF800'; Notice: WAITLSN will release on PostmasterDeath or Interruption events if they come earlier then target LSN or timeout. Thank you for reading, will be glad to get your feedback. -- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 77667bdebd..72c5390695 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -172,6 +172,7 @@ Complete list of usable sgml source files in this directory. + diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index 8acdff1393..3733ad960b 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -200,6 +200,7 @@ + diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index aa9ee5a0dd..9696b5dbb5 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -39,6 +39,7 @@ #include "catalog/pg_control.h" #include "catalog/pg_database.h" #include "commands/tablespace.h" +#include "commands/waitlsn.h" #include "miscadmin.h" #include "pgstat.h" #include "port/atomics.h" @@ -143,6 +144,9 @@ const struct config_enum_entry sync_method_options[] = { {NULL, 0, false} }; +/* GUC variable */ +intcount_waitlsn = 10; +intinterval_waitlsn = 100; /* * Although only "on", "off", and "always" are documented, @@ -6781,6 +6785,8 @@ StartupXLOG(void) { ErrorContextCallback errcallback; TimestampTz xtime; + TimestampTz time_waitlsn = GetCurrentTimestamp(); + int counter_waitlsn = 0; InRedo = true; @@ -6998,6 +7004,17 @@ StartupXLOG(void) break; } +/* + * After update lastReplayedEndRecPtr set Latches in SHMEM array + */ +if (counter_waitlsn % count_waitlsn == 0 + || TimestampDifferenceExceeds(time_waitlsn,GetCurrentTimestamp(),interval_waitlsn)) +{ + WaitLSNSetLatch(); + time_waitlsn = GetCurrentTimestamp(); +} +counter_waitlsn++; + /* Else, try to fetch the next WAL record */ record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false); } while (record != NULL); diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile index 6b3742c0a0..091cbe22a0 100644
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Thu, Sep 15, 2016 at 2:41 PM, Thomas Munrowrote: > On Thu, Sep 1, 2016 at 2:16 AM, Ivan Kartyshov > wrote: >> Hi hackers, >> >> Few days earlier I've finished my work on WAITLSN statement utility, so I’d >> like to share it. >> [...] >> Your feedback is welcome! >> >> [waitlsn_10dev.patch] > > Thanks for working on this. Here are some general thoughts on the > feature, and an initial review. Hi Ivan I'm marking the patch Returned with Feedback, since there hasn't been any response or a new patch. I encourage you to keep working on this feature, and I'll be happy to review future patches. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Thu, Sep 1, 2016 at 2:16 AM, Ivan Kartyshovwrote: > Hi hackers, > > Few days earlier I've finished my work on WAITLSN statement utility, so I’d > like to share it. > [...] > Your feedback is welcome! > > [waitlsn_10dev.patch] Hi Ivan, Thanks for working on this. Here are some general thoughts on the feature, and an initial review. +1 for this feature. Explicitly waiting for a given commit to be applied is one of several approaches to achieve "causal consistency" for reads on replica nodes, and I think it will be very useful if combined with a convenient way to get the values to wait for when you run COMMIT. This could be used either by applications directly, or by middleware that somehow keeps track of dependencies between transactions and inserts waits. I liked the way Heikki Linnakangas imagined this feature[1]: BEGIN WAIT FOR COMMIT 1234 TO BE VISIBLE; ... or perhaps it could be spelled like this: BEGIN [isolation stuff] WAIT FOR COMMIT TOKEN TIMEOUT ; That allows waiting only at the start of a transaction, whereas your idea of making a utility command would allow a single READ COMMITTED transaction to wait multiple times for transactions it has heard about through side channels, which may be useful. Perhaps we could support the same syntax in a stand alone statement inside a transaction OR as part of a BEGIN ... statement. Being able to do it as part of BEGIN means that you can use this feature for single-snapshot transactions, ie REPEATABLE READ and SERIALIZABLE (of course you can't use SERIALIZABLE on hot standbys yet but that'll be fixed one day). Otherwise you'd be waiting for the LSN in the middle of your transaction but not be able to see the result because you don't take a new snapshot. Or maybe it's enough to use a standalone WAIT ... statement inside a REPEATABLE READ or SERIALIZABLE transaction as long as it's the first statement, and should be an error to do so any time later? I think working in terms of LSNs or XIDs explicitly is not a good idea: encouraging clients to think in terms of anything other than opaque 'commit tokens' seems like a bad idea because it limits future changes. For example, there is on-going discussion about introducing CSNs (commit sequence numbers), and there are some related concepts lurking in the SSI code; maybe we'd want to use those one day. Do you think it would make sense to have a concept of a commit token that is a non-analysable string as far as clients are concerned, so that clients are not encouraged to do anything at all with them except use them in a WAIT FOR COMMIT TOKEN statement? INITIAL FEEDBACK ON THE PATCH I didn't get as far as testing or detailed review because it has some obvious bugs and compiler warnings which I figured we should talk about first, and I also have some higher level questions about the design. gram.y:12882:15: error: assignment makes pointer from integer without a cast [-Werror=int-conversion] n->delay = $3; It looks like struct WaitLSNStmt accidentally has 'delay' as a pointer to int. Perhaps you want an int? Maybe it would be useful to include the units (millisecond, ms) in the name? waitlsn.c: In function 'WLDisownLatch': waitlsn.c:82:2: error: suggest parentheses around assignment used as truth value [-Werror=parentheses] if (MyBackendId = state->backend_maxid) ^~ Pretty sure you want == here. waitlsn.c: In function 'WaitLSNUtility': waitlsn.c:153:17: error: initialization makes integer from pointer without a cast [-Werror=int-conversion] int tdelay = delay; ^ Another place where I think you wanted an int but used a pointer to int? To fix that warning you need tdelay = *delay, but I think delay should really not be taken by pointer at all. @@ -6922,6 +6923,11 @@ StartupXLOG(void) + /* + * After update lastReplayedEndRecPtr set Latches in SHMEM array + */ + WaitLSNSetLatch(); + I think you should try to do this only after commit records are replayed, not after every record. Only commit records can make transactions visible, and the raison d'être for this feature is to let users wait for transactions they know about to become visible. You probably can't do it directly in xact_redo_commit though, because at that point XLogCtl->lastReplayedEndRecPtr hasn't been updated yet so a backend that wakes up might not see that it has advanced and go back to sleep. It is updated in the StartupXLOG loop after the redo function runs. That is the reason why WalRcvForceReply() is called from there rather than in xact_redo_commit, to implement remote_apply for replication. Perhaps you need something similar? + tdelay -= (GetCurrentTimestamp() - timer); You can't do arithmetic with TimestampTz like this. Depending on configure option --disable-integer-datetimes (which controls macro HAVE_INT64_TIMESTAMP), it may be a floating point number of seconds since the epoch, or an integer number of microseconds since the epoch. It looks
Re: [HACKERS] make async slave to wait for lsn to be replayed
On 08/31/2016 05:54 PM, Craig Ringer wrote: How do you get the commit LSN to watch for? Grab pg_current_xlog_insert_location() just after the commit and figure that replaying to that point guarantees you get the commit? That's the point, it was created in order to provide the cosistent view of data between master and replica. You almost guessed, I used GetXLogReplayRecPtr() right after LSN was physically replayed on downstream. Some time ago[1] I raised the idea of reporting commit LSN on the wire to clients. That didn't go anywhere due to compatibility and security concerns. I think those were resolvable, but it wasn't enough of a priority to push hard on at the time. A truly "right" solution has to wait for a protocol bump, but I think good-enough solutions are possible now. So you might want to read that thread. Thank you for pointing to your thread, it was very informative! It seems that you have solved the very similar problem. It also mentions hesitations about exposing LSN to clients even more. I think we're *way* past that now - we have replication origins and replication slots relying on it, it's exposed in a pg_lsn datatype, a bunch of views expose it, etc. But it might be reasonable to ask "should the client instead be expected to wait for the confirmed commit of a 64-bit epoch-extended xid, like that returned by txid_current()?" . One advantage of using xid is that you can get it while you're still in the xact, so there's no race between commit and checking the lsn after commit. That sounds reasonable, but I dont think it will give us any considerable benefits. But I`ll work out this variant. Are you specifically trying to ensure "this commit has replayed on the replica before we run queries on it" ? Or something else? Yes you are right, I want to ensure data consistency on downstream before running queries on it. Our clients would use it as a part of background worker and maybe directly in apps too. --- Ivan Kartyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] make async slave to wait for lsn to be replayed
On 31 August 2016 at 22:16, Ivan Kartyshovwrote: > Our clients who deal with 9.5 and use asynchronous master-slave replication, > asked to make the wait-mechanism on the slave side to prevent the situation > when slave handles query which needs data (LSN) that was received, flushed, > but still not replayed. I like the broad idea - I've wanted something like it for a while. BDR has pg_xlog_wait_remote_receive() and pg_xlog_wait_remote_apply() for use in tests for this reason, but they act on the *upstream* side, waiting until the downstream has acked the data. Not as useful for ensuring that apps connected to both master and one or more replicas get a consistent view of data. How do you get the commit LSN to watch for? Grab pg_current_xlog_insert_location() just after the commit and figure that replaying to that point guarantees you get the commit? Some time ago[1] I raised the idea of reporting commit LSN on the wire to clients. That didn't go anywhere due to compatibility and security concerns. I think those were resolvable, but it wasn't enough of a priority to push hard on at the time. A truly "right" solution has to wait for a protocol bump, but I think good-enough solutions are possible now. So you might want to read that thread. It also mentions hesitations about exposing LSN to clients even more. I think we're *way* past that now - we have replication origins and replication slots relying on it, it's exposed in a pg_lsn datatype, a bunch of views expose it, etc. But it might be reasonable to ask "should the client instead be expected to wait for the confirmed commit of a 64-bit epoch-extended xid, like that returned by txid_current()?" . One advantage of using xid is that you can get it while you're still in the xact, so there's no race between commit and checking the lsn after commit. Are you specifically trying to ensure "this commit has replayed on the replica before we run queries on it" ? Or something else? (Also, on a side note, Kevin mentioned that it may be possible to use SSI data to achieve SERIALIZABLE read-only queries on replicas, where they get the same protection from commit-order related anomalies as queries on the master. You might want to look more deeply into that too at some stage, if you're trying to ensure the app can do read only queries on the master and expect fully consistent results). [1] https://www.postgresql.org/message-id/flat/53E41EC1.5050603%402ndquadrant.com#53e41ec1.5050...@2ndquadrant.com -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers