Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-10-31 Thread Ants Aasma
On Mon, Oct 30, 2017 at 7:25 PM, Ivan Kartyshov
 wrote:
> It sounds reasonable. I can offer the following version.
>
> WAIT  LSN  lsn_number;
> WAIT  LSN  lsn_number  TIMEOUT delay;
> WAIT  LSN  lsn_number  INFINITE;
> WAIT  LSN  lsn_number  NOWAIT;
>
>
> WAIT  [token]  wait_value [option];
>
> token - [LSN, TIME | TIMESTAMP | CSN | XID]
> option - [TIMEOUT | INFINITE | NOWAIT]
>
> Ready to listen to your suggestions.

Making the interface more specific about the mechanism is not what I
had in mind, quite the opposite. I would like to see the interface
describe the desired effect of the wait.

Stepping back for a while, from what I understand the reason we want
to waiting is to prevent observation of database state going
backwards. To limit the amount of waiting needed we tell the database
what we have observed. For example "I just observed my transaction
commit", or "the last time I observed state was X", and then have the
database provide us with a snapshot that is causally dependent on
those states. This does not give us linearizability, for that we still
need at the very least serializable transactions on standby. But it
seems to provide a form of sequential consistency, which (if it can be
proved to hold) makes reasoning about concurrency a lot nicer.

For lack of a better proposal I would like something along the lines of:

WAIT FOR state_id[, state_id] [ OPTIONS (..)]

And to get the tokens maybe a function pg_last_commit_state().

Optionally, to provide read-to-read causality, pg_snapshot_state()
could return for example replay_lsn at the start of the current
transaction. This makes sure that things don't just appear and
disappear when load balancing across many standby servers.

WAIT FOR semantics is to ensure that next snapshot is causally
dependent (happens after) each of the specified observed states.

The state_id could simply be a LSN, or to allow for future expansion
something like 'lsn:/1234'. Example extension would be to allow
for waiting on xids. On master that would be just a ShareLock on the
transactionid. On standby it would wait for the commit or rollback
record for that transaction to be replayed.

Robert made a good point that people will still rely on the token
being an LSN, but perhaps they will be slightly less angry when we
explicitly tell them that this might change in the future.

Regards,
Ants Aasma

[1] 
https://www.postgresql.org/docs/devel/static/functions-admin.html#functions-snapshot-synchronization


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-10-30 Thread Ivan Kartyshov

Ants Aasma писал 2017-10-26 17:29:

On Mon, Oct 23, 2017 at 12:29 PM, Ivan Kartyshov
 wrote:

Ants Aasma писал 2017-09-26 13:00:


Exposing this interface as WAITLSN will encode that visibility order
matches LSN order. This removes any chance of fixing for example
visibility order of async/vs/sync transactions. Just renaming it so
the token is an opaque commit visibility token that just happens to 
be

a LSN would still allow for progress in transaction management. For
example, making PostgreSQL distributed will likely want timestamp
and/or vector clock based visibility rules.



I'm sorry I did not understand exactly what you meant.
Please explain this in more detail.


Currently transactions on the master become visible when xid is
removed from proc array. This depends on the order of acquiring
ProcArrayLock, which can happen in a different order from inserting
the commit record to wal. Whereas on the standby the transactions will
become visible in the same order that commit records appear in wal.
The difference can be quite large when transactions are using
different values for synchronous_commit. Fixing this is not easy, but
we may want to do it someday. IIRC CSN threads contained more
discussion on this topic. If we do fix it, it seems likely that what
we need to wait on is not LSN, but some other kind of value. If the UI
were something like "WAITVISIBILITY token", then we can later change
the token to something other than LSN.

Regards,
Ants Aasma


It sounds reasonable. I can offer the following version.

WAIT  LSN  lsn_number;
WAIT  LSN  lsn_number  TIMEOUT delay;
WAIT  LSN  lsn_number  INFINITE;
WAIT  LSN  lsn_number  NOWAIT;


WAIT  [token]  wait_value [option];

token - [LSN, TIME | TIMESTAMP | CSN | XID]
option - [TIMEOUT | INFINITE | NOWAIT]

Ready to listen to your suggestions.

--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-10-26 Thread Robert Haas
On Thu, Oct 26, 2017 at 4:29 PM, Ants Aasma  wrote:
> If the UI
> were something like "WAITVISIBILITY token", then we can later change
> the token to something other than LSN.

That assumes, probably optimistically, that nobody will develop a
dependency on it being, precisely, an LSN.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-10-26 Thread Ants Aasma
On Mon, Oct 23, 2017 at 12:29 PM, Ivan Kartyshov
 wrote:
> Ants Aasma писал 2017-09-26 13:00:
>>
>> Exposing this interface as WAITLSN will encode that visibility order
>> matches LSN order. This removes any chance of fixing for example
>> visibility order of async/vs/sync transactions. Just renaming it so
>> the token is an opaque commit visibility token that just happens to be
>> a LSN would still allow for progress in transaction management. For
>> example, making PostgreSQL distributed will likely want timestamp
>> and/or vector clock based visibility rules.
>
>
> I'm sorry I did not understand exactly what you meant.
> Please explain this in more detail.

Currently transactions on the master become visible when xid is
removed from proc array. This depends on the order of acquiring
ProcArrayLock, which can happen in a different order from inserting
the commit record to wal. Whereas on the standby the transactions will
become visible in the same order that commit records appear in wal.
The difference can be quite large when transactions are using
different values for synchronous_commit. Fixing this is not easy, but
we may want to do it someday. IIRC CSN threads contained more
discussion on this topic. If we do fix it, it seems likely that what
we need to wait on is not LSN, but some other kind of value. If the UI
were something like "WAITVISIBILITY token", then we can later change
the token to something other than LSN.

Regards,
Ants Aasma


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-10-26 Thread Robert Haas
On Tue, Sep 26, 2017 at 12:00 PM, Ants Aasma  wrote:
> Exposing this interface as WAITLSN will encode that visibility order
> matches LSN order.

That would be a bad thing to encode because it doesn't.

Well... actually on the standby it does, and that's the only thing
that matters in this case I guess.  But I agree with you that's it's
not a wonderful thing to bake into the UI, because we might want to
change it some day.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-10-23 Thread Ivan Kartyshov

Alexander Korotkov писал 2017-10-23 13:19:

Despite code cleanup, you still have some random empty lines removals
in your patch.


I reconfigured my IDE to avoid this in the future.

--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Companydiff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index 01acc2e..6792eb0 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -181,6 +181,7 @@ Complete list of usable sgml source files in this directory.
 
 
 
+
 
 
 
diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml
new file mode 100644
index 000..6f389ca
--- /dev/null
+++ b/doc/src/sgml/ref/waitlsn.sgml
@@ -0,0 +1,144 @@
+
+
+
+ 
+  WAITLSN
+ 
+
+ 
+  WAITLSN
+  7
+  SQL - Language Statements
+ 
+
+ 
+  WAITLSN
+  wait for the target LSN to be replayed
+ 
+
+ 
+
+WAITLSN 'LSN' [ INFINITELY ]
+WAITLSN 'LSN' TIMEOUT wait_time
+WAITLSN 'LSN' NOWAIT
+
+ 
+
+ 
+  Description
+
+  
+   interprocess communication mechanism to wait for the target log sequence
+   number (LSN) on standby in 
+   databases with master-standby asynchronous replication. When run with the
+   LSN option, the WAITLSN command
+   waits for the specified LSN to be replayed. By default, wait
+   time is unlimited. Waiting can be interrupted using Ctrl+C, or
+   by shutting down the postgres server. You can also limit the wait
+   time using the TIMEOUT option, or check the target LSN
+   status immediately using the NOWAIT option.
+  
+
+ 
+
+ 
+  Parameters
+
+  
+   
+LSN
+
+ 
+	  Specify the target log sequence number to wait for.
+ 
+
+   
+   
+	INFINITELY
+
+ 
+	  Wait until the target LSN is replayed on standby.
+	  This is an optional parameter reinforcing the default behavior.
+ 
+
+   
+  
+ 
+
+   
+	TIMEOUT wait_time
+	
+	 
+	  Limit the time to wait for the LSN to be replayed.
+	  The specified wait_time must be an integer
+	  and is measured in milliseconds.
+	 
+	
+   
+
+   
+	NOWAIT
+	
+	 
+	  Report whether the target LSN has been replayed already,
+	  without any waiting.
+	 
+	
+   
+  
+
+ 
+  Examples
+
+  
+   Run WAITLSN from psql,
+   limiting wait time to 1 milliseconds:
+
+
+WAITLSN '0/3F07A6B1' TIMEOUT 1;
+NOTICE:  LSN is not reached. Try to increase wait time.
+LSN reached
+-
+ f
+(1 row)
+
+  
+
+  
+   Wait until the specified LSN is replayed:
+
+WAITLSN '0/3F07A611';
+LSN reached
+-
+ t
+(1 row)
+
+  
+
+  
+   Limit LSN wait time to 50 milliseconds, and then cancel the command:
+
+WAITLSN '0/3F0FF791' TIMEOUT 50;
+^CCancel request sent
+NOTICE:  LSN is not reached. Try to increase wait time.
+ERROR:  canceling statement due to user request
+ LSN reached
+-
+ f
+(1 row)
+
+
+ 
+
+ 
+  Compatibility
+
+  
+   There is no WAITLSN statement in the SQL
+   standard.
+  
+ 
+
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 9000b3a..0c5951a 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -209,6 +209,7 @@



+   
 
  
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dd028a1..117cc9b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -40,6 +40,7 @@
 #include "catalog/pg_control.h"
 #include "catalog/pg_database.h"
 #include "commands/tablespace.h"
+#include "commands/waitlsn.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "port/atomics.h"
@@ -7312,6 +7312,15 @@ StartupXLOG(void)
 	break;
 }
 
+/*
+ * After update lastReplayedEndRecPtr set Latches in SHMEM array
+ */
+if (XLogCtl->lastReplayedEndRecPtr >= GetMinWaitLSN())
+{
+
+	WaitLSNSetLatch(XLogCtl->lastReplayedEndRecPtr);
+}
+
 /* Else, try to fetch the next WAL record */
 record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false);
 			} while (record != NULL);
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index 4a6c99e..0d10117 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -20,6 +20,6 @@ OBJS = amcmds.o aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \
 	policy.o portalcmds.o prepare.o proclang.o publicationcmds.o \
 	schemacmds.o seclabel.o sequence.o statscmds.o subscriptioncmds.o \
 	tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o user.o \
-	vacuum.o vacuumlazy.o variable.o view.o
+	vacuum.o vacuumlazy.o variable.o view.o waitlsn.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/waitlsn.c b/src/backend/commands/waitlsn.c
new file mode 100644
index 000..db2f549
--- /dev/null
+++ b/src/backend/commands/waitlsn.c
@@ -0,0 +1,273 @@
+/*-
+ *
+ * waitlsn.c
+ *	  WaitLSN statment: WAITLSN
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development 

Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-10-23 Thread Alexander Korotkov
On Mon, Oct 23, 2017 at 12:42 PM, Ivan Kartyshov  wrote:

> New little cleanup code changes
>

Despite code cleanup, you still have some random empty lines removals in
your patch.

@@ -149,7 +150,6 @@ const struct config_enum_entry sync_method_options[] = {
>   {NULL, 0, false}
>  };
>
> -
>  /*
>   * Although only "on", "off", and "always" are documented,
>   * we accept all the likely variants of "on" and "off".



@@ -141,7 +141,6 @@
>  #include "utils/timestamp.h"
>  #include "utils/tqual.h"
>
> -
>  /*
>   * Maximum size of a NOTIFY payload, including terminating NULL.  This
>   * must be kept small enough so that a notification message fits on one


--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-10-23 Thread Ivan Kartyshov

New little cleanup code changes

--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Companycommit 217f842726531edb1b0056a5c5727ab01bab7f9b
Author: i.kartyshov 
Date:   Mon Oct 23 12:08:59 2017 +0300

Cherry picked and ported 11dev

diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index 01acc2e..6792eb0 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -181,6 +181,7 @@ Complete list of usable sgml source files in this directory.
 
 
 
+
 
 
 
diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml
new file mode 100644
index 000..6f389ca
--- /dev/null
+++ b/doc/src/sgml/ref/waitlsn.sgml
@@ -0,0 +1,144 @@
+
+
+
+ 
+  WAITLSN
+ 
+
+ 
+  WAITLSN
+  7
+  SQL - Language Statements
+ 
+
+ 
+  WAITLSN
+  wait for the target LSN to be replayed
+ 
+
+ 
+
+WAITLSN 'LSN' [ INFINITELY ]
+WAITLSN 'LSN' TIMEOUT wait_time
+WAITLSN 'LSN' NOWAIT
+
+ 
+
+ 
+  Description
+
+  
+   interprocess communication mechanism to wait for the target log sequence
+   number (LSN) on standby in 
+   databases with master-standby asynchronous replication. When run with the
+   LSN option, the WAITLSN command
+   waits for the specified LSN to be replayed. By default, wait
+   time is unlimited. Waiting can be interrupted using Ctrl+C, or
+   by shutting down the postgres server. You can also limit the wait
+   time using the TIMEOUT option, or check the target LSN
+   status immediately using the NOWAIT option.
+  
+
+ 
+
+ 
+  Parameters
+
+  
+   
+LSN
+
+ 
+	  Specify the target log sequence number to wait for.
+ 
+
+   
+   
+	INFINITELY
+
+ 
+	  Wait until the target LSN is replayed on standby.
+	  This is an optional parameter reinforcing the default behavior.
+ 
+
+   
+  
+ 
+
+   
+	TIMEOUT wait_time
+	
+	 
+	  Limit the time to wait for the LSN to be replayed.
+	  The specified wait_time must be an integer
+	  and is measured in milliseconds.
+	 
+	
+   
+
+   
+	NOWAIT
+	
+	 
+	  Report whether the target LSN has been replayed already,
+	  without any waiting.
+	 
+	
+   
+  
+
+ 
+  Examples
+
+  
+   Run WAITLSN from psql,
+   limiting wait time to 1 milliseconds:
+
+
+WAITLSN '0/3F07A6B1' TIMEOUT 1;
+NOTICE:  LSN is not reached. Try to increase wait time.
+LSN reached
+-
+ f
+(1 row)
+
+  
+
+  
+   Wait until the specified LSN is replayed:
+
+WAITLSN '0/3F07A611';
+LSN reached
+-
+ t
+(1 row)
+
+  
+
+  
+   Limit LSN wait time to 50 milliseconds, and then cancel the command:
+
+WAITLSN '0/3F0FF791' TIMEOUT 50;
+^CCancel request sent
+NOTICE:  LSN is not reached. Try to increase wait time.
+ERROR:  canceling statement due to user request
+ LSN reached
+-
+ f
+(1 row)
+
+
+ 
+
+ 
+  Compatibility
+
+  
+   There is no WAITLSN statement in the SQL
+   standard.
+  
+ 
+
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 9000b3a..0c5951a 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -209,6 +209,7 @@



+   
 
  
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dd028a1..117cc9b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -40,6 +40,7 @@
 #include "catalog/pg_control.h"
 #include "catalog/pg_database.h"
 #include "commands/tablespace.h"
+#include "commands/waitlsn.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "port/atomics.h"
@@ -149,7 +150,6 @@ const struct config_enum_entry sync_method_options[] = {
 	{NULL, 0, false}
 };
 
-
 /*
  * Although only "on", "off", and "always" are documented,
  * we accept all the likely variants of "on" and "off".
@@ -7312,6 +7312,15 @@ StartupXLOG(void)
 	break;
 }
 
+/*
+ * After update lastReplayedEndRecPtr set Latches in SHMEM array
+ */
+if (XLogCtl->lastReplayedEndRecPtr >= GetMinWaitLSN())
+{
+
+	WaitLSNSetLatch(XLogCtl->lastReplayedEndRecPtr);
+}
+
 /* Else, try to fetch the next WAL record */
 record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false);
 			} while (record != NULL);
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index 4a6c99e..0d10117 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -20,6 +20,6 @@ OBJS = amcmds.o aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \
 	policy.o portalcmds.o prepare.o proclang.o publicationcmds.o \
 	schemacmds.o seclabel.o sequence.o statscmds.o subscriptioncmds.o \
 	tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o user.o \
-	vacuum.o vacuumlazy.o variable.o view.o
+	vacuum.o vacuumlazy.o variable.o view.o waitlsn.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index f7de742..cdeddfc 100644
--- a/src/backend/commands/async.c
+++ 

Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-10-23 Thread Ivan Kartyshov

Alexander Korotkov писал 2017-09-26 12:07:

I propose following syntax options.

WAITLSN lsn;
WAITLSN lsn TIMEOUT delay;
WAITLSN lsn INFINITE;
WAITLSN lsn NOWAIT;

For me that looks rather better.  What do you think?


I agree with you, now syntax looks better.
New patch attached to tha mail.

Ants Aasma писал 2017-09-26 13:00:

Exposing this interface as WAITLSN will encode that visibility order
matches LSN order. This removes any chance of fixing for example
visibility order of async/vs/sync transactions. Just renaming it so
the token is an opaque commit visibility token that just happens to be
a LSN would still allow for progress in transaction management. For
example, making PostgreSQL distributed will likely want timestamp
and/or vector clock based visibility rules.


I'm sorry I did not understand exactly what you meant.
Please explain this in more detail.

--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Companycommit 217f842726531edb1b0056a5c5727ab01bab7f9b
Author: i.kartyshov 
Date:   Mon Oct 23 12:08:59 2017 +0300

Cherry picked and ported 11dev

diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index 01acc2e..6792eb0 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -181,6 +181,7 @@ Complete list of usable sgml source files in this directory.
 
 
 
+
 
 
 
diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml
new file mode 100644
index 000..6f389ca
--- /dev/null
+++ b/doc/src/sgml/ref/waitlsn.sgml
@@ -0,0 +1,144 @@
+
+
+
+ 
+  WAITLSN
+ 
+
+ 
+  WAITLSN
+  7
+  SQL - Language Statements
+ 
+
+ 
+  WAITLSN
+  wait for the target LSN to be replayed
+ 
+
+ 
+
+WAITLSN 'LSN' [ INFINITELY ]
+WAITLSN 'LSN' TIMEOUT wait_time
+WAITLSN 'LSN' NOWAIT
+
+ 
+
+ 
+  Description
+
+  
+   interprocess communication mechanism to wait for the target log sequence
+   number (LSN) on standby in 
+   databases with master-standby asynchronous replication. When run with the
+   LSN option, the WAITLSN command
+   waits for the specified LSN to be replayed. By default, wait
+   time is unlimited. Waiting can be interrupted using Ctrl+C, or
+   by shutting down the postgres server. You can also limit the wait
+   time using the TIMEOUT option, or check the target LSN
+   status immediately using the NOWAIT option.
+  
+
+ 
+
+ 
+  Parameters
+
+  
+   
+LSN
+
+ 
+	  Specify the target log sequence number to wait for.
+ 
+
+   
+   
+	INFINITELY
+
+ 
+	  Wait until the target LSN is replayed on standby.
+	  This is an optional parameter reinforcing the default behavior.
+ 
+
+   
+  
+ 
+
+   
+	TIMEOUT wait_time
+	
+	 
+	  Limit the time to wait for the LSN to be replayed.
+	  The specified wait_time must be an integer
+	  and is measured in milliseconds.
+	 
+	
+   
+
+   
+	NOWAIT
+	
+	 
+	  Report whether the target LSN has been replayed already,
+	  without any waiting.
+	 
+	
+   
+  
+
+ 
+  Examples
+
+  
+   Run WAITLSN from psql,
+   limiting wait time to 1 milliseconds:
+
+
+WAITLSN '0/3F07A6B1' TIMEOUT 1;
+NOTICE:  LSN is not reached. Try to increase wait time.
+LSN reached
+-
+ f
+(1 row)
+
+  
+
+  
+   Wait until the specified LSN is replayed:
+
+WAITLSN '0/3F07A611';
+LSN reached
+-
+ t
+(1 row)
+
+  
+
+  
+   Limit LSN wait time to 50 milliseconds, and then cancel the command:
+
+WAITLSN '0/3F0FF791' TIMEOUT 50;
+^CCancel request sent
+NOTICE:  LSN is not reached. Try to increase wait time.
+ERROR:  canceling statement due to user request
+ LSN reached
+-
+ f
+(1 row)
+
+
+ 
+
+ 
+  Compatibility
+
+  
+   There is no WAITLSN statement in the SQL
+   standard.
+  
+ 
+
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 9000b3a..0c5951a 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -209,6 +209,7 @@



+   
 
  
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dd028a1..117cc9b 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -40,6 +40,7 @@
 #include "catalog/pg_control.h"
 #include "catalog/pg_database.h"
 #include "commands/tablespace.h"
+#include "commands/waitlsn.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "port/atomics.h"
@@ -149,7 +150,6 @@ const struct config_enum_entry sync_method_options[] = {
 	{NULL, 0, false}
 };
 
-
 /*
  * Although only "on", "off", and "always" are documented,
  * we accept all the likely variants of "on" and "off".
@@ -7312,6 +7312,15 @@ StartupXLOG(void)
 	break;
 }
 
+/*
+ * After update lastReplayedEndRecPtr set Latches in SHMEM array
+ */
+if (XLogCtl->lastReplayedEndRecPtr >= GetMinWaitLSN())
+{
+
+	WaitLSNSetLatch(XLogCtl->lastReplayedEndRecPtr);
+}
+
 /* Else, try to fetch the next WAL record */
 record = 

Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-09-26 Thread Ants Aasma
On Tue, Aug 15, 2017 at 5:00 AM, Craig Ringer  wrote:
> On 22 March 2017 at 01:17, Robert Haas  wrote:
>>
>> On Sun, Mar 12, 2017 at 10:20 PM, Thomas Munro
>>  wrote:
>> > Maybe someone can think of a clever way for an extension to insert a
>> > wait for a user-supplied LSN *before* acquiring a snapshot so it can
>> > work for the higher levels, or maybe the hooks should go into core
>> > PostgreSQL so that the extension can exist as an external project not
>> > requiring a patched PostgreSQL installation, or maybe this should be
>> > done with new core syntax that extends transaction commands.  Do other
>> > people have views on this?
>>
>> IMHO, trying to do this using a function-based interface is a really
>> bad idea for exactly the reasons you mention.  I don't see why we'd
>> resist the idea of core syntax here; transactions are a core part of
>> PostgreSQL.
>>
>> There is, of course, the question of whether making LSNs such a
>> user-visible thing is a good idea in the first place, but that's a
>> separate question from issue of what syntax for such a thing is best.
>
>
> (I know this is old, but):
>
> That ship sailed a long time ago unfortunately, they're all over
> pg_stat_replication and pg_replication_slots and so on. They're already
> routinely used for monitoring replication lag in bytes, waiting for a peer
> to catch up, etc.

(continuing the trend of resurrecting old topics)

Exposing this interface as WAITLSN will encode that visibility order
matches LSN order. This removes any chance of fixing for example
visibility order of async/vs/sync transactions. Just renaming it so
the token is an opaque commit visibility token that just happens to be
a LSN would still allow for progress in transaction management. For
example, making PostgreSQL distributed will likely want timestamp
and/or vector clock based visibility rules.

Regards,
Ants Aasma


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-09-26 Thread Alexander Korotkov
On Mon, Jan 23, 2017 at 2:56 PM, Ivan Kartyshov 
wrote:

> How to use it
> ==
> WAITLSN ‘LSN’ [, timeout in ms];
> WAITLSN_INFINITE ‘LSN’;
> WAITLSN_NO_WAIT ‘LSN’;


Adding suffix to the command name looks weird.  We don't do so for any
other command.
I propose following syntax options.

WAITLSN lsn;
WAITLSN lsn TIMEOUT delay;
WAITLSN lsn INFINITE;
WAITLSN lsn NOWAIT;

For me that looks rather better.  What do you think?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-09-04 Thread i . kartyshov

I forget to include patch in last letter.

Craig Ringer wrote 2017-08-15 05:00:

That ship sailed a long time ago unfortunately, they're all over
pg_stat_replication and pg_replication_slots and so on. They're
already routinely used for monitoring replication lag in bytes,
waiting for a peer to catch up, etc.

--

 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Function pg_replication_slots is only master, and waitlsn is async hot 
standby replication function. It allows us to wait untill insert made on 
master is be replayed on replica.


--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index 01acc2e..6792eb0 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -181,6 +181,7 @@ Complete list of usable sgml source files in this directory.
 
 
 
+
 
 
 
diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml
new file mode 100644
index 000..077e869
--- /dev/null
+++ b/doc/src/sgml/ref/waitlsn.sgml
@@ -0,0 +1,119 @@
+
+
+
+ 
+  WAITLSN
+ 
+
+ 
+  WAITLSN
+  7
+  SQL - Language Statements
+ 
+
+ 
+  WAITLSN
+  wait until target LSN has been replayed
+ 
+
+ 
+
+WAITLSN 'LSN' [ , delay ]
+WAITLSN_INFINITE 'LSN'
+WAITLSN_NO_WAIT 'LSN'
+
+ 
+
+ 
+  Description
+
+  
+   The WAITLSN wait till target LSN will
+   be replayed with an optional delay (milliseconds by default
+   infinity) to be wait for LSN to replayed.
+  
+
+  
+   WAITLSN provides a simple
+   interprocess LSN wait mechanism for a backends on slave
+   in master-slave replication scheme on PostgreSQL database.
+  
+
+  
+   The WAITLSN_INFINITE wait till target LSN will
+   be replayed on slave .
+  
+
+  
+   The WAITLSN_NO_WAIT will tell if target LSN was replayed without any wait.
+  
+ 
+
+ 
+  Parameters
+
+  
+   
+LSN
+
+ 
+  Target log sequence number to be wait for.
+ 
+
+   
+   
+delay
+
+ 
+  Time in miliseconds to waiting for LSN to be replayed.
+ 
+
+   
+  
+ 
+
+ 
+  Notes
+
+  
+   Delay time for waiting till LSN to be replayed must be integer. For
+   default it is infinity. Waiting can be interupped using Ctl+C, or
+   by Postmaster death.
+  
+ 
+
+ 
+  Examples
+
+  
+   Configure and execute a waitlsn from
+   psql:
+
+
+WAITLSN '0/3F07A6B1', 1;
+NOTICE:  LSN is not reached. Try to make bigger delay.
+WAITLSN
+
+WAITLSN '0/3F07A611';
+WAITLSN
+
+WAITLSN '0/3F0FF791', 50;
+^CCancel request sent
+NOTICE:  LSN is not reached. Try to make bigger delay.
+ERROR:  canceling statement due to user request
+
+
+ 
+
+ 
+  Compatibility
+
+  
+   There is no WAITLSN statement in the SQL
+   standard.
+  
+ 
+
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 9000b3a..0c5951a 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -209,6 +209,7 @@



+   
 
  
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index df4843f..9dfb59d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -40,6 +40,7 @@
 #include "catalog/pg_control.h"
 #include "catalog/pg_database.h"
 #include "commands/tablespace.h"
+#include "commands/waitlsn.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "port/atomics.h"
@@ -147,7 +148,6 @@ const struct config_enum_entry sync_method_options[] = {
 	{NULL, 0, false}
 };
 
-
 /*
  * Although only "on", "off", and "always" are documented,
  * we accept all the likely variants of "on" and "off".
@@ -7237,6 +7237,15 @@ StartupXLOG(void)
 	break;
 }
 
+/*
+ * After update lastReplayedEndRecPtr set Latches in SHMEM array
+ */
+if (XLogCtl->lastReplayedEndRecPtr >= GetMinWaitLSN())
+{
+
+	WaitLSNSetLatch(XLogCtl->lastReplayedEndRecPtr);
+}
+
 /* Else, try to fetch the next WAL record */
 record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false);
 			} while (record != NULL);
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index 4a6c99e..0d10117 100644
--- a/src/backend/commands/Makefile
+++ b/src/backend/commands/Makefile
@@ -20,6 +20,6 @@ OBJS = amcmds.o aggregatecmds.o alter.o analyze.o async.o cluster.o comment.o \
 	policy.o portalcmds.o prepare.o proclang.o publicationcmds.o \
 	schemacmds.o seclabel.o sequence.o statscmds.o subscriptioncmds.o \
 	tablecmds.o tablespace.o trigger.o tsearchcmds.o typecmds.o user.o \
-	vacuum.o vacuumlazy.o variable.o view.o
+	vacuum.o vacuumlazy.o variable.o view.o waitlsn.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index bacc08e..7d6011f 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -139,7 +139,6 @@
 #include "utils/ps_status.h"
 #include "utils/timestamp.h"
 
-
 /*
  

Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-09-04 Thread i . kartyshov

Hello, thank you for your comments over main idea and code.

On 13.03.2017 05:20, Thomas Munro wrote:
1)

First, I'll restate my view of the syntax-vs-function question:  I
think an fmgr function is the wrong place to do this, because it
doesn't work for our 2 higher isolation levels as mentioned.  Most
people probably use READ COMMITTED most of the time so the extension
version you've proposed is certainly useful for many people and I like
it, but I will vote against inclusion in core of any feature that
doesn't consider all three of our isolation levels, especially if
there is no way to extend support to other levels later.  I don't want
PostgreSQL to keep adding features that eventually force everyone to
use READ COMMITTED because they want to use feature X, Y or Z.


Waiting for LSN is expected to be used on hot standby READ ONLY 
replication.

Only READ COMMITTED and REPEATABLE READ, are allowed on hot standby.


Maybe someone can think of a clever way for an extension to insert a
wait for a user-supplied LSN *before* acquiring a snapshot so it can
work for the higher levels, or maybe the hooks should go into core
PostgreSQL so that the extension can exist as an external project not
requiring a patched PostgreSQL installation, or maybe this should be
done with new core syntax that extends transaction commands.  Do other
people have views on this?


I think it is a good idea, but it is not clear for me, how to do it 
properly.


2)

+wl_lsn_updated_hook(void)
+{
+uint i;
+/*
+ * After update lastReplayedEndRecPtr set Latches in SHMEM array
+ */
+if (counter_waitlsn % count_waitlsn == 0
+|| 
TimestampDifferenceExceeds(time_waitlsn,GetCurrentTimestamp(),interval_waitlsn))

+{

Doesn't this mean that if you are waiting for LSN 1234, and the
primary sends that LSN and then doesn't send anything for another
hour, a standby waiting in pg_waitlsn is quite likely to skip that
update (either because of count_waitlsn or interval_waitlsn), and then
finish up waiting for a very long time?

I'm not sure if this is a good idea, but it's an idea:  You could keep
your update skipping logic, but make sure you always catch the
important case where recovery hits the end of WAL, by invoking your
callback from WaitForWALToBecomeAvailable.  It could have a boolean
parameter that means 'don't skip this one!'.  In other words, it's OK
to skip updates, but not if you know there is no more WAL available to
apply (since you have no idea how long it will be for more to arrive).

Calling GetCurrentTimestamp() at high frequency (after every WAL
record is replayed) may be a bad idea.  It's a slow system call on
some operating systems.  Can we use an existing timestamp for free,
like recoveryLastXTime?  Remembering that the motivation for using
this feature is to wait for *whole transactions* to be replayed and
become visible, there is no sensible reason to wait for random WAL
positions inside a transaction, so if you used that then you would
effectively skip all non-COMMIT records and also skip some COMMIT
records that are coming down the pipe too fast.


Yes, I applied this idea and prepared new patch.

3)

+static void
+wl_own_latch(void)
+{
+SpinLockAcquire(>l_arr[MyBackendId].slock);
+OwnLatch(>l_arr[MyBackendId].latch);
+is_latch_owned = true;
+
+if (state->backend_maxid < MyBackendId)
+state->backend_maxid = MyBackendId;
+
+state->l_arr[MyBackendId].pid = MyProcPid;
+SpinLockRelease(>l_arr[MyBackendId].slock);
+}

What is the point of using extra latches for this?  Why not just use
the standard latch?  Syncrep and many other things do that.  I'm not
actually sure if there is ever a reason to create more latches in
regular backends.  SIGUSR1 will be delivered and set the main latch
anyway.

There are cases of specialised latches in the system, like the wal
receiver latch, and I'm reliably informed that that solves problems
like delivering a wakeup message without having to know which backend
is currently the wal receiver (a problem we might consider solving
today with a condition variable?)  I don't think anything like that
applies here.


In my case I create a bunch of shared latches for each backend, I`ll 
think

how to use standard latches in an efficient way. And about specialized
latches on standby they are already in busy with wal replaying 
functions.


4)

+for (i = 0; i <= state->backend_maxid; i++)
+{
+SpinLockAcquire(>l_arr[i].slock);
+if (state->l_arr[i].pid != 0)
+SetLatch(>l_arr[i].latch);
+SpinLockRelease(>l_arr[i].slock);
+}

Once we get through the update-skipping logic above, we hit this loop
which acquires  spinlocks for every backend one after another and sets
the latches of every backend, no matter whether they are waiting for
the LSN that has been applied.  Assuming we go with this
scan-the-whole-array approach, why not include the LSN waited for in
the array slots, 

Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-08-14 Thread Craig Ringer
On 22 March 2017 at 01:17, Robert Haas  wrote:

> On Sun, Mar 12, 2017 at 10:20 PM, Thomas Munro
>  wrote:
> > Maybe someone can think of a clever way for an extension to insert a
> > wait for a user-supplied LSN *before* acquiring a snapshot so it can
> > work for the higher levels, or maybe the hooks should go into core
> > PostgreSQL so that the extension can exist as an external project not
> > requiring a patched PostgreSQL installation, or maybe this should be
> > done with new core syntax that extends transaction commands.  Do other
> > people have views on this?
>
> IMHO, trying to do this using a function-based interface is a really
> bad idea for exactly the reasons you mention.  I don't see why we'd
> resist the idea of core syntax here; transactions are a core part of
> PostgreSQL.
>
> There is, of course, the question of whether making LSNs such a
> user-visible thing is a good idea in the first place, but that's a
> separate question from issue of what syntax for such a thing is best.


(I know this is old, but):

That ship sailed a long time ago unfortunately, they're all over
pg_stat_replication and pg_replication_slots and so on. They're already
routinely used for monitoring replication lag in bytes, waiting for a peer
to catch up, etc.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-08-14 Thread Peter Eisentraut
On 3/9/17 07:49, Ivan Kartyshov wrote:
> Here I attached rebased patch waitlsn_10dev_v3 (core feature)
> I will leave the choice of implementation (core/contrib) to the 
> discretion of the community.

This patch is registered in the upcoming commit fest, but it needs to be
rebased.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-03-24 Thread David Steele

Hi Ivan,

On 3/21/17 1:06 PM, David Steele wrote:

Hi Ivan,

On 3/12/17 10:20 PM, Thomas Munro wrote:

On Fri, Mar 10, 2017 at 1:49 AM, Ivan Kartyshov
 wrote:

Here I attached rebased patch waitlsn_10dev_v3 (core feature)
I will leave the choice of implementation (core/contrib) to the
discretion
of the community.

Will be glad to hear your suggestion about syntax, code and patch.


Hi Ivan,

Here is some feedback based on a first read-through of the v4 patch.
I haven't tested it yet.


This thread has been idle for over a week.  Please respond and/or post a
new patch by 2017-03-24 00:00 AoE (UTC-12) or this submission will be
marked "Returned with Feedback".


This submission has been marked "Returned with Feedback".  Please feel 
free to resubmit to a future commitfest.


Regards,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-03-21 Thread Robert Haas
On Sun, Mar 12, 2017 at 10:20 PM, Thomas Munro
 wrote:
> Maybe someone can think of a clever way for an extension to insert a
> wait for a user-supplied LSN *before* acquiring a snapshot so it can
> work for the higher levels, or maybe the hooks should go into core
> PostgreSQL so that the extension can exist as an external project not
> requiring a patched PostgreSQL installation, or maybe this should be
> done with new core syntax that extends transaction commands.  Do other
> people have views on this?

IMHO, trying to do this using a function-based interface is a really
bad idea for exactly the reasons you mention.  I don't see why we'd
resist the idea of core syntax here; transactions are a core part of
PostgreSQL.

There is, of course, the question of whether making LSNs such a
user-visible thing is a good idea in the first place, but that's a
separate question from issue of what syntax for such a thing is best.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-03-21 Thread David Steele

Hi Ivan,

On 3/12/17 10:20 PM, Thomas Munro wrote:

On Fri, Mar 10, 2017 at 1:49 AM, Ivan Kartyshov
 wrote:

Here I attached rebased patch waitlsn_10dev_v3 (core feature)
I will leave the choice of implementation (core/contrib) to the discretion
of the community.

Will be glad to hear your suggestion about syntax, code and patch.


Hi Ivan,

Here is some feedback based on a first read-through of the v4 patch.
I haven't tested it yet.


This thread has been idle for over a week.  Please respond and/or post a 
new patch by 2017-03-24 00:00 AoE (UTC-12) or this submission will be 
marked "Returned with Feedback".


Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-03-12 Thread Thomas Munro
On Fri, Mar 10, 2017 at 1:49 AM, Ivan Kartyshov
 wrote:
> Here I attached rebased patch waitlsn_10dev_v3 (core feature)
> I will leave the choice of implementation (core/contrib) to the discretion
> of the community.
>
> Will be glad to hear your suggestion about syntax, code and patch.

Hi Ivan,

Here is some feedback based on a first read-through of the v4 patch.
I haven't tested it yet.

First, I'll restate my view of the syntax-vs-function question:  I
think an fmgr function is the wrong place to do this, because it
doesn't work for our 2 higher isolation levels as mentioned.  Most
people probably use READ COMMITTED most of the time so the extension
version you've proposed is certainly useful for many people and I like
it, but I will vote against inclusion in core of any feature that
doesn't consider all three of our isolation levels, especially if
there is no way to extend support to other levels later.  I don't want
PostgreSQL to keep adding features that eventually force everyone to
use READ COMMITTED because they want to use feature X, Y or Z.

Maybe someone can think of a clever way for an extension to insert a
wait for a user-supplied LSN *before* acquiring a snapshot so it can
work for the higher levels, or maybe the hooks should go into core
PostgreSQL so that the extension can exist as an external project not
requiring a patched PostgreSQL installation, or maybe this should be
done with new core syntax that extends transaction commands.  Do other
people have views on this?

+ * Portions Copyright (c) 2012-2017, PostgresPro Global Development Group

This should say PostgreSQL.

+wl_lsn_updated_hook(void)
+{
+uint i;
+/*
+ * After update lastReplayedEndRecPtr set Latches in SHMEM array
+ */
+if (counter_waitlsn % count_waitlsn == 0
+|| 
TimestampDifferenceExceeds(time_waitlsn,GetCurrentTimestamp(),interval_waitlsn))
+{

Doesn't this mean that if you are waiting for LSN 1234, and the
primary sends that LSN and then doesn't send anything for another
hour, a standby waiting in pg_waitlsn is quite likely to skip that
update (either because of count_waitlsn or interval_waitlsn), and then
finish up waiting for a very long time?

I'm not sure if this is a good idea, but it's an idea:  You could keep
your update skipping logic, but make sure you always catch the
important case where recovery hits the end of WAL, by invoking your
callback from WaitForWALToBecomeAvailable.  It could have a boolean
parameter that means 'don't skip this one!'.  In other words, it's OK
to skip updates, but not if you know there is no more WAL available to
apply (since you have no idea how long it will be for more to arrive).

Calling GetCurrentTimestamp() at high frequency (after every WAL
record is replayed) may be a bad idea.  It's a slow system call on
some operating systems.  Can we use an existing timestamp for free,
like recoveryLastXTime?  Remembering that the motivation for using
this feature is to wait for *whole transactions* to be replayed and
become visible, there is no sensible reason to wait for random WAL
positions inside a transaction, so if you used that then you would
effectively skip all non-COMMIT records and also skip some COMMIT
records that are coming down the pipe too fast.

+static void
+wl_own_latch(void)
+{
+SpinLockAcquire(>l_arr[MyBackendId].slock);
+OwnLatch(>l_arr[MyBackendId].latch);
+is_latch_owned = true;
+
+if (state->backend_maxid < MyBackendId)
+state->backend_maxid = MyBackendId;
+
+state->l_arr[MyBackendId].pid = MyProcPid;
+SpinLockRelease(>l_arr[MyBackendId].slock);
+}

What is the point of using extra latches for this?  Why not just use
the standard latch?  Syncrep and many other things do that.  I'm not
actually sure if there is ever a reason to create more latches in
regular backends.  SIGUSR1 will be delivered and set the main latch
anyway.

There are cases of specialised latches in the system, like the wal
receiver latch, and I'm reliably informed that that solves problems
like delivering a wakeup message without having to know which backend
is currently the wal receiver (a problem we might consider solving
today with a condition variable?)  I don't think anything like that
applies here.

+for (i = 0; i <= state->backend_maxid; i++)
+{
+SpinLockAcquire(>l_arr[i].slock);
+if (state->l_arr[i].pid != 0)
+SetLatch(>l_arr[i].latch);
+SpinLockRelease(>l_arr[i].slock);
+}

Once we get through the update-skipping logic above, we hit this loop
which acquires  spinlocks for every backend one after another and sets
the latches of every backend, no matter whether they are waiting for
the LSN that has been applied.  Assuming we go with this
scan-the-whole-array approach, why not include the LSN waited for in
the array slots, so that we can avoid disturbing processes that are
waiting for a later LSN?

Could you talk a bit about 

Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-03-09 Thread Ivan Kartyshov

Hello

On 07.03.2017 15:58, Masahiko Sawada wrote:

I've read the discussion so far but I didn't see the reason why you've
changed it to as a contrib module. Could you tell me about that?


I did it on the initiative of our customer, who preferred the 
implementation in the form of contrib. Contrib realization of WAITLSN 
adds to core the only hook.


On 07.03.2017 15:58, Masahiko Sawada wrote:
> I guess this feature would be more useful if provided as a core
> feature and we need to discuss about syntax as Thomas mentioned.

Here I attached rebased patch waitlsn_10dev_v3 (core feature)
I will leave the choice of implementation (core/contrib) to the 
discretion of the community.


Will be glad to hear your suggestion about syntax, code and patch.

--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index 2bc4d9f..6d5a81e 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -178,6 +178,7 @@ Complete list of usable sgml source files in this directory.
 
 
 
+
 
 
 
diff --git a/doc/src/sgml/ref/waitlsn.sgml b/doc/src/sgml/ref/waitlsn.sgml
new file mode 100644
index 000..338187b
--- /dev/null
+++ b/doc/src/sgml/ref/waitlsn.sgml
@@ -0,0 +1,134 @@
+
+
+
+ 
+  WAITLSN
+ 
+
+ 
+  WAITLSN
+  7
+  SQL - Language Statements
+ 
+
+ 
+  WAITLSN
+  wait until target LSN has been replayed
+ 
+
+ 
+
+WAITLSN 'LSN' [ , delay ]
+WAITLSN_INFINITE 'LSN'
+WAITLSN_NO_WAIT 'LSN'
+
+ 
+
+ 
+  Description
+
+  
+   The WAITLSN wait till target LSN will
+   be replayed with an optional delay (milliseconds by default
+   infinity) to be wait for LSN to replayed.
+  
+
+  
+   WAITLSN provides a simple
+   interprocess LSN wait mechanism for a backends on slave
+   in master-slave replication scheme on PostgreSQL database.
+  
+
+  
+   The WAITLSN_INFINITE wait till target LSN will
+   be replayed on slave .
+  
+
+  
+   The WAITLSN_NO_WAIT will tell if target LSN was replayed without any wait.
+  
+ 
+
+ 
+  Parameters
+
+  
+   
+LSN
+
+ 
+  Target log sequence number to be wait for.
+ 
+
+   
+   
+delay
+
+ 
+  Time in miliseconds to waiting for LSN to be replayed.
+ 
+
+   
+  
+ 
+
+ 
+  Notes
+
+  
+   Delay time for waiting till LSN to be replayed must be integer. For
+   default it is infinity. Waiting can be interupped using Ctl+C, or
+   by Postmaster death.
+  
+ 
+
+ 
+  GUCs
+
+  
+ Add two GUCs which help tuning influence on StartupXLOG:
+ count_waitlsn (denominator to check not each LSN)
+ int	count_waitlsn= 10;
+  
+
+  
+ interval_waitlsn (Interval in milliseconds to additional LSN check)
+ int	interval_waitlsn = 100;
+  
+ 
+
+ 
+  Examples
+
+  
+   Configure and execute a waitlsn from
+   psql:
+
+
+WAITLSN '0/3F07A6B1', 1;
+NOTICE:  LSN is not reached. Try to make bigger delay.
+WAITLSN
+
+WAITLSN '0/3F07A611';
+WAITLSN
+
+WAITLSN '0/3F0FF791', 50;
+^CCancel request sent
+NOTICE:  LSN is not reached. Try to make bigger delay.
+ERROR:  canceling statement due to user request
+
+
+ 
+
+ 
+  Compatibility
+
+  
+   There is no WAITLSN statement in the SQL
+   standard.
+  
+ 
+
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index c8191de..a5100a2 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -206,6 +206,7 @@



+   
 
  
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 2dcff7f..9c2def1 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -39,6 +39,7 @@
 #include "catalog/pg_control.h"
 #include "catalog/pg_database.h"
 #include "commands/tablespace.h"
+#include "commands/waitlsn.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "port/atomics.h"
@@ -145,6 +146,9 @@ const struct config_enum_entry sync_method_options[] = {
 	{NULL, 0, false}
 };
 
+/* GUC variable */
+intcount_waitlsn = 10;
+intinterval_waitlsn = 100;
 
 /*
  * Although only "on", "off", and "always" are documented,
@@ -6948,6 +6952,8 @@ StartupXLOG(void)
 		{
 			ErrorContextCallback errcallback;
 			TimestampTz xtime;
+			TimestampTz			time_waitlsn = GetCurrentTimestamp();
+			int	counter_waitlsn = 0;
 
 			InRedo = true;
 
@@ -7174,6 +7180,17 @@ StartupXLOG(void)
 	break;
 }
 
+/*
+ * After update lastReplayedEndRecPtr set Latches in SHMEM array
+ */
+if (counter_waitlsn % count_waitlsn == 0
+	|| TimestampDifferenceExceeds(time_waitlsn,GetCurrentTimestamp(),interval_waitlsn))
+{
+	WaitLSNSetLatch();
+	time_waitlsn = GetCurrentTimestamp();
+}
+counter_waitlsn++;
+
 /* Else, try to fetch the next WAL record */
 record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false);
 			} while (record != NULL);
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index e0fab38..8a7e2bd 100644
--- 

Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-03-07 Thread Thomas Munro
On Wed, Mar 8, 2017 at 1:58 AM, Masahiko Sawada  wrote:
> On Tue, Mar 7, 2017 at 8:48 PM, Ivan Kartyshov
>  wrote:
>> Rebase done.
>
> Thank you for updating the patch.
>
>>
>> Meanwhile I made some more changes.
>>
>> Changes
>> ===
>> 1) WAITLSN is now implemented as an extension called "pg_waitlsn"
>
> I've read the discussion so far but I didn't see the reason why you've
> changed it to as a contrib module. Could you tell me about that? I
> guess this feature would be more useful if provided as a core feature
> and we need to discuss about syntax as Thomas mentioned.

The problem with using functions like pg_waitlsn(‘LSN’ [, timeout in
ms]) instead of new syntax for transaction starting commands like
BEGIN TRANSACTION ... WAIT FOR ... is that it doesn't work for the
higher isolation levels.  In READ COMMITTED it's fine, because every
statement runs with its own snapshot, so when SELECT
pg_waitlsn(some_lsn) returns, the next statement will run with a
snapshot that can see the effects of some_lsn being applied.  But in
REPEATABLE READ and SERIALIZABLE, even though pg_waitlsn(some_lsn)
waits for the LSN to be applied, the next statement will run with the
snapshot from before and never see the transaction you were waiting
for.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-03-07 Thread Masahiko Sawada
On Tue, Mar 7, 2017 at 8:48 PM, Ivan Kartyshov
 wrote:
> Rebase done.

Thank you for updating the patch.

>
> Meanwhile I made some more changes.
>
> Changes
> ===
> 1) WAITLSN is now implemented as an extension called "pg_waitlsn"

I've read the discussion so far but I didn't see the reason why you've
changed it to as a contrib module. Could you tell me about that? I
guess this feature would be more useful if provided as a core feature
and we need to discuss about syntax as Thomas mentioned.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-03-07 Thread Ivan Kartyshov

Rebase done.

Meanwhile I made some more changes.

Changes
===
1) WAITLSN is now implemented as an extension called "pg_waitlsn"

2) Call new hook "lsn_updated_hook" right after xact_redo_commit (xlog.c)

3) Corresponding functions:
pg_waitlsn('0/693FF800', 1) - wait 10 seconds
pg_waitlsn_infinite('0/693FF800') - for infinite wait
pg_waitlsn_no_wait('0/693FF800') - once check if LSN was replayed or not.

4) Add two GUCs which help tuning influence on StartupXLOG:
count_waitlsn (denominator to check not each LSN)
int count_waitlsn= 10;

interval_waitlsn (Interval in milliseconds to additional LSN check)
int interval_waitlsn = 100;

5) Optimize loop that set latches.

How to use it
==
Master:
1) Make "wal_level = replica"
Slave:
2) Add  shared_preload_libraries = 'pg_waitlsn'
hot_standby = on (in postgresql.conf)
3) Create extension pg_waitlsn;
4) And in hot_standby you can wait for LSN (pgsleep), when LSN will 
replayed on slave pg_waitlsn will release


select pg_waitlsn(‘LSN’ [, timeout in ms]);
select pg_waitlsn_infinite(‘LSN’);
select pg_waitlsn_no_wait(‘LSN’);

#Wait until LSN 0/303EC60 will be replayed, or 10 second passed.
select pg_waitlsn(‘0/303EC60’, 1);

#Or same without timeout.
select pg_waitlsn(‘0/303EC60’);
select pg_waitlsn_infinite('0/693FF800');

#To check if LSN is replayed can be used.
select pg_waitlsn_no_wait('0/693FF800');

Notice: select pg_waitlsn will release on PostmasterDeath or 
Interruption events if they come earlier then target LSN or timeout.


--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
diff --git a/contrib/pg_waitlsn/Makefile b/contrib/pg_waitlsn/Makefile
new file mode 100644
index 000..49a326c
--- /dev/null
+++ b/contrib/pg_waitlsn/Makefile
@@ -0,0 +1,21 @@
+# pg_waitlsn/Makefile
+
+MODULE_big = pg_waitlsn
+OBJS = pg_waitlsn.o
+EXTENSION = pg_waitlsn
+DATA = pg_waitlsn--1.0.sql
+
+
+
+ifdef USE_PGXS
+
+	PG_CONFIG = pg_config
+	PGXS := $( shell $( PG_CONFIG ) --pgxs )
+	include $(PGXS)
+else
+
+	subdir = contrib/pg_waitlsn
+	top_builddir = ../..
+	include $(top_builddir)/src/Makefile.global
+	include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/pg_waitlsn/pg_waitlsn--1.0.sql b/contrib/pg_waitlsn/pg_waitlsn--1.0.sql
new file mode 100644
index 000..8b251f3
--- /dev/null
+++ b/contrib/pg_waitlsn/pg_waitlsn--1.0.sql
@@ -0,0 +1,19 @@
+/* contrib/pg_waitlsn/pg_waitlsn--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_waitlsn" to wait target LSN to been replayed, delay for waiting in miliseconds (default infinity) \quit
+
+CREATE FUNCTION pg_waitlsn(lsn pg_lsn, delay int default 0)
+RETURNS bool
+AS 'MODULE_PATHNAME', 'pg_waitlsn'
+LANGUAGE C IMMUTABLE STRICT ;
+
+CREATE FUNCTION pg_waitlsn_infinite(lsn pg_lsn)
+RETURNS bool
+AS 'MODULE_PATHNAME', 'pg_waitlsn_infinite'
+LANGUAGE C IMMUTABLE STRICT ;
+
+CREATE FUNCTION pg_waitlsn_no_wait(lsn pg_lsn)
+RETURNS bool
+AS 'MODULE_PATHNAME', 'pg_waitlsn_no_wait'
+LANGUAGE C IMMUTABLE STRICT ;
diff --git a/contrib/pg_waitlsn/pg_waitlsn.c b/contrib/pg_waitlsn/pg_waitlsn.c
new file mode 100644
index 000..d210678
--- /dev/null
+++ b/contrib/pg_waitlsn/pg_waitlsn.c
@@ -0,0 +1,299 @@
+/*-
+ *
+ * pg_waitlsn
+ *
+ * Portions Copyright (c) 2012-2017, PostgresPro Global Development Group
+ *
+ * IDENTIFICATION
+ *		  contrib/pg_waitlsn/pg_waitlsn.c
+ *
+ *-
+ */
+
+#include "postgres.h"
+#include "fmgr.h"
+#include "pgstat.h"
+#include "access/xlog.h"
+#include "utils/pg_lsn.h"
+#include "storage/latch.h"
+#include "miscadmin.h"
+#include "storage/spin.h"
+#include "storage/backendid.h"
+#include "access/xact.h"
+#include "storage/shmem.h"
+#include "storage/ipc.h"
+#include "utils/timestamp.h"
+#include "storage/pmsignal.h"
+#include "storage/proc.h"
+#include "access/transam.h"
+#include "utils/guc.h"
+
+PG_MODULE_MAGIC;
+
+static bool pg_waitlsn_internal(XLogRecPtr lsn, uint64_t delay);
+
+/* Hooks values */
+static lsn_updated_hook_type prev_lsn_updated_hook = NULL;
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static void wl_lsn_updated_hook(void);
+static uint32 estimate_shmem_size(void);
+
+/* Latches Own-DisownLatch and AbortCаllBack */
+static void disown_latches_on_abort(XactEvent event, void *arg);
+static void wl_own_latch(void);
+static void wl_disown_latch(void);
+
+/* GUC variable */
+intcount_waitlsn = 10;
+intinterval_waitlsn = 100;
+
+/* Globals */
+TimestampTz		time_waitlsn = 0;
+intcounter_waitlsn = 0;
+
+void			_PG_init(void);
+
+/* Shared memory structures */
+typedef struct
+{
+	int	pid;
+	volatile slock_t	slock;
+	Latchlatch;
+} BIDLatch;
+
+typedef struct
+{
+	char		dummy;
+	int			backend_maxid;
+	BIDLatch	l_arr[FLEXIBLE_ARRAY_MEMBER];
+} GlobState;

Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-03-03 Thread David Steele
Hi Ivan,

On 2/27/17 3:52 PM, Thomas Munro wrote:
> On Thu, Feb 23, 2017 at 3:08 AM, Thom Brown  wrote:
>> On 23 January 2017 at 11:56, Ivan Kartyshov  
>> wrote:
>>>
>>> Thank you for reading, will be glad to get your feedback.
>>
>> Could you please rebase your patch as it no longer applies cleanly.
> 
> +1

Please provide a rebased patch as soon as possible.

Thanks,
-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-02-27 Thread Thomas Munro
On Thu, Feb 23, 2017 at 3:08 AM, Thom Brown  wrote:
> On 23 January 2017 at 11:56, Ivan Kartyshov  
> wrote:
>>
>> Thank you for reading, will be glad to get your feedback.
>
> Could you please rebase your patch as it no longer applies cleanly.

+1

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-02-22 Thread Thom Brown
On 23 January 2017 at 11:56, Ivan Kartyshov  wrote:
> Thank you for reviews and suggested improvements.
> I rewrote patch to make it more stable.
>
> Changes
> ===
> I've made a few changes:
> 1) WAITLSN now doesn`t depend on snapshot
> 2) Check current replayed LSN rather than in xact_redo_commit
> 3) Add syntax WAITLSN_INFINITE '0/693FF800' - for infinite wait and
> WAITLSN_NO_WAIT '0/693FF800' for check if LSN was replayed as you
> advised.
> 4) Reduce the count of loops with GUCs (WalRcvForceReply() which in 9.5
> doesn`t exist).
> 5) Optimize loop that set latches.
> 6) Add two GUCs that helps us to configure influence on StartupXLOG:
> count_waitlsn (denominator to check not each LSN)
> interval_waitlsn (Interval in milliseconds to additional LSN check)
>
> Feedback
> 
> On 09/15/2016 05:41 AM, Thomas Munro wrote:
>>
>> You hold a spinlock in one arbitrary slot, but that
>> doesn't seem sufficient: another backend may also read it, compute a
>> new value and then write it, while holding a different spin lock.  Or
>> am I missing something?
>
>
> We acquire an individual spinlock on each member of array, so you cannot
> compute new value and write it concurrently.
>
> Tested
> ==
> We have been tested it on different servers and OS`s, in different cases and
> workloads. New version is nearly as fast as vanilla on primary and bring
> tiny influence on standby performance.
>
> Hardware:
> 144 Intel Cores with HT
> 3TB RAM
> all data on ramdisk
> primary + hotstandby  on the same node.
>
> A dataset was created with "pgbench -i -s 1000" command. For each round of
> test we pause replay on standby, make 100 transaction on primary with
> pgbench, start replay on standby and measure replication gap disappearing
> time under different standby workload. The workload was "WAITLSN
> ('Very/FarLSN', 1000ms timeout)" followed by "select abalance from
> pgbench_accounts there aid = random_aid;"
> For vanilla 1000ms timeout was enforced on pgbench side by -R option.
> GUC waitlsn parameters was adopted for 1000ms timeout on standby with 35000
> tps rate on primary.
> interval_waitlsn = 500 (ms)
> count_waitlsn = 3
>
> On 200 clients, slave caching up master as vanilla without significant
> delay.
> On 500 clients, slave caching up master 3% slower then vanilla.
> On 1000 clients, 12% slower.
> On 5000 clients, 3 time slower because it far above our hardware ability.
>
> How to use it
> ==
> WAITLSN ‘LSN’ [, timeout in ms];
> WAITLSN_INFINITE ‘LSN’;
> WAITLSN_NO_WAIT ‘LSN’;
>
> #Wait until LSN 0/303EC60 will be replayed, or 10 second passed.
> WAITLSN ‘0/303EC60’, 1;
>
> #Or same without timeout.
> WAITLSN ‘0/303EC60’;
> orfile:///home/vis/Downloads/waitlsn_10dev_v2.patch
> WAITLSN_INFINITE '0/693FF800';
>
> #To check if LSN is replayed can be used.
> WAITLSN_NO_WAIT '0/693FF800';
>
> Notice: WAITLSN will release on PostmasterDeath or Interruption events
> if they come earlier then target LSN or timeout.
>
>
> Thank you for reading, will be glad to get your feedback.

Could you please rebase your patch as it no longer applies cleanly.

Thanks

Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2017-01-23 Thread Ivan Kartyshov

Thank you for reviews and suggested improvements.
I rewrote patch to make it more stable.

Changes
===
I've made a few changes:
1) WAITLSN now doesn`t depend on snapshot
2) Check current replayed LSN rather than in xact_redo_commit
3) Add syntax WAITLSN_INFINITE '0/693FF800' - for infinite wait and
WAITLSN_NO_WAIT '0/693FF800' for check if LSN was replayed as you
advised.
4) Reduce the count of loops with GUCs (WalRcvForceReply() which in 9.5
doesn`t exist).
5) Optimize loop that set latches.
6) Add two GUCs that helps us to configure influence on StartupXLOG:
count_waitlsn (denominator to check not each LSN)
interval_waitlsn (Interval in milliseconds to additional LSN check)

Feedback

On 09/15/2016 05:41 AM, Thomas Munro wrote:

You hold a spinlock in one arbitrary slot, but that
doesn't seem sufficient: another backend may also read it, compute a
new value and then write it, while holding a different spin lock.  Or
am I missing something?


We acquire an individual spinlock on each member of array, so you cannot
compute new value and write it concurrently.

Tested
==
We have been tested it on different servers and OS`s, in different cases 
and workloads. New version is nearly as fast as vanilla on primary and 
bring tiny influence on standby performance.


Hardware:
144 Intel Cores with HT
3TB RAM
all data on ramdisk
primary + hotstandby  on the same node.

A dataset was created with "pgbench -i -s 1000" command. For each round 
of test we pause replay on standby, make 100 transaction on primary 
with pgbench, start replay on standby and measure replication gap 
disappearing time under different standby workload. The workload was 
"WAITLSN ('Very/FarLSN', 1000ms timeout)" followed by "select abalance 
from pgbench_accounts there aid = random_aid;"

For vanilla 1000ms timeout was enforced on pgbench side by -R option.
GUC waitlsn parameters was adopted for 1000ms timeout on standby with 
35000 tps rate on primary.

interval_waitlsn = 500 (ms)
count_waitlsn = 3

On 200 clients, slave caching up master as vanilla without significant 
delay.

On 500 clients, slave caching up master 3% slower then vanilla.
On 1000 clients, 12% slower.
On 5000 clients, 3 time slower because it far above our hardware ability.

How to use it
==
WAITLSN ‘LSN’ [, timeout in ms];
WAITLSN_INFINITE ‘LSN’;
WAITLSN_NO_WAIT ‘LSN’;

#Wait until LSN 0/303EC60 will be replayed, or 10 second passed.
WAITLSN ‘0/303EC60’, 1;

#Or same without timeout.
WAITLSN ‘0/303EC60’;
orfile:///home/vis/Downloads/waitlsn_10dev_v2.patch
WAITLSN_INFINITE '0/693FF800';

#To check if LSN is replayed can be used.
WAITLSN_NO_WAIT '0/693FF800';

Notice: WAITLSN will release on PostmasterDeath or Interruption events
if they come earlier then target LSN or timeout.


Thank you for reading, will be glad to get your feedback.

--
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index 77667bdebd..72c5390695 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -172,6 +172,7 @@ Complete list of usable sgml source files in this directory.
 
 
 
+
 
 
 
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml
index 8acdff1393..3733ad960b 100644
--- a/doc/src/sgml/reference.sgml
+++ b/doc/src/sgml/reference.sgml
@@ -200,6 +200,7 @@



+   
 
  
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index aa9ee5a0dd..9696b5dbb5 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -39,6 +39,7 @@
 #include "catalog/pg_control.h"
 #include "catalog/pg_database.h"
 #include "commands/tablespace.h"
+#include "commands/waitlsn.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "port/atomics.h"
@@ -143,6 +144,9 @@ const struct config_enum_entry sync_method_options[] = {
 	{NULL, 0, false}
 };
 
+/* GUC variable */
+intcount_waitlsn = 10;
+intinterval_waitlsn = 100;
 
 /*
  * Although only "on", "off", and "always" are documented,
@@ -6781,6 +6785,8 @@ StartupXLOG(void)
 		{
 			ErrorContextCallback errcallback;
 			TimestampTz xtime;
+			TimestampTz			time_waitlsn = GetCurrentTimestamp();
+			int	counter_waitlsn = 0;
 
 			InRedo = true;
 
@@ -6998,6 +7004,17 @@ StartupXLOG(void)
 	break;
 }
 
+/*
+ * After update lastReplayedEndRecPtr set Latches in SHMEM array
+ */
+if (counter_waitlsn % count_waitlsn == 0
+	|| TimestampDifferenceExceeds(time_waitlsn,GetCurrentTimestamp(),interval_waitlsn))
+{
+	WaitLSNSetLatch();
+	time_waitlsn = GetCurrentTimestamp();
+}
+counter_waitlsn++;
+
 /* Else, try to fetch the next WAL record */
 record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false);
 			} while (record != NULL);
diff --git a/src/backend/commands/Makefile b/src/backend/commands/Makefile
index 6b3742c0a0..091cbe22a0 100644

Re: [HACKERS] make async slave to wait for lsn to be replayed

2016-09-27 Thread Thomas Munro
On Thu, Sep 15, 2016 at 2:41 PM, Thomas Munro
 wrote:
> On Thu, Sep 1, 2016 at 2:16 AM, Ivan Kartyshov
>  wrote:
>> Hi hackers,
>>
>> Few days earlier I've finished my work on WAITLSN statement utility, so I’d
>> like to share it.
>> [...]
>> Your feedback is welcome!
>>
>> [waitlsn_10dev.patch]
>
> Thanks for working on this.  Here are some general thoughts on the
> feature, and an initial review.

Hi Ivan

I'm marking the patch Returned with Feedback, since there hasn't been
any response or a new patch.  I encourage you to keep working on this
feature, and I'll be happy to review future patches.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2016-09-14 Thread Thomas Munro
On Thu, Sep 1, 2016 at 2:16 AM, Ivan Kartyshov
 wrote:
> Hi hackers,
>
> Few days earlier I've finished my work on WAITLSN statement utility, so I’d
> like to share it.
> [...]
> Your feedback is welcome!
>
> [waitlsn_10dev.patch]

Hi Ivan,

Thanks for working on this.  Here are some general thoughts on the
feature, and an initial review.

+1 for this feature.  Explicitly waiting for a given commit to be
applied is one of several approaches to achieve "causal consistency"
for reads on replica nodes, and I think it will be very useful if
combined with a convenient way to get the values to wait for when you
run COMMIT.  This could be used either by applications directly, or by
middleware that somehow keeps track of dependencies between
transactions and inserts waits.

I liked the way Heikki Linnakangas imagined this feature[1]:

  BEGIN WAIT FOR COMMIT 1234 TO BE VISIBLE;

... or perhaps it could be spelled like this:

  BEGIN [isolation stuff] WAIT FOR COMMIT TOKEN  TIMEOUT ;

That allows waiting only at the start of a transaction, whereas your
idea of making a utility command would allow a single READ COMMITTED
transaction to wait multiple times for transactions it has heard about
through side channels, which may be useful.  Perhaps we could support
the same syntax in a stand alone statement inside a transaction OR as
part of a BEGIN ... statement.  Being able to do it as part of BEGIN
means that you can use this feature for single-snapshot transactions,
ie REPEATABLE READ and SERIALIZABLE (of course you can't use
SERIALIZABLE on hot standbys yet but that'll be fixed one day).
Otherwise you'd be waiting for the LSN in the middle of your
transaction but not be able to see the result because you don't take a
new snapshot.  Or maybe it's enough to use a standalone WAIT ...
statement inside a REPEATABLE READ or SERIALIZABLE transaction as long
as it's the first statement, and should be an error to do so any time
later?

I think working in terms of LSNs or XIDs explicitly is not a good
idea: encouraging clients to think in terms of anything other than
opaque 'commit tokens' seems like a bad idea because it limits future
changes.  For example, there is on-going discussion about introducing
CSNs (commit sequence numbers), and there are some related concepts
lurking in the SSI code; maybe we'd want to use those one day.  Do you
think it would make sense to have a concept of a commit token that is
a non-analysable string as far as clients are concerned, so that
clients are not encouraged to do anything at all with them except use
them in a WAIT FOR COMMIT TOKEN  statement?

INITIAL FEEDBACK ON THE PATCH

I didn't get as far as testing or detailed review because it has some
obvious bugs and compiler warnings which I figured we should talk
about first, and I also have some higher level questions about the
design.

gram.y:12882:15: error: assignment makes pointer from integer without
a cast [-Werror=int-conversion]
  n->delay = $3;

It looks like struct WaitLSNStmt accidentally has 'delay' as a pointer
to int.  Perhaps you want an int?  Maybe it would be useful to include
the units (millisecond, ms) in the name?

waitlsn.c: In function 'WLDisownLatch':
waitlsn.c:82:2: error: suggest parentheses around assignment used as
truth value [-Werror=parentheses]
  if (MyBackendId = state->backend_maxid)
  ^~

Pretty sure you want == here.

waitlsn.c: In function 'WaitLSNUtility':
waitlsn.c:153:17: error: initialization makes integer from pointer
without a cast [-Werror=int-conversion]
  int   tdelay = delay;
 ^

Another place where I think you wanted an int but used a pointer to
int?  To fix that warning you need tdelay = *delay, but I think delay
should really not be taken by pointer at all.

@@ -6922,6 +6923,11 @@ StartupXLOG(void)
+ /*
+ * After update lastReplayedEndRecPtr set Latches in SHMEM array
+ */
+ WaitLSNSetLatch();
+

I think you should try to do this only after commit records are
replayed, not after every record.  Only commit records can make
transactions visible, and the raison d'être for this feature is to let
users wait for transactions they know about to become visible.  You
probably can't do it directly in xact_redo_commit though, because at
that point XLogCtl->lastReplayedEndRecPtr hasn't been updated yet so a
backend that wakes up might not see that it has advanced and go back
to sleep.  It is updated in the StartupXLOG loop after the redo
function runs.  That is the reason why WalRcvForceReply() is called
from there rather than in xact_redo_commit, to implement remote_apply
for replication.  Perhaps you need something similar?

+ tdelay -= (GetCurrentTimestamp() - timer);

You can't do arithmetic with TimestampTz like this.  Depending on
configure option --disable-integer-datetimes (which controls macro
HAVE_INT64_TIMESTAMP), it may be a floating point number of seconds
since the epoch, or an integer number of microseconds since the epoch.
It looks 

Re: [HACKERS] make async slave to wait for lsn to be replayed

2016-09-06 Thread Ivan Kartyshov

On 08/31/2016 05:54 PM, Craig Ringer wrote:

How do you get the commit LSN to watch for? Grab
pg_current_xlog_insert_location() just after the commit and figure
that replaying to that point guarantees you get the commit?


That's the point, it was created in order to provide the cosistent view 
of data between master and replica. You almost guessed, I used 
GetXLogReplayRecPtr() right after LSN was physically replayed on downstream.



Some time ago[1] I raised the idea of reporting commit LSN on the wire
to clients. That didn't go anywhere due to compatibility and security
concerns. I think those were resolvable, but it wasn't enough of a
priority to push hard on at the time. A truly "right" solution has to
wait for a protocol bump, but I think good-enough solutions are
possible now. So you might want to read that thread.


Thank you for pointing to your thread, it was very informative!
It seems that you have solved the very similar problem.


It also mentions hesitations about exposing LSN to clients even more.
I think we're *way* past that now - we have replication origins and
replication slots relying on it, it's exposed in a pg_lsn datatype, a
bunch of views expose it, etc. But it might be reasonable to ask
"should the client instead be expected to wait for the confirmed
commit of a 64-bit epoch-extended xid, like that returned by
txid_current()?" . One advantage of using xid is that you can get it
while you're still in the xact, so there's no race between commit and
checking the lsn after commit.


That sounds reasonable, but I dont think it will give us any
considerable benefits. But I`ll work out this variant.


Are you specifically trying to ensure "this commit has replayed on the
replica before we run queries on it" ? Or something else?


Yes you are right, I want to ensure data consistency on downstream 
before running queries on it. Our clients would use it as a part of 
background worker and maybe directly in apps too.



---
Ivan Kartyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make async slave to wait for lsn to be replayed

2016-08-31 Thread Craig Ringer
On 31 August 2016 at 22:16, Ivan Kartyshov  wrote:

> Our clients who deal with 9.5 and use asynchronous master-slave replication,
> asked to make the wait-mechanism on the slave side to prevent the situation
> when slave handles query which needs data (LSN) that was received, flushed,
> but still not replayed.

I like the broad idea - I've wanted something like it for a while. BDR
has pg_xlog_wait_remote_receive() and pg_xlog_wait_remote_apply() for
use in tests for this reason, but they act on the *upstream* side,
waiting until the downstream has acked the data. Not as useful for
ensuring that apps connected to both master and one or more replicas
get a consistent view of data.

How do you get the commit LSN to watch for? Grab
pg_current_xlog_insert_location() just after the commit and figure
that replaying to that point guarantees you get the commit?

Some time ago[1] I raised the idea of reporting commit LSN on the wire
to clients. That didn't go anywhere due to compatibility and security
concerns. I think those were resolvable, but it wasn't enough of a
priority to push hard on at the time. A truly "right" solution has to
wait for a protocol bump, but I think good-enough solutions are
possible now. So you might want to read that thread.

It also mentions hesitations about exposing LSN to clients even more.
I think we're *way* past that now - we have replication origins and
replication slots relying on it, it's exposed in a pg_lsn datatype, a
bunch of views expose it, etc. But it might be reasonable to ask
"should the client instead be expected to wait for the confirmed
commit of a 64-bit epoch-extended xid, like that returned by
txid_current()?" . One advantage of using xid is that you can get it
while you're still in the xact, so there's no race between commit and
checking the lsn after commit.

Are you specifically trying to ensure "this commit has replayed on the
replica before we run queries on it" ? Or something else?

(Also, on a side note, Kevin mentioned that it may be possible to use
SSI data to achieve SERIALIZABLE read-only queries on replicas, where
they get the same protection from commit-order related anomalies as
queries on the master. You might want to look more deeply into that
too at some stage, if you're trying to ensure the app can do read only
queries on the master and expect fully consistent results).

[1] 
https://www.postgresql.org/message-id/flat/53E41EC1.5050603%402ndquadrant.com#53e41ec1.5050...@2ndquadrant.com

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers