Re: [HACKERS] tracking commit timestamps

2015-03-09 Thread Alvaro Herrera
Petr Jelinek wrote:

 On Thu, Dec 4, 2014 at 12:08 PM, Fujii Masao masao.fu...@gmail.com
 wrote:

 I got the following assertion failure when I executed
 pg_xact_commit_timestamp()
 in the standby server.
 
 =# select pg_xact_commit_timestamp('1000'::xid);
 TRAP: FailedAssertion(!(((oldestCommitTs) != ((TransactionId) 0)) ==
 ((newestCommitTs) != ((TransactionId) 0))), File: commit_ts.c,
 Line: 315)

 Attached patch fixes it, I am not sure how happy I am with the way I did
 it though.

Pushed, thanks.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] tracking commit timestamps

2015-01-22 Thread Petr Jelinek

On 05/01/15 17:50, Petr Jelinek wrote:

On 05/01/15 16:17, Petr Jelinek wrote:

On 05/01/15 07:28, Fujii Masao wrote:

On Thu, Dec 4, 2014 at 12:08 PM, Fujii Masao masao.fu...@gmail.com
wrote:

On Wed, Dec 3, 2014 at 11:54 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:

Pushed with some extra cosmetic tweaks.


I got the following assertion failure when I executed
pg_xact_commit_timestamp()
in the standby server.

=# select pg_xact_commit_timestamp('1000'::xid);
TRAP: FailedAssertion(!(((oldestCommitTs) != ((TransactionId) 0)) ==
((newestCommitTs) != ((TransactionId) 0))), File: commit_ts.c,
Line: 315)
server closed the connection unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.
The connection to the server was lost. Attempting reset: 2014-12-04
12:01:08 JST sby1 LOG:  server process (PID 15545) was terminated by
signal 6: Aborted
2014-12-04 12:01:08 JST sby1 DETAIL:  Failed process was running:
select pg_xact_commit_timestamp('1000'::xid);

The way to reproduce this problem is

#1. set up and start the master and standby servers with
track_commit_timestamp disabled
#2. enable track_commit_timestamp in the master and restart the master
#3. run some write transactions
#4. enable track_commit_timestamp in the standby and restart the
standby
#5. execute select pg_xact_commit_timestamp('1000'::xid) in the
standby

BTW, at the step #4, I got the following log messages. This might be
a hint for
this problem.

LOG:  file pg_commit_ts/ doesn't exist, reading as zeroes
CONTEXT:  xlog redo Transaction/COMMIT: 2014-12-04 12:00:16.428702+09;
inval msgs: catcache 59 catcache 58 catcache 59 catcache 58 catcache
45 catcache 44 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608
relcache 16384


This problem still happens in the master.

Regards,



Attached patch fixes it, I am not sure how happy I am with the way I did
it though.



Added more comments and made the error message a bit clearer.



Fujii, Alvaro, did one of you have a chance to look at this fix?


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2015-01-06 Thread Robert Haas
On Tue, Jan 6, 2015 at 2:58 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 So, we would need additional information other than the node ID *and*
 the timestamp to ensure proper transaction commit ordering on Windows.
 That's not cool and makes this feature very limited on this platform.

You can't use the timestamp alone for commit ordering on any platform.
Eventually, two transactions will manage to commit in a single clock
tick, no matter how short that is.

Now, if we'd included the LSN in there...
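To make that concrete, here is a minimal sketch (the xids are made up, and it
assumes track_commit_timestamp = on with two sessions that happened to commit
within a single tick of the system clock):

-- hypothetical: xids 1000 and 1001 committed within the same clock tick
SELECT pg_xact_commit_timestamp('1000'::xid) AS ts_a,
       pg_xact_commit_timestamp('1001'::xid) AS ts_b,
       pg_xact_commit_timestamp('1000'::xid) =
       pg_xact_commit_timestamp('1001'::xid) AS tied;
-- tied = t: the timestamps alone cannot tell which of the two committed
-- first; a commit LSN recorded alongside would break the tie.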

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] tracking commit timestamps

2015-01-05 Thread Michael Paquier
On Fri, Dec 19, 2014 at 3:53 PM, Noah Misch n...@leadboat.com wrote:
 localhost template1=# select clock_timestamp(), pg_sleep(.1 * (n % 2)) from 
 generate_series(0,7) t(n);
         clock_timestamp        | pg_sleep
 -------------------------------+----------
  2014-12-18 08:34:34.522126+00 |
  2014-12-18 08:34:34.522126+00 |
  2014-12-18 08:34:34.631508+00 |
  2014-12-18 08:34:34.631508+00 |
  2014-12-18 08:34:34.74089+00  |
  2014-12-18 08:34:34.74089+00  |
  2014-12-18 08:34:34.850272+00 |
  2014-12-18 08:34:34.850272+00 |
 (8 rows)
So, we would need additional information other than the node ID *and*
the timestamp to ensure proper transaction commit ordering on Windows.
That's not cool and makes this feature very limited on this platform.
-- 
Michael




Re: [HACKERS] tracking commit timestamps

2015-01-05 Thread Petr Jelinek

On 05/01/15 16:17, Petr Jelinek wrote:

On 05/01/15 07:28, Fujii Masao wrote:

On Thu, Dec 4, 2014 at 12:08 PM, Fujii Masao masao.fu...@gmail.com
wrote:

On Wed, Dec 3, 2014 at 11:54 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:

Pushed with some extra cosmetic tweaks.


I got the following assertion failure when I executed
pg_xact_commit_timestamp()
in the standby server.

=# select pg_xact_commit_timestamp('1000'::xid);
TRAP: FailedAssertion(!(((oldestCommitTs) != ((TransactionId) 0)) ==
((newestCommitTs) != ((TransactionId) 0))), File: commit_ts.c,
Line: 315)
server closed the connection unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.
The connection to the server was lost. Attempting reset: 2014-12-04
12:01:08 JST sby1 LOG:  server process (PID 15545) was terminated by
signal 6: Aborted
2014-12-04 12:01:08 JST sby1 DETAIL:  Failed process was running:
select pg_xact_commit_timestamp('1000'::xid);

The way to reproduce this problem is

#1. set up and start the master and standby servers with
track_commit_timestamp disabled
#2. enable track_commit_timestamp in the master and restart the master
#3. run some write transactions
#4. enable track_commit_timestamp in the standby and restart the standby
#5. execute select pg_xact_commit_timestamp('1000'::xid) in the
standby

BTW, at the step #4, I got the following log messages. This might be
a hint for
this problem.

LOG:  file pg_commit_ts/ doesn't exist, reading as zeroes
CONTEXT:  xlog redo Transaction/COMMIT: 2014-12-04 12:00:16.428702+09;
inval msgs: catcache 59 catcache 58 catcache 59 catcache 58 catcache
45 catcache 44 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608
relcache 16384


This problem still happens in the master.

Regards,



Attached patch fixes it, I am not sure how happy I am with the way I did
it though.



Added more comments and made the error message a bit clearer.

--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index ca074da..59d19a0 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -557,6 +557,12 @@ StartupCommitTs(void)
 	TransactionId xid = ShmemVariableCache->nextXid;
 	int			pageno = TransactionIdToCTsPage(xid);
 
+	if (track_commit_timestamp)
+	{
+		ActivateCommitTs();
+		return;
+	}
+
 	LWLockAcquire(CommitTsControlLock, LW_EXCLUSIVE);
 
 	/*
@@ -571,14 +577,30 @@ StartupCommitTs(void)
  * This must be called ONCE during postmaster or standalone-backend startup,
  * when commit timestamp is enabled.  Must be called after recovery has
  * finished.
+ */
+void
+CompleteCommitTsInitialization(void)
+{
+	if (!track_commit_timestamp)
+		DeactivateCommitTs(true);
+}
+
+/*
+ * This must be called when track_commit_timestamp is turned on.
+ * Note that this only happens during postmaster or standalone-backend startup
+ * or during WAL replay.
+ *
+ * The reason why this SLRU needs separate activation/deactivation functions is
+ * that it can be enabled/disabled during start and the activation/deactivation
+ * on master is propagated to slave via replay. Other SLRUs don't have this
+ * property and they can be just initialized during normal startup.
  *
  * This is in charge of creating the currently active segment, if it's not
  * already there.  The reason for this is that the server might have been
  * running with this module disabled for a while and thus might have skipped
  * the normal creation point.
  */
-void
-CompleteCommitTsInitialization(void)
+void ActivateCommitTs(void)
 {
 	TransactionId xid = ShmemVariableCache->nextXid;
 	int			pageno = TransactionIdToCTsPage(xid);
@@ -591,22 +613,6 @@ CompleteCommitTsInitialization(void)
 	LWLockRelease(CommitTsControlLock);
 
 	/*
-	 * If this module is not currently enabled, make sure we don't hand back
-	 * possibly-invalid data; also remove segments of old data.
-	 */
-	if (!track_commit_timestamp)
-	{
-		LWLockAcquire(CommitTsLock, LW_EXCLUSIVE);
-		ShmemVariableCache->oldestCommitTs = InvalidTransactionId;
-		ShmemVariableCache->newestCommitTs = InvalidTransactionId;
-		LWLockRelease(CommitTsLock);
-
-		TruncateCommitTs(ReadNewTransactionId());
-
-		return;
-	}
-
-	/*
 	 * If CommitTs is enabled, but it wasn't in the previous server run, we
 	 * need to set the oldest and newest values to the next Xid; that way, we
 	 * will not try to read data that might not have been set.
@@ -641,6 +647,35 @@ CompleteCommitTsInitialization(void)
 }
 
 /*
+ * This must be called when track_commit_timestamp is turned off.
+ * Note that this only happens during postmaster or standalone-backend startup
+ * or during WAL replay.
+ *
+ * Resets CommitTs into invalid state to 

Re: [HACKERS] tracking commit timestamps

2015-01-05 Thread Petr Jelinek

On 05/01/15 07:28, Fujii Masao wrote:

On Thu, Dec 4, 2014 at 12:08 PM, Fujii Masao masao.fu...@gmail.com wrote:

On Wed, Dec 3, 2014 at 11:54 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:

Pushed with some extra cosmetic tweaks.


I got the following assertion failure when I executed pg_xact_commit_timestamp()
in the standby server.

=# select pg_xact_commit_timestamp('1000'::xid);
TRAP: FailedAssertion(!(((oldestCommitTs) != ((TransactionId) 0)) ==
((newestCommitTs) != ((TransactionId) 0))), File: commit_ts.c,
Line: 315)
server closed the connection unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.
The connection to the server was lost. Attempting reset: 2014-12-04
12:01:08 JST sby1 LOG:  server process (PID 15545) was terminated by
signal 6: Aborted
2014-12-04 12:01:08 JST sby1 DETAIL:  Failed process was running:
select pg_xact_commit_timestamp('1000'::xid);

The way to reproduce this problem is

#1. set up and start the master and standby servers with
track_commit_timestamp disabled
#2. enable track_commit_timestamp in the master and restart the master
#3. run some write transactions
#4. enable track_commit_timestamp in the standby and restart the standby
#5. execute select pg_xact_commit_timestamp('1000'::xid) in the standby

BTW, at the step #4, I got the following log messages. This might be a hint for
this problem.

LOG:  file pg_commit_ts/ doesn't exist, reading as zeroes
CONTEXT:  xlog redo Transaction/COMMIT: 2014-12-04 12:00:16.428702+09;
inval msgs: catcache 59 catcache 58 catcache 59 catcache 58 catcache
45 catcache 44 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608
relcache 16384


This problem still happens in the master.

Regards,



Attached patch fixes it, I am not sure how happy I am with the way I did 
it though.


And while at it I noticed that redo code for XLOG_PARAMETER_CHANGE sets
ControlFile->wal_log_hints = wal_log_hints;
shouldn't it be
ControlFile->wal_log_hints = xlrec.wal_log_hints;
instead?

--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index ca074da..fcfccf8 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -557,6 +557,12 @@ StartupCommitTs(void)
 	TransactionId xid = ShmemVariableCache->nextXid;
 	int			pageno = TransactionIdToCTsPage(xid);
 
+	if (track_commit_timestamp)
+	{
+		ActivateCommitTs();
+		return;
+	}
+
 	LWLockAcquire(CommitTsControlLock, LW_EXCLUSIVE);
 
 	/*
@@ -571,14 +577,25 @@ StartupCommitTs(void)
  * This must be called ONCE during postmaster or standalone-backend startup,
  * when commit timestamp is enabled.  Must be called after recovery has
  * finished.
+ */
+void
+CompleteCommitTsInitialization(void)
+{
+	if (!track_commit_timestamp)
+		DeactivateCommitTs(true);
+}
+
+/*
+ * This must be called when track_commit_timestamp is turned on.
+ * Note that this only happens during postmaster or standalone-backend startup
+ * or during WAL replay.
  *
  * This is in charge of creating the currently active segment, if it's not
  * already there.  The reason for this is that the server might have been
  * running with this module disabled for a while and thus might have skipped
  * the normal creation point.
  */
-void
-CompleteCommitTsInitialization(void)
+void ActivateCommitTs(void)
 {
 	TransactionId xid = ShmemVariableCache->nextXid;
 	int			pageno = TransactionIdToCTsPage(xid);
@@ -591,22 +608,6 @@ CompleteCommitTsInitialization(void)
 	LWLockRelease(CommitTsControlLock);
 
 	/*
-	 * If this module is not currently enabled, make sure we don't hand back
-	 * possibly-invalid data; also remove segments of old data.
-	 */
-	if (!track_commit_timestamp)
-	{
-		LWLockAcquire(CommitTsLock, LW_EXCLUSIVE);
-		ShmemVariableCache->oldestCommitTs = InvalidTransactionId;
-		ShmemVariableCache->newestCommitTs = InvalidTransactionId;
-		LWLockRelease(CommitTsLock);
-
-		TruncateCommitTs(ReadNewTransactionId());
-
-		return;
-	}
-
-	/*
 	 * If CommitTs is enabled, but it wasn't in the previous server run, we
 	 * need to set the oldest and newest values to the next Xid; that way, we
 	 * will not try to read data that might not have been set.
@@ -641,6 +642,35 @@ CompleteCommitTsInitialization(void)
 }
 
 /*
+ * This must be called when track_commit_timestamp is turned off.
+ * Note that this only happens during postmaster or standalone-backend startup
+ * or during WAL replay.
+ *
+ * Resets CommitTs into invalid state to make sure we don't hand back
+ * possibly-invalid data; also removes segments of old data.
+ */
+void
+DeactivateCommitTs(bool do_wal)
+{
+	TransactionId xid = ShmemVariableCache->nextXid;
+	

Re: [HACKERS] tracking commit timestamps

2015-01-04 Thread Craig Ringer
On 12/19/2014 02:53 PM, Noah Misch wrote:
 The test assumed that no two transactions of a given backend will get the same
 timestamp value from now().  That holds so long as ticks of the system time
 are small enough.  Not so on at least some Windows configurations.

Most Windows systems with nothing else running will have 15 ms timer
granularity, so timestamps captured within the same timer interval will
all have the same value.

If you're running other programs that use the multimedia timer APIs
(including Google Chrome, MS SQL Server, and all sorts of other apps you
might not expect) you'll probably have 1ms timer granularity instead.

Since PostgreSQL 9.4 and below capture time on Windows using
GetSystemTime, the sub-millisecond part is lost anyway. On 9.5 it's
retained but will usually be some fixed value because the timer tick is
still 1ms.

If you're on Windows 8 or Windows 2012 and running PostgreSQL 9.5
(master), but not earlier versions, you'll get sub-microsecond
resolution like on sensible platforms.

Some details here: https://github.com/2ndQuadrant/pg_sysdatetime

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2015-01-04 Thread Fujii Masao
On Thu, Dec 4, 2014 at 12:08 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, Dec 3, 2014 at 11:54 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 Pushed with some extra cosmetic tweaks.

 I got the following assertion failure when I executed 
 pg_xact_commit_timestamp()
 in the standby server.

 =# select pg_xact_commit_timestamp('1000'::xid);
 TRAP: FailedAssertion(!(((oldestCommitTs) != ((TransactionId) 0)) ==
 ((newestCommitTs) != ((TransactionId) 0))), File: commit_ts.c,
 Line: 315)
 server closed the connection unexpectedly
 This probably means the server terminated abnormally
 before or while processing the request.
 The connection to the server was lost. Attempting reset: 2014-12-04
 12:01:08 JST sby1 LOG:  server process (PID 15545) was terminated by
 signal 6: Aborted
 2014-12-04 12:01:08 JST sby1 DETAIL:  Failed process was running:
 select pg_xact_commit_timestamp('1000'::xid);

 The way to reproduce this problem is

 #1. set up and start the master and standby servers with
 track_commit_timestamp disabled
 #2. enable track_commit_timestamp in the master and restart the master
 #3. run some write transactions
 #4. enable track_commit_timestamp in the standby and restart the standby
 #5. execute select pg_xact_commit_timestamp('1000'::xid) in the standby

 BTW, at the step #4, I got the following log messages. This might be a hint 
 for
 this problem.

 LOG:  file pg_commit_ts/ doesn't exist, reading as zeroes
 CONTEXT:  xlog redo Transaction/COMMIT: 2014-12-04 12:00:16.428702+09;
 inval msgs: catcache 59 catcache 58 catcache 59 catcache 58 catcache
 45 catcache 44 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608
 relcache 16384

This problem still happens in the master.

Regards,

-- 
Fujii Masao




Re: [HACKERS] tracking commit timestamps

2014-12-18 Thread Noah Misch
On Tue, Dec 16, 2014 at 01:05:31AM -0300, Alvaro Herrera wrote:
 Noah Misch wrote:
  On Mon, Dec 15, 2014 at 12:12:10AM -0800, Michael Paquier wrote:
 
   FWIW, I just tried that with MinGW-32 and I can see the error on Win7.
   I also checked that changing < now() to <= now() fixed the
   problem, so your assumption was right, Petr.
  
  Committed, after fixing the alternate expected output.
 
 Thanks.  I admit I don't understand what the issue is.

The test assumed that no two transactions of a given backend will get the same
timestamp value from now().  That holds so long as ticks of the system time
are small enough.  Not so on at least some Windows configurations.  Notice the
repeated timestamp values:

Windows Server 2003 x64, 32-bit build w/ VS2010

localhost template1=# select clock_timestamp(), pg_sleep(.1 * (n % 2)) from 
generate_series(0,7) t(n);
        clock_timestamp        | pg_sleep
-------------------------------+----------
 2014-12-18 08:34:34.522126+00 |
 2014-12-18 08:34:34.522126+00 |
 2014-12-18 08:34:34.631508+00 |
 2014-12-18 08:34:34.631508+00 |
 2014-12-18 08:34:34.74089+00  |
 2014-12-18 08:34:34.74089+00  |
 2014-12-18 08:34:34.850272+00 |
 2014-12-18 08:34:34.850272+00 |
(8 rows)

GNU/Linux

[local] test=# select clock_timestamp(), pg_sleep(.1 * (n % 2)) from 
generate_series(0,7) t(n);
        clock_timestamp        | pg_sleep
-------------------------------+----------
 2014-12-19 06:49:47.590556+00 |
 2014-12-19 06:49:47.590611+00 |
 2014-12-19 06:49:47.691488+00 |
 2014-12-19 06:49:47.691508+00 |
 2014-12-19 06:49:47.801483+00 |
 2014-12-19 06:49:47.801502+00 |
 2014-12-19 06:49:47.921486+00 |
 2014-12-19 06:49:47.921505+00 |
(8 rows)




Re: [HACKERS] tracking commit timestamps

2014-12-15 Thread Michael Paquier
On Wed, Dec 10, 2014 at 6:50 PM, Noah Misch n...@leadboat.com wrote:
 On Mon, Dec 08, 2014 at 02:23:39AM +0100, Petr Jelinek wrote:
 On 08/12/14 00:56, Noah Misch wrote:
 The commit_ts test suite gives me the attached diff on a 32-bit MinGW build
 running on 64-bit Windows Server 2003.  I have not checked other Windows
 configurations; the suite does pass on GNU/Linux.

 Hmm I wonder if < now() needs to be changed to <= now() in those queries
 to make them work correctly on that platform, I don't have a machine with that
 environment handy right now, so I would appreciate it if you could try that, in
 case you don't have time for that, I will try to set up something later...

 I will try that, though perhaps not until next week.

FWIW, I just tried that with MinGW-32 and I can see the error on Win7.
I also checked that changing < now() to <= now() fixed the
problem, so your assumption was right, Petr.
Regards,
-- 
Michael


20141215_committs_mingw_fix.patch
Description: Binary data



Re: [HACKERS] tracking commit timestamps

2014-12-15 Thread Petr Jelinek

On 15/12/14 09:12, Michael Paquier wrote:

On Wed, Dec 10, 2014 at 6:50 PM, Noah Misch n...@leadboat.com wrote:

On Mon, Dec 08, 2014 at 02:23:39AM +0100, Petr Jelinek wrote:

On 08/12/14 00:56, Noah Misch wrote:

The commit_ts test suite gives me the attached diff on a 32-bit MinGW build
running on 64-bit Windows Server 2003.  I have not checked other Windows
configurations; the suite does pass on GNU/Linux.


Hmm I wonder if < now() needs to be changed to <= now() in those queries
to make them work correctly on that platform, I don't have a machine with that
environment handy right now, so I would appreciate it if you could try that, in
case you don't have time for that, I will try to set up something later...


I will try that, though perhaps not until next week.


FWIW, I just tried that with MinGW-32 and I can see the error on Win7.
I also checked that changing < now() to <= now() fixed the
problem, so your assumption was right, Petr.
Regards,



Cool, thanks, I think it was the time granularity problem in Windows.

--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-12-15 Thread Noah Misch
On Mon, Dec 15, 2014 at 12:12:10AM -0800, Michael Paquier wrote:
 On Wed, Dec 10, 2014 at 6:50 PM, Noah Misch n...@leadboat.com wrote:
  On Mon, Dec 08, 2014 at 02:23:39AM +0100, Petr Jelinek wrote:
  On 08/12/14 00:56, Noah Misch wrote:
  The commit_ts test suite gives me the attached diff on a 32-bit MinGW 
  build
  running on 64-bit Windows Server 2003.  I have not checked other Windows
  configurations; the suite does pass on GNU/Linux.
 
  Hmm I wonder if < now() needs to be changed to <= now() in those queries
  to make them work correctly on that platform, I don't have a machine with
  that environment handy right now, so I would appreciate it if you could try
  that, in case you don't have time for that, I will try to set up something
  later...
 
  I will try that, though perhaps not until next week.
 
 FWIW, I just tried that with MinGW-32 and I can see the error on Win7.
 I also checked that changing < now() to <= now() fixed the
 problem, so your assumption was right, Petr.

Committed, after fixing the alternate expected output.




Re: [HACKERS] tracking commit timestamps

2014-12-15 Thread Alvaro Herrera
Noah Misch wrote:
 On Mon, Dec 15, 2014 at 12:12:10AM -0800, Michael Paquier wrote:

  FWIW, I just tried that with MinGW-32 and I can see the error on Win7.
  I also checked that changing < now() to <= now() fixed the
  problem, so your assumption was right, Petr.
 
 Committed, after fixing the alternate expected output.

Thanks.  I admit I don't understand what the issue is.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-12-10 Thread Noah Misch
On Mon, Dec 08, 2014 at 02:23:39AM +0100, Petr Jelinek wrote:
 On 08/12/14 00:56, Noah Misch wrote:
 The commit_ts test suite gives me the attached diff on a 32-bit MinGW build
 running on 64-bit Windows Server 2003.  I have not checked other Windows
 configurations; the suite does pass on GNU/Linux.
 
 Hmm I wonder if < now() needs to be changed to <= now() in those queries
 to make them work correctly on that platform, I don't have a machine with that
 environment handy right now, so I would appreciate it if you could try that, in
 case you don't have time for that, I will try to set up something later...

I will try that, though perhaps not until next week.




Re: [HACKERS] tracking commit timestamps

2014-12-07 Thread Noah Misch
On Wed, Dec 03, 2014 at 11:54:38AM -0300, Alvaro Herrera wrote:
 Pushed with some extra cosmetic tweaks.

The commit_ts test suite gives me the attached diff on a 32-bit MinGW build
running on 64-bit Windows Server 2003.  I have not checked other Windows
configurations; the suite does pass on GNU/Linux.
*** Z:/nm/postgresql/src/test/modules/commit_ts/expected/commit_timestamp.out   
2014-12-05 05:43:01.07442 +
--- Z:/nm/postgresql/src/test/modules/commit_ts/results/commit_timestamp.out
2014-12-05 08:24:13.094705200 +
***
*** 19,27 
  ORDER BY id;
   id | ?column? | ?column? | ?column? 
  ----+----------+----------+----------
!   1 | t| t| t
!   2 | t| t| t
!   3 | t| t| t
  (3 rows)
  
  DROP TABLE committs_test;
--- 19,27 
  ORDER BY id;
   id | ?column? | ?column? | ?column? 
  ----+----------+----------+----------
!   1 | t| f| t
!   2 | t| f| t
!   3 | t| f| t
  (3 rows)
  
  DROP TABLE committs_test;
***
*** 34,39 
  SELECT x.xid::text::bigint > 0, x.timestamp > '-infinity'::timestamptz, 
x.timestamp < now() FROM pg_last_committed_xact() x;
   ?column? | ?column? | ?column? 
  ----------+----------+----------
!  t| t| t
  (1 row)
  
--- 34,39 
  SELECT x.xid::text::bigint > 0, x.timestamp > '-infinity'::timestamptz, 
x.timestamp < now() FROM pg_last_committed_xact() x;
   ?column? | ?column? | ?column? 
  ----------+----------+----------
!  t| t| f
  (1 row)
  

==




Re: [HACKERS] tracking commit timestamps

2014-12-07 Thread Petr Jelinek

On 08/12/14 00:56, Noah Misch wrote:

On Wed, Dec 03, 2014 at 11:54:38AM -0300, Alvaro Herrera wrote:

Pushed with some extra cosmetic tweaks.


The commit_ts test suite gives me the attached diff on a 32-bit MinGW build
running on 64-bit Windows Server 2003.  I have not checked other Windows
configurations; the suite does pass on GNU/Linux.



Hmm I wonder if < now() needs to be changed to <= now() in those queries
to make them work correctly on that platform, I don't have a machine with
that environment handy right now, so I would appreciate it if you could try
that, in case you don't have time for that, I will try to set up something
later...
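
For illustration, here is a sketch of the kind of comparison the test makes
(assuming track_commit_timestamp = on; committs_test is the table the test
creates):

SELECT id, pg_xact_commit_timestamp(xmin) <= now()
FROM committs_test
ORDER BY id;
-- With a coarse timer the commit timestamp can be equal to now() taken in
-- the same tick, so a strict '<' can report false; '<=' stays true.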


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-12-03 Thread Alvaro Herrera
Pushed with some extra cosmetic tweaks.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-12-03 Thread Petr Jelinek

On 03/12/14 15:54, Alvaro Herrera wrote:

Pushed with some extra cosmetic tweaks.



Cool, thanks!

--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-12-03 Thread Robert Haas
On Mon, Dec 1, 2014 at 5:34 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 I made two more changes:
 1. introduce newestCommitTs.  Original code was using lastCommitXact to
 check that no future transaction is asked for, but this doesn't really
 work if a long-running transaction is committed, because asking for
 transactions with a higher Xid but which were committed earlier would
 raise an error.

I'm kind of disappointed that, in spite of previous review comments,
this got committed with extensive use of the CommitTs naming.  I think
that's confusing, but it's also something that will be awkward if we
want to add other data, such as the much-discussed commit LSN, to the
facility.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] tracking commit timestamps

2014-12-03 Thread Alvaro Herrera
Robert Haas wrote:
 On Mon, Dec 1, 2014 at 5:34 PM, Alvaro Herrera alvhe...@2ndquadrant.com 
 wrote:
  I made two more changes:
  1. introduce newestCommitTs.  Original code was using lastCommitXact to
  check that no future transaction is asked for, but this doesn't really
  work if a long-running transaction is committed, because asking for
  transactions with a higher Xid but which were committed earlier would
  raise an error.
 
 I'm kind of disappointed that, in spite of previous review comments,
 this got committed with extensive use of the CommitTs naming.  I think
 that's confusing, but it's also something that will be awkward if we
 want to add other data, such as the much-discussed commit LSN, to the
 facility.

I never saw a comment that CommitTs was an unwanted name.  There were
some that said that committs wasn't liked because it looked like a
misspelling, so we added an underscore -- stuff in lower case is
commit_ts everywhere.  Stuff in camel case didn't get the underscore
because it didn't seem necessary.  But other than that issue, the name
wasn't questioned, as far as I'm aware.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-12-03 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Robert Haas wrote:

  I'm kind of disappointed that, in spite of previous review comments,
  this got committed with extensive use of the CommitTs naming.  I think
  that's confusing, but it's also something that will be awkward if we
  want to add other data, such as the much-discussed commit LSN, to the
  facility.
 
 I never saw a comment that CommitTs was an unwanted name.  There were
 some that said that committs wasn't liked because it looked like a
 misspelling, so we added an underscore -- stuff in lower case is
 commit_ts everywhere.  Stuff in camel case didn't get the underscore
 because it didn't seem necessary.  But other than that issue, the name
 wasn't questioned, as far as I'm aware.

I found one email where you said you didn't like committs and preferred
commit_timestamp instead.  I don't see how making that change would have
made you happy wrt the concern you just expressed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-12-03 Thread Robert Haas
On Wed, Dec 3, 2014 at 2:36 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Alvaro Herrera wrote:
 Robert Haas wrote:
  I'm kind of disappointed that, in spite of previous review comments,
  this got committed with extensive use of the CommitTs naming.  I think
  that's confusing, but it's also something that will be awkward if we
  want to add other data, such as the much-discussed commit LSN, to the
  facility.

 I never saw a comment that CommitTs was an unwanted name.  There were
 some that said that committs wasn't liked because it looked like a
 misspelling, so we added an underscore -- stuff in lower case is
 commit_ts everywhere.  Stuff in camel case didn't get the underscore
 because it didn't seem necessary.  But other than that issue, the name
 wasn't questioned, as far as I'm aware.

 I found one email where you said you didn't like committs and preferred
 commit_timestamp instead.  I don't see how making that change would have
 made you happy wrt the concern you just expressed.

Fair point.

I'm still not sure we got this one right, but I don't know that I want
to spend more time wrangling about it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] tracking commit timestamps

2014-12-03 Thread Fujii Masao
On Wed, Dec 3, 2014 at 11:54 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Pushed with some extra cosmetic tweaks.

I got the following assertion failure when I executed pg_xact_commit_timestamp()
in the standby server.

=# select pg_xact_commit_timestamp('1000'::xid);
TRAP: FailedAssertion(!(((oldestCommitTs) != ((TransactionId) 0)) ==
((newestCommitTs) != ((TransactionId) 0))), File: commit_ts.c,
Line: 315)
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: 2014-12-04
12:01:08 JST sby1 LOG:  server process (PID 15545) was terminated by
signal 6: Aborted
2014-12-04 12:01:08 JST sby1 DETAIL:  Failed process was running:
select pg_xact_commit_timestamp('1000'::xid);

The way to reproduce this problem is

#1. set up and start the master and standby servers with
track_commit_timestamp disabled
#2. enable track_commit_timestamp in the master and restart the master
#3. run some write transactions
#4. enable track_commit_timestamp in the standby and restart the standby
#5. execute select pg_xact_commit_timestamp('1000'::xid) in the standby

BTW, at the step #4, I got the following log messages. This might be a hint for
this problem.

LOG:  file pg_commit_ts/ doesn't exist, reading as zeroes
CONTEXT:  xlog redo Transaction/COMMIT: 2014-12-04 12:00:16.428702+09;
inval msgs: catcache 59 catcache 58 catcache 59 catcache 58 catcache
45 catcache 44 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7
catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608
relcache 16384

Regards,

-- 
Fujii Masao




Re: [HACKERS] tracking commit timestamps

2014-12-03 Thread Simon Riggs
On 4 December 2014 at 03:08, Fujii Masao masao.fu...@gmail.com wrote:

 #1. set up and start the master and standby servers with
 track_commit_timestamp disabled
 #2. enable track_commit_timestamp in the master and restart the master
 #3. run some write transactions
 #4. enable track_commit_timestamp in the standby and restart the standby
 #5. execute select pg_xact_commit_timestamp('1000'::xid) in the standby

I'm not sure what step 4 is supposed to do?

Surely if steps 1-3 generate any WAL then the standby should replay
it, whether or not track_commit_timestamp is enabled.

So what effect does setting that parameter have on the standby?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-12-03 Thread Fujii Masao
On Thu, Dec 4, 2014 at 12:58 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 4 December 2014 at 03:08, Fujii Masao masao.fu...@gmail.com wrote:

 #1. set up and start the master and standby servers with
 track_commit_timestamp disabled
 #2. enable track_commit_timestamp in the master and restart the master
 #3. run some write transactions
 #4. enable track_commit_timestamp in the standby and restart the standby
 #5. execute select pg_xact_commit_timestamp('1000'::xid) in the standby

 I'm not sure what step 4 is supposed to do?

 Surely if steps 1-3 generate any WAL then the standby should replay
 it, whether or not track_commit_timestamp is enabled.

 So what effect does setting that parameter have on the standby?

At least track_commit_timestamp seems to need to be enabled even in the standby
when we want to call pg_xact_commit_timestamp() and pg_last_committed_xact()
in the standby. I'm not sure if this is good design, though.

Regards,

-- 
Fujii Masao




Re: [HACKERS] tracking commit timestamps

2014-12-01 Thread Alvaro Herrera
Petr Jelinek wrote:
 On 25/11/14 16:30, Alvaro Herrera wrote:
 Petr Jelinek wrote:
 On 25/11/14 16:23, Alvaro Herrera wrote:
 Robert Haas wrote:
 
 Maybe 0 should get translated to a NULL return, instead of a bogus 
 timestamp.
 
 That's one idea --- surely no transaction is going to commit at 00:00:00
 on 2000-01-01 anymore.  Yet this is somewhat discomforting.
 
 I solved it for xids that are out of range by returning -infinity and then
 changing that to NULL in sql interface, but no idea how to do that for
 aborted transactions.
 
 I guess the idea is that we just read the value from the slru and if it
 exactly matches allballs we do the same -infinity return and translation
 to NULL.  (Do we really love this -infinity idea?  If it's just an
 internal API we can use a boolean instead.)
 
 As in returning boolean instead of void as found? That works for me
 (for the C interface).

Petr sent me privately some changes to implement this idea.  I also
reworked the tests so that they only happen on src/test/modules (getting
rid of the one in core regress) and made them work with both enabled and
disabled track_commit_timestamps; in make installcheck, they pass
regardless of the setting of the installed server, and in make check,
they run a server with the setting enabled.

I made two more changes:
1. introduce newestCommitTs.  Original code was using lastCommitXact to
check that no future transaction is asked for, but this doesn't really
work if a long-running transaction is committed, because asking for
transactions with a higher Xid but which were committed earlier would
raise an error.

2. change CommitTsControlLock to CommitTsLock as the lock that protects
the stuff we keep in ShmemVariableCache.  The Control one is for SLRU
access, and so it might be slow at times.  This is important because we
fill the checkpoint struct from values protected by that lock, so a
checkpoint might be delayed if it happens to land in the middle of a
slru IO operation.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c
index 3b8241b..f0a023f 100644
--- a/contrib/pg_upgrade/pg_upgrade.c
+++ b/contrib/pg_upgrade/pg_upgrade.c
@@ -423,8 +423,10 @@ copy_clog_xlog_xid(void)
 	/* set the next transaction id and epoch of the new cluster */
 	prep_status("Setting next transaction ID and epoch for new cluster");
 	exec_prog(UTILITY_LOG_FILE, NULL, true,
-			  "\"%s/pg_resetxlog\" -f -x %u \"%s\"",
-			  new_cluster.bindir, old_cluster.controldata.chkpnt_nxtxid,
+			  "\"%s/pg_resetxlog\" -f -x %u -c %u \"%s\"",
+			  new_cluster.bindir,
+			  old_cluster.controldata.chkpnt_nxtxid,
+			  old_cluster.controldata.chkpnt_nxtxid,
 			  new_cluster.pgdata);
 	exec_prog(UTILITY_LOG_FILE, NULL, true,
 			  "\"%s/pg_resetxlog\" -f -e %u \"%s\"",
diff --git a/contrib/pg_xlogdump/rmgrdesc.c b/contrib/pg_xlogdump/rmgrdesc.c
index 9397198..180818d 100644
--- a/contrib/pg_xlogdump/rmgrdesc.c
+++ b/contrib/pg_xlogdump/rmgrdesc.c
@@ -10,6 +10,7 @@
 
 #include "access/brin_xlog.h"
 #include "access/clog.h"
+#include "access/commit_ts.h"
 #include "access/gin.h"
 #include "access/gist_private.h"
 #include "access/hash.h"
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index ab8c263..e3713d3 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2673,6 +2673,20 @@ include_dir 'conf.d'
   </listitem>
  </varlistentry>
 
+ <varlistentry id="guc-track-commit-timestamp" xreflabel="track_commit_timestamp">
+  <term><varname>track_commit_timestamp</varname> (<type>bool</type>)</term>
+  <indexterm>
+   <primary><varname>track_commit_timestamp</> configuration parameter</primary>
+  </indexterm>
+  <listitem>
+   <para>
+    Record commit time of transactions. This parameter
+    can only be set in <filename>postgresql.conf</> file or on the server
+    command line. The default value is <literal>off</literal>.
+   </para>
+  </listitem>
+ </varlistentry>
+
  </variablelist>
 </sect2>
 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index baf81ee..62ec275 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15938,6 +15938,45 @@ SELECT collation for ('foo' COLLATE de_DE);
     For example <literal>10:20:10,14,15</literal> means
     <literal>xmin=10, xmax=20, xip_list=10, 14, 15</literal>.
    </para>
+
+   <para>
+    The functions shown in <xref linkend="functions-commit-timestamp">
+    provide information about transactions that have been already committed.
+    These functions mainly provide information about when the transactions
+    were committed. They only provide useful data when
+    <xref linkend="guc-track-commit-timestamp"> configuration option is enabled
+    and only for transactions that were committed after it was enabled.
+   </para>
+
+   <table id="functions-commit-timestamp">
+    <title>Committed transaction information</title>
+    <tgroup cols="3">
+     <thead>
+  

Re: [HACKERS] tracking commit timestamps

2014-11-25 Thread Fujii Masao
On Tue, Nov 25, 2014 at 7:58 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 And here is v10 which fixes conflicts with Heikki's WAL API changes (no
 changes otherwise).

 After some slight additional changes, here's v11, which I intend to
 commit early tomorrow.  The main change is moving the test module from
 contrib to src/test/modules.

When I specify the XID of the aborted transaction in pg_xact_commit_timestamp(),
it always returns 2000-01-01 09:00:00+09. Is this intentional?

Can I check my understanding? Probably we cannot use this feature to calculate
the actual replication lag by, for example, comparing the result of
pg_last_committed_xact() in the master and that of
pg_last_xact_replay_timestamp()
in the standby. Because pg_last_xact_replay_timestamp() can return even
the timestamp of aborted transaction, but pg_last_committed_xact()
cannot. Right?

Regards,

-- 
Fujii Masao




Re: [HACKERS] tracking commit timestamps

2014-11-25 Thread Alvaro Herrera
Fujii Masao wrote:
 On Tue, Nov 25, 2014 at 7:58 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  And here is v10 which fixes conflicts with Heikki's WAL API changes (no
  changes otherwise).
 
  After some slight additional changes, here's v11, which I intend to
  commit early tomorrow.  The main change is moving the test module from
  contrib to src/test/modules.
 
 When I specify the XID of the aborted transaction in 
 pg_xact_commit_timestamp(),
 it always returns 2000-01-01 09:00:00+09. Is this intentional?

Well, when a transaction has not committed, nothing is written so on
reading we get all zeroes which corresponds to the timestamp you give.
So yeah, it is intentional.  We could alternatively check pg_clog and
raise an error if the transaction is not marked either COMMITTED or
SUBCOMMITTED, but I'm not real sure there's much point.

The other option is to record a commit time for aborted transactions
too, but that doesn't seem very good either: first, this doesn't do
anything for crashed or for in-progress transactions; and second, how 
does it make sense to have a commit time for a transaction that
doesn't actually commit?
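
For illustration, a sketch of the behaviour being discussed (assuming
track_commit_timestamp = on; the xid is made up):

BEGIN;
SELECT txid_current();   -- say this reports 1000
ROLLBACK;
-- Nothing was written to the commit timestamp SLRU for the aborted xact,
-- so reading it back yields the all-zeroes value, i.e. the epoch:
SELECT pg_xact_commit_timestamp('1000'::xid);
-- 2000-01-01 00:00:00+00  (the 2000-01-01 09:00:00+09 seen above, in a JST session)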

 Can I check my understanding? Probably we cannot use this feature to calculate
 the actual replication lag by, for example, comparing the result of
 pg_last_committed_xact() in the master and that of
 pg_last_xact_replay_timestamp()
 in the standby. Because pg_last_xact_replay_timestamp() can return even
 the timestamp of aborted transaction, but pg_last_committed_xact()
 cannot. Right?

I don't think it's suited for that.  I guess if you recorded the time
of the last transaction that actually committed, you can use that.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-11-25 Thread Fujii Masao
On Tue, Nov 25, 2014 at 11:19 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Fujii Masao wrote:
 On Tue, Nov 25, 2014 at 7:58 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  And here is v10 which fixes conflicts with Heikki's WAL API changes (no
  changes otherwise).
 
  After some slight additional changes, here's v11, which I intend to
  commit early tomorrow.  The main change is moving the test module from
  contrib to src/test/modules.

 When I specify the XID of the aborted transaction in 
 pg_xact_commit_timestamp(),
 it always returns 2000-01-01 09:00:00+09. Is this intentional?

 Well, when a transaction has not committed, nothing is written so on
 reading we get all zeroes which corresponds to the timestamp you give.
 So yeah, it is intentional.  We could alternatively check pg_clog and
 raise an error if the transaction is not marked either COMMITTED or
 SUBCOMMITTED, but I'm not real sure there's much point.

 The other option is to record a commit time for aborted transactions
 too, but that doesn't seem very good either: first, this doesn't do
 anything for crashed or for in-progress transactions; and second, how
 does it make sense to have a commit time for a transaction that
 doesn't actually commit?

What about the PREPARE and COMMIT PREPARED transactions?
ISTM that this feature tracks neither of them.

Regards,

-- 
Fujii Masao




Re: [HACKERS] tracking commit timestamps

2014-11-25 Thread Robert Haas
On Tue, Nov 25, 2014 at 9:19 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Fujii Masao wrote:
 On Tue, Nov 25, 2014 at 7:58 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  And here is v10 which fixes conflicts with Heikki's WAL API changes (no
  changes otherwise).
 
  After some slight additional changes, here's v11, which I intend to
  commit early tomorrow.  The main change is moving the test module from
  contrib to src/test/modules.

 When I specify the XID of the aborted transaction in 
 pg_xact_commit_timestamp(),
 it always returns 2000-01-01 09:00:00+09. Is this intentional?

 Well, when a transaction has not committed, nothing is written so on
 reading we get all zeroes which corresponds to the timestamp you give.
 So yeah, it is intentional.  We could alternatively check pg_clog and
 raise an error if the transaction is not marked either COMMITTED or
 SUBCOMMITTED, but I'm not real sure there's much point.

Maybe 0 should get translated to a NULL return, instead of a bogus timestamp.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] tracking commit timestamps

2014-11-25 Thread Alvaro Herrera
Robert Haas wrote:
 On Tue, Nov 25, 2014 at 9:19 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  Fujii Masao wrote:
  On Tue, Nov 25, 2014 at 7:58 AM, Alvaro Herrera
  alvhe...@2ndquadrant.com wrote:
   And here is v10 which fixes conflicts with Heikki's WAL API changes (no
   changes otherwise).
  
   After some slight additional changes, here's v11, which I intend to
   commit early tomorrow.  The main change is moving the test module from
   contrib to src/test/modules.
 
  When I specify the XID of the aborted transaction in 
  pg_xact_commit_timestamp(),
  it always returns 2000-01-01 09:00:00+09. Is this intentional?
 
  Well, when a transaction has not committed, nothing is written so on
  reading we get all zeroes which corresponds to the timestamp you give.
  So yeah, it is intentional.  We could alternatively check pg_clog and
  raise an error if the transaction is not marked either COMMITTED or
  SUBCOMMITTED, but I'm not real sure there's much point.
 
 Maybe 0 should get translated to a NULL return, instead of a bogus timestamp.

That's one idea --- surely no transaction is going to commit at 00:00:00
on 2000-01-01 anymore.  Yet this is somewhat discomforting.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-11-25 Thread Petr Jelinek

On 25/11/14 16:23, Alvaro Herrera wrote:

Robert Haas wrote:

On Tue, Nov 25, 2014 at 9:19 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:

Fujii Masao wrote:

On Tue, Nov 25, 2014 at 7:58 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:

And here is v10 which fixes conflicts with Heikki's WAL API changes (no
changes otherwise).


After some slight additional changes, here's v11, which I intend to
commit early tomorrow.  The main change is moving the test module from
contrib to src/test/modules.


When I specify the XID of the aborted transaction in pg_xact_commit_timestamp(),
it always returns 2000-01-01 09:00:00+09. Is this intentional?


Well, when a transaction has not committed, nothing is written so on
reading we get all zeroes which corresponds to the timestamp you give.
So yeah, it is intentional.  We could alternatively check pg_clog and
raise an error if the transaction is not marked either COMMITTED or
SUBCOMMITTED, but I'm not real sure there's much point.


Maybe 0 should get translated to a NULL return, instead of a bogus timestamp.


That's one idea --- surely no transaction is going to commit at 00:00:00
on 2000-01-01 anymore.  Yet this is somewhat discomforting.



I solved it for xids that are out of range by returning -infinity and 
then changing that to NULL in sql interface, but no idea how to do that 
for aborted transactions.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-11-25 Thread Alvaro Herrera
Petr Jelinek wrote:
 On 25/11/14 16:23, Alvaro Herrera wrote:
 Robert Haas wrote:

 Maybe 0 should get translated to a NULL return, instead of a bogus 
 timestamp.
 
 That's one idea --- surely no transaction is going to commit at 00:00:00
 on 2000-01-01 anymore.  Yet this is somewhat discomforting.
 
 I solved it for xids that are out of range by returning -infinity and then
 changing that to NULL in sql interface, but no idea how to do that for
 aborted transactions.

I guess the idea is that we just read the value from the slru and if it
exactly matches allballs we do the same -infinity return and translation
to NULL.  (Do we really love this -infinity idea?  If it's just an
internal API we can use a boolean instead.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-11-25 Thread Petr Jelinek

On 25/11/14 16:30, Alvaro Herrera wrote:

Petr Jelinek wrote:

On 25/11/14 16:23, Alvaro Herrera wrote:

Robert Haas wrote:



Maybe 0 should get translated to a NULL return, instead of a bogus timestamp.


That's one idea --- surely no transaction is going to commit at 00:00:00
on 2000-01-01 anymore.  Yet this is somewhat discomforting.


I solved it for xids that are out of range by returning -infinity and then
changing that to NULL in sql interface, but no idea how to do that for
aborted transactions.


I guess the idea is that we just read the value from the slru and if it
exactly matches allballs we do the same -infinity return and translation
to NULL.  (Do we really love this -infinity idea?  If it's just an
internal API we can use a boolean instead.)



As in returning boolean instead of void as found? That works for me 
(for the C interface).


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-11-25 Thread Simon Riggs
On 25 November 2014 at 13:35, Fujii Masao masao.fu...@gmail.com wrote:

 Can I check my understanding? Probably we cannot use this feature to calculate
 the actual replication lag by, for example, comparing the result of
 pg_last_committed_xact() in the master and that of
 pg_last_xact_replay_timestamp()
 in the standby. Because pg_last_xact_replay_timestamp() can return even
 the timestamp of aborted transaction, but pg_last_committed_xact()
 cannot. Right?

It was intended for that, but I forgot that
pg_last_xact_replay_timestamp() includes abort as well.

I suggest we add a function that returns both the xid and timestamp on
the standby:
* pg_last_commit_replay_info() - which returns both the xid and
timestamp of the last commit replayed on standby
* then we use the xid from the standby to look up the commit timestamp
on the master.
We then have two timestamps that refer to the same xid and can
subtract to give us an exact replication lag.

That can be done manually by the user if requested.
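
As a sketch of that manual calculation (pg_last_commit_replay_info() here is
the proposed function, not something that exists yet, and the xid is made up):

-- on the standby: xid of the last replayed commit and the standby-local
-- time at which it was replayed
SELECT xid, timestamp AS replayed_at FROM pg_last_commit_replay_info();

-- on the master: when that same xid actually committed there
SELECT pg_xact_commit_timestamp('1234'::xid) AS committed_at;

-- lag for that transaction = replayed_at - committed_at, computed on the
-- client side since the two values come from different servers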

We can also do that by sending the replay info back as a feedback
message from standby to master, so the information can be calculated
by pg_stat_replication when requested.

I'll work on that once we have this current patch committed.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-11-25 Thread Petr Jelinek

On 25/11/14 17:16, Simon Riggs wrote:

On 25 November 2014 at 13:35, Fujii Masao masao.fu...@gmail.com wrote:


Can I check my understanding? Probably we cannot use this feature to calculate
the actual replication lag by, for example, comparing the result of
pg_last_committed_xact() in the master and that of
pg_last_xact_replay_timestamp()
in the standby. Because pg_last_xact_replay_timestamp() can return even
the timestamp of aborted transaction, but pg_last_committed_xact()
cannot. Right?


It was intended for that, but I forgot that
pg_last_xact_replay_timestamp() includes abort as well.

I suggest we add a function that returns both the xid and timestamp on
the standby:
* pg_last_commit_replay_info() - which returns both the xid and
timestamp of the last commit replayed on standby
* then we use the xid from the standby to lookup the commit timestamp
on the master.
We then have two timestamps that refer to the same xid and can
subtract to give us an exact replication lag.



Won't the pg_last_committed_xact() on slave + pg_xact_commit_timestamp() 
on master with the xid returned by slave accomplish the same thing?



--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-25 Thread Simon Riggs
On 25 November 2014 at 16:18, Petr Jelinek p...@2ndquadrant.com wrote:

 Won't the pg_last_committed_xact() on slave + pg_xact_commit_timestamp() on
 master with the xid returned by slave accomplish the same thing?

Surely the pg_last_committed_xact() will return the same value on
standby as it did on the master?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-25 Thread Michael Paquier
On Wed, Nov 26, 2014 at 1:51 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 25 November 2014 at 16:18, Petr Jelinek p...@2ndquadrant.com wrote:

 Won't the pg_last_committed_xact() on slave + pg_xact_commit_timestamp() on
 master with the xid returned by slave accomplish the same thing?

 Surely the pg_last_committed_xact() will return the same value on
 standby as it did on the master?

It should. Note though that it needs some extra help: in its current
shape this patch will only WAL-log a commit timestamp if the node ID is
valid, per RecordTransactionCommit. The node ID can be set only
through CommitTsSetDefaultNodeId, which is currently not called anywhere, so
an extension or an extra library has to do some leg work to allow this
information to be replayed on other nodes.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-24 Thread Alvaro Herrera
 And here is v10 which fixes conflicts with Heikki's WAL API changes (no
 changes otherwise).

After some slight additional changes, here's v11, which I intend to
commit early tomorrow.  The main change is moving the test module from
contrib to src/test/modules.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services
*** a/contrib/pg_upgrade/pg_upgrade.c
--- b/contrib/pg_upgrade/pg_upgrade.c
***
*** 423,430  copy_clog_xlog_xid(void)
  	/* set the next transaction id and epoch of the new cluster */
  	prep_status(Setting next transaction ID and epoch for new cluster);
  	exec_prog(UTILITY_LOG_FILE, NULL, true,
! 			  \%s/pg_resetxlog\ -f -x %u \%s\,
! 			  new_cluster.bindir, old_cluster.controldata.chkpnt_nxtxid,
  			  new_cluster.pgdata);
  	exec_prog(UTILITY_LOG_FILE, NULL, true,
  			  \%s/pg_resetxlog\ -f -e %u \%s\,
--- 423,432 
  	/* set the next transaction id and epoch of the new cluster */
  	prep_status(Setting next transaction ID and epoch for new cluster);
  	exec_prog(UTILITY_LOG_FILE, NULL, true,
! 			  \%s/pg_resetxlog\ -f -x %u -c %u \%s\,
! 			  new_cluster.bindir,
! 			  old_cluster.controldata.chkpnt_nxtxid,
! 			  old_cluster.controldata.chkpnt_nxtxid,
  			  new_cluster.pgdata);
  	exec_prog(UTILITY_LOG_FILE, NULL, true,
  			  \%s/pg_resetxlog\ -f -e %u \%s\,
*** a/contrib/pg_xlogdump/rmgrdesc.c
--- b/contrib/pg_xlogdump/rmgrdesc.c
***
*** 10,15 
--- 10,16 
  
  #include access/brin_xlog.h
  #include access/clog.h
+ #include access/commit_ts.h
  #include access/gin.h
  #include access/gist_private.h
  #include access/hash.h
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 2673,2678  include_dir 'conf.d'
--- 2673,2692 
/listitem
   /varlistentry
  
+  varlistentry id=guc-track-commit-timestamp xreflabel=track_commit_timestamp
+   termvarnametrack_commit_timestamp/varname (typebool/type)/term
+   indexterm
+primaryvarnametrack_commit_timestamp/ configuration parameter/primary
+   /indexterm
+   listitem
+para
+ Record commit time of transactions. This parameter
+ can only be set in filenamepostgresql.conf/ file or on the server
+ command line. The default value is literaloff/literal.
+/para
+   /listitem
+  /varlistentry
+ 
   /variablelist
  /sect2
  
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 15923,15928  SELECT collation for ('foo' COLLATE de_DE);
--- 15923,15960 
  For example literal10:20:10,14,15/literal means
  literalxmin=10, xmax=20, xip_list=10, 14, 15/literal.
 /para
+ 
+para
+ The functions shown in xref linkend=functions-committs
+ provide information about transactions that have been already committed.
+ These functions mainly provide information about when the transactions
+ were committed. They only provide useful data when
+ xref linkend=guc-track-commit-timestamp configuration option is enabled
+ and only for transactions that were committed after it was enabled.
+/para
+ 
+table id=functions-committs
+ titleCommitted transaction information/title
+ tgroup cols=3
+  thead
+   rowentryName/entry entryReturn Type/entry entryDescription/entry/row
+  /thead
+ 
+  tbody
+   row
+entryliteralfunctionpg_xact_commit_timestamp(parameterxid/parameter)/function/literal/entry
+entrytypetimestamp with time zone/type/entry
+entryget commit timestamp of a transaction/entry
+   /row
+   row
+entryliteralfunctionpg_last_committed_xact()/function/literal/entry
+entryparameterxid/ typexid/, parametertimestamp/ typetimestamp with time zone//entry
+entryget transaction Id and commit timestamp of latest transaction commit/entry
+   /row
+  /tbody
+ /tgroup
+/table
+ 
/sect1
  
sect1 id=functions-admin
*** a/doc/src/sgml/ref/pg_resetxlog.sgml
--- b/doc/src/sgml/ref/pg_resetxlog.sgml
***
*** 22,27  PostgreSQL documentation
--- 22,28 
   refsynopsisdiv
cmdsynopsis
 commandpg_resetxlog/command
+arg choice=optoption-c/option replaceable class=parameterxid/replaceable/arg
 arg choice=optoption-f/option/arg
 arg choice=optoption-n/option/arg
 arg choice=optoption-o/option replaceable class=parameteroid/replaceable/arg
***
*** 77,88  PostgreSQL documentation
/para
  
para
!The option-o/, option-x/, option-e/,
!option-m/, option-O/,
!and option-l/
 options allow the next OID, next transaction ID, next transaction ID's
!epoch, next and oldest multitransaction ID, next multitransaction offset, and WAL
!starting address values to be set manually.  These are only needed when
 commandpg_resetxlog/command is unable to determine appropriate values
 by reading 

Re: [HACKERS] tracking commit timestamps

2014-11-20 Thread Petr Jelinek

On 19/11/14 17:30, Steve Singer wrote:

On 11/19/2014 08:22 AM, Alvaro Herrera wrote:


I think we're overblowing the pg_upgrade issue.  Surely we don't need to
preserve commit_ts data when upgrading across major versions; and
pg_upgrade is perfectly prepared to remove old data when upgrading
(actually it just doesn't copy it; consider pg_subtrans or pg_serial,
for instance.)  If we need to change binary representation in a future
major release, we can do so without any trouble.



That sounds reasonable. I am okay with Petr removing the LSN portion of
this patch.



I did that then, v9 attached with following changes:
 - removed lsn info (obviously)

 - the pg_xact_commit_timestamp and pg_last_committed_xact return NULLs 
when timestamp data was not found


 - made the default nodeid crash safe - this also makes use of the 
do_xlog parameter in TransactionTreeSetCommitTsData if nodeid is set, 
although that still does not happen without extension actually using the API


 - added some more regression tests

 - some small comment and docs adjustments based on Michael's last email

I didn't change the pg_last_committed_xact function name and I didn't 
make nodeid visible from SQL level interfaces and don't plan to in this 
patch as I think it's very premature to do so before we have some C code 
using it.


Just to explain once more, and hopefully more clearly, how the crash 
safety/WAL logging is handled, since there was some confusion in the last review:
We only do WAL logging when nodeid is also logged (when nodeid is not 0), 
because the timestamp itself can be read from the WAL record of the transaction 
commit, so it's pointless to log another WAL record just to store the 
timestamp again.
An extension can either set a default nodeid, which is then logged 
transparently, or can override the default logging mechanism by calling 
TransactionTreeSetCommitTsData with whatever data it wants and do_xlog 
set to true, which will then write the WAL record with this overriding 
information.
During WAL replay the commit timestamp is set from the transaction commit 
record, and then if a committs WAL record is found it's used to overwrite 
the commit timestamp/nodeid for the given xid.


Also, there is exactly one record in SLRU for each xid so there is no 
point in making the SQL interfaces return multiple results.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/contrib/Makefile b/contrib/Makefile
index b37d0dd..e331297 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -50,6 +50,7 @@ SUBDIRS = \
 		spi		\
 		tablefunc	\
 		tcn		\
+		test_committs	\
 		test_decoding	\
 		test_parser	\
 		test_shm_mq	\
diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c
index 3b8241b..f0a023f 100644
--- a/contrib/pg_upgrade/pg_upgrade.c
+++ b/contrib/pg_upgrade/pg_upgrade.c
@@ -423,8 +423,10 @@ copy_clog_xlog_xid(void)
 	/* set the next transaction id and epoch of the new cluster */
 	prep_status(Setting next transaction ID and epoch for new cluster);
 	exec_prog(UTILITY_LOG_FILE, NULL, true,
-			  \%s/pg_resetxlog\ -f -x %u \%s\,
-			  new_cluster.bindir, old_cluster.controldata.chkpnt_nxtxid,
+			  \%s/pg_resetxlog\ -f -x %u -c %u \%s\,
+			  new_cluster.bindir,
+			  old_cluster.controldata.chkpnt_nxtxid,
+			  old_cluster.controldata.chkpnt_nxtxid,
 			  new_cluster.pgdata);
 	exec_prog(UTILITY_LOG_FILE, NULL, true,
 			  \%s/pg_resetxlog\ -f -e %u \%s\,
diff --git a/contrib/pg_xlogdump/rmgrdesc.c b/contrib/pg_xlogdump/rmgrdesc.c
index 9397198..e0af3cf 100644
--- a/contrib/pg_xlogdump/rmgrdesc.c
+++ b/contrib/pg_xlogdump/rmgrdesc.c
@@ -10,6 +10,7 @@
 
 #include access/brin_xlog.h
 #include access/clog.h
+#include access/committs.h
 #include access/gin.h
 #include access/gist_private.h
 #include access/hash.h
diff --git a/contrib/test_committs/.gitignore b/contrib/test_committs/.gitignore
new file mode 100644
index 000..1f95503
--- /dev/null
+++ b/contrib/test_committs/.gitignore
@@ -0,0 +1,5 @@
+# Generated subdirectories
+/log/
+/isolation_output/
+/regression_output/
+/tmp_check/
diff --git a/contrib/test_committs/Makefile b/contrib/test_committs/Makefile
new file mode 100644
index 000..2240749
--- /dev/null
+++ b/contrib/test_committs/Makefile
@@ -0,0 +1,45 @@
+# Note: because we don't tell the Makefile there are any regression tests,
+# we have to clean those result files explicitly
+EXTRA_CLEAN = $(pg_regress_clean_files) ./regression_output ./isolation_output
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/test_committs
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
+
+# We can't support installcheck because normally installcheck users don't have
+# the required track_commit_timestamp on
+installcheck:;
+
+check: regresscheck
+
+submake-regress:
+	$(MAKE) -C 

Re: [HACKERS] tracking commit timestamps

2014-11-19 Thread Simon Riggs
On 19 November 2014 02:12, Petr Jelinek p...@2ndquadrant.com wrote:

 Maybe we need better explanation of the LSN use-case(s) to understand why it
 should be stored here and why the other solutions are significantly worse.

We should apply the same standard that has been applied elsewhere. If
someone can show some software that could actually make use of LSN and
there isn't a better way, then we can include it.

PostgreSQL isn't a place where we speculate about possible future needs.

I don't see why it should take 2+ years of prototypes, designs and
discussions to get something in from BDR, but then we simply wave a
hand and include something else at the last minute without careful
thought. Even if that means that later additions might need to think
harder about upgrades.

Timestamp and nodeid are useful for a variety of cases; LSN doesn't
meet the same standard and should not be included now.

We still have many months before even beta for people that want LSN to
make a *separate* case for its inclusion as a separate feature.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-19 Thread Petr Jelinek

On 19/11/14 12:20, Simon Riggs wrote:

On 19 November 2014 02:12, Petr Jelinek p...@2ndquadrant.com wrote:


Maybe we need better explanation of the LSN use-case(s) to understand why it
should be stored here and why the other solutions are significantly worse.


We should apply the same standard that has been applied elsewhere. If
someone can show some software that could actually make use of LSN and
there isn't a better way, then we can include it.

...

We still have many months before even beta for people that want LSN to
make a *separate* case for its inclusion as a separate feature.



This is a good point; we are not so late in the cycle that LSN couldn't 
be added later if we find that it is indeed needed (and we don't have to 
care about pg_upgrade until beta).


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-19 Thread Alvaro Herrera
Petr Jelinek wrote:

 This is a good point; we are not so late in the cycle that LSN couldn't be
 added later if we find that it is indeed needed (and we don't have to care
 about pg_upgrade until beta).

I think we're overblowing the pg_upgrade issue.  Surely we don't need to
preserve commit_ts data when upgrading across major versions; and
pg_upgrade is perfectly prepared to remove old data when upgrading
(actually it just doesn't copy it; consider pg_subtrans or pg_serial,
for instance.)  If we need to change binary representation in a future
major release, we can do so without any trouble.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-19 Thread Robert Haas
On Wed, Nov 19, 2014 at 8:22 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Petr Jelinek wrote:
 This is a good point; we are not so late in the cycle that LSN couldn't be
 added later if we find that it is indeed needed (and we don't have to care
 about pg_upgrade until beta).

 I think we're overblowing the pg_upgrade issue.  Surely we don't need to
 preserve commit_ts data when upgrading across major versions; and
 pg_upgrade is perfectly prepared to remove old data when upgrading
 (actually it just doesn't copy it; consider pg_subtrans or pg_serial,
 for instance.)  If we need to change binary representation in a future
 major release, we can do so without any trouble.

Actually, that's a good point.  I still don't understand what the
resistance is to add something quite inexpensive that multiple people
obviously want, but at least if we don't, we still have the option to
change it later.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-19 Thread Steve Singer

On 11/19/2014 08:22 AM, Alvaro Herrera wrote:


I think we're overblowing the pg_upgrade issue.  Surely we don't need to
preserve commit_ts data when upgrading across major versions; and
pg_upgrade is perfectly prepared to remove old data when upgrading
(actually it just doesn't copy it; consider pg_subtrans or pg_serial,
for instance.)  If we need to change binary representation in a future
major release, we can do so without any trouble.



That sounds reasonable. I am okay with Petr removing the LSN portion of 
this patch.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-18 Thread Petr Jelinek

On 15/11/14 13:36, Simon Riggs wrote:

On 15 November 2014 04:32, Steve Singer st...@ssinger.info wrote:


The use cases I'm talking about aren't really replication related. Often I
have come across systems that want to do something such as 'select * from
orders where X > the_last_row_I_saw order by X' and then do further
processing on the order.


Yes, existing facilities provide mechanisms for different types of
application change queues.

If you want to write a processing queue in SQL, that isn't the best
way. You'll need some way to keep track of whether or not its been
successfully processed. That's either a column in the table, or a
column in a queue table maintained by triggers, with the row write
locked on read. You can then have multiple readers from this queue
using the new SKIP LOCKED feature, which was specifically designed to
facilitate that.

Logical decoding was intended for much more than just replication. It
provides commit order access to changed data in a form that is both
usable and efficient for high volume application needs.

I don't see any reason to add LSN into a SLRU updated at commit to
support those application needs.



I am still on the fence about the LSN issue, I don't mind it from code 
perspective, it's already written anyway, but I am not sure if we really 
want it in the SLRU as Simon says.


Mainly because of three things:
One, this patch is not really feature patch, as you can do most of what 
it does via tables already, but more a performance improvement and we 
should try to make it perform as good as possible then, adding more 
things does not really improve performance (according to my benchmarks 
the performance difference with/without LSN is under 1% so it's not 
terrible, but it's there), not to mention additional disk space.


Two, the LSN use-cases seem to still be only theoretical to me, while 
the timestamp use-case has been production problem for at least a decade.


Three, even if we add LSN, I am still not convinced that the use-cases 
presented here wouldn't be better served by putting that info into 
actual table instead of SLRU - as people want to use it as filter in 
WHERE clause, somebody mentioned exporting to different db, etc.


Maybe we need better explanation of the LSN use-case(s) to understand 
why it should be stored here and why the other solutions are 
significantly worse.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-15 Thread Simon Riggs
On 15 November 2014 04:32, Steve Singer st...@ssinger.info wrote:

 The use cases I'm talking about aren't really replication related. Often I
 have come across systems that want to do something such as 'select * from
 orders where X > the_last_row_I_saw order by X' and then do further
 processing on the order.

Yes, existing facilities provide mechanisms for different types of
application change queues.

If you want to write a processing queue in SQL, that isn't the best
way. You'll need some way to keep track of whether or not its been
successfully processed. That's either a column in the table, or a
column in a queue table maintained by triggers, with the row write
locked on read. You can then have multiple readers from this queue
using the new SKIP LOCKED feature, which was specifically designed to
facilitate that.
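
To make that concrete, here is a minimal sketch of such a queue (the
order_queue table and its columns are hypothetical names; FOR UPDATE
SKIP LOCKED is the new feature referred to above):

  -- Queue table maintained by triggers on the base table
  CREATE TABLE order_queue (
      order_id   bigint PRIMARY KEY,
      processed  boolean NOT NULL DEFAULT false
  );

  -- Each worker claims the oldest unprocessed entry, skipping rows that
  -- other workers currently have row-locked, and marks it processed
  UPDATE order_queue SET processed = true
   WHERE order_id = (SELECT order_id
                       FROM order_queue
                      WHERE NOT processed
                      ORDER BY order_id
                      LIMIT 1
                        FOR UPDATE SKIP LOCKED)
  RETURNING order_id;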

Logical decoding was intended for much more than just replication. It
provides commit order access to changed data in a form that is both
usable and efficient for high volume application needs.

I don't see any reason to add LSN into a SLRU updated at commit to
support those application needs.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-14 Thread Robert Haas
On Thu, Nov 13, 2014 at 6:55 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 13 November 2014 21:24, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Nov 13, 2014 at 8:18 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Ordering transactions in LSN order is very precisly the remit of the
 existing logical decoding API. Any user that wishes to see a commits
 in sequence can do so using that API. BDR already does this, as do
 other users of the decoding API. So Slony already has access to a
 useful ordering if it wishes it. We do not need to anything *on this
 patch* to enable LSNs for Slony or anyone else. I don't see any reason
 to provide the same facility twice, in two different ways.

 Perhaps you could respond more specifically to concerns expressed
 upthread, like:

 http://www.postgresql.org/message-id/blu436-smtp28b68b9312cbe5d9125441dc...@phx.gbl

 I don't see that as a dumb argument on the face of it.

 We have a clear must have argument for timestamps to support
 replication conflicts.

 Adding LSNs, when we already have a way to access them, is much more
 of a nice to have. I don't really see it as a particularly nice to
 have either, since the SLRU is in no way ordered.

 Scope creep is a dangerous thing, so we shouldn't, and elsewhere
 don't, collect up ideas like a bag of mixed sweets. It's easy to
 overload things to the point where they don't fly at all. The whole
 point of this is that we're building something faster than trigger
 based systems.

I think that's slamming the door closed and nailing it shut behind
you.  If we add the feature without LSNs, how will someone go back and
add that later?  It would change the on-disk format of the SLRU, so to
avoid breaking pg_upgrade, someone would have to write a conversion
utility.  Even at that, it would slow pg_upgrade down when the feature
has been used.

By way of contrast, adding the feature now is quite easy.  It just
requires storing a few more bytes per transaction.

I am all in favor of incremental development when possible, but not
when it so greatly magnifies the work that needs to be done.  People
have been asking for the ability to determine the commit ordering for
years, and this is a way to do that, very inexpensively, as part of a
patch that is needed anyway.  We are not talking about loading 20 new
requirements on top of this patch; that would be intolerable.  We're
talking about adding one additional piece of information that has been
requested multiple times over the years.

The way I see it, there are at least three uses for this information.
One, trigger-based replication solutions.  While logical decoding will
doubtless be preferable, I don't think trigger-based replication
solutions will go away completely, and certainly not right away.
They've wanted this since forever.  Two, for user applications that
want to know the commit order for their own purposes, as in Steve's
example.  Three, for O(1) snapshots.  Heikki's patch to make that
happen seems to have stalled a bit, but if it's ever to go anywhere it
will need something like this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-14 Thread Simon Riggs
On 14 November 2014 17:12, Robert Haas robertmh...@gmail.com wrote:

 We are not talking about loading 20 new
 requirements on top of this patch; that would be intolerable.  We're
 talking about adding one additional piece of information that has been
 requested multiple times over the years.

The requested information is already available, as discussed. Logical
decoding adds commit ordering for *exactly* the purpose of using it
for replication, available to all solutions. This often requested
feature has now been added and doesn't need to be added twice.

So what we are discussing is adding a completely superfluous piece of
information.

Not including the LSN info does nothing to trigger based replication,
which will no doubt live on happily for many years. But adding LSN
will slow down logical replication, for no purpose at all.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-14 Thread Steve Singer

On 11/14/2014 08:21 PM, Simon Riggs wrote:

The requested information is already available, as discussed. Logical
decoding adds commit ordering for *exactly* the purpose of using it
for replication, available to all solutions. This often requested
feature has now been added and doesn't need to be added twice.

So what we are discussing is adding a completely superfluous piece of
information.

Not including the LSN info does nothing to trigger based replication,
which will no doubt live on happily for many years. But adding LSN
will slow down logical replication, for no purpose at all.



Simon,
The use cases I'm talking about aren't really replication related. Often 
I have come across systems that want to do something such as 'select * 
from orders where X > the_last_row_I_saw order by X' and then do further 
processing on the order.


This is kind of awkward to do today because you don't have a good 
candidate for 'X' to order on.  Using either a sequence or insert-row 
timestamp doesn't work well because a transaction with a lower value for 
X might end up committing after the higher value is already in a query result.


Yes, you could set up a logical WAL slot and listen to the stream of 
inserts into your order table, but that's a lot of administration overhead 
compared to just issuing an SQL query for what really is a query-type 
operation.


Using the commit timestamp for my X sounded very tempting but could 
allow duplicates.
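
For what it's worth, a rough sketch of that pattern as it would look with
this patch (the orders table and the literal bound are hypothetical;
track_commit_timestamp must be on, and rows written before it was enabled,
or whose xmin has since been frozen, won't have a usable timestamp):

  -- 'X' becomes the commit timestamp of each row's inserting transaction.
  -- Caveat noted above: two transactions can share a commit timestamp, so
  -- '>' alone can still skip or repeat rows right at the boundary.
  SELECT o.*
    FROM orders AS o
   WHERE pg_xact_commit_timestamp(o.xmin) > '2014-11-14 12:00:00+00'::timestamptz
   ORDER BY pg_xact_commit_timestamp(o.xmin);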


One could argue that this patch is about replication features, and 
providing commit ordering for query purposes should be a separate patch 
to add that on top of this infrastructure. I see merit to smaller more 
focused patches but that requires leaving the door open to easily 
extending things later.


It could also be that I'm the only one who wants to order and filter 
queries in this manner (but that would surprise me).  If the commit LSN 
has limited appeal and we decide we don't want it at all, then we 
shouldn't add it.  I've seen this type of requirement in a number of 
different systems at a number of different companies.  I've generally 
seen it dealt with either by selecting rows behind the last now() 
timestamp seen and then filtering out already-processed rows, or by 
tracking the 'processed' state of each row individually (i.e. performing 
an update on each row once it's been processed), which performs poorly.


Steve







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-13 Thread Petr Jelinek

On 13/11/14 07:04, Michael Paquier wrote:

On Wed, Nov 12, 2014 at 10:06 PM, Petr Jelinek p...@2ndquadrant.com wrote:

Brief list of changes:
  - the commit timestamp record now stores timestamp, lsn and nodeid

Now that not only the commit timestamp is stored, calling that commit
timestamp, committs or commit_timestamp is strange, no? If this
patch is moving toward being a more complex information provider,
calling it commit information or commit data is more adapted, no?


It's not; since adding more info would break upgrades, I doubt we will 
add more anytime soon. I was thinking about it too, to be honest, but I don't 
have a better name (I don't like commit data as it seems confusing - isn't 
commit data the DML you just committed?).
Maybe commit_metadata; commit_information is probably OK as well. This 
would need input from others; I am personally fine with keeping the 
commit_timestamp name too.





  - if the xid passed to get interface is out of range -infinity timestamp is
returned (I think it's bad idea to throw errors here as the valid range is
not static and same ID can start throwing errors between calls
theoretically)

Already mentioned by Jim in a previous mail: this would be better as NULL.


Yeah, that makes sense, I have no idea what I was thinking :)




  - renamed the sql interfaces to pg_xact_commit_timestamp,
pg_xact_commit_timestamp_data and pg_last_committed_xact, they don't expose
the nodeid atm, I personally am not big fan of the xact but it seems more
consistent with existing naming

pg_xact_commit_timestamp and pg_xact_commit_timestamp_data are
overlapping. What's wrong with a single function able to return the
whole set (node ID, commit timetamp, commit LSN)? Let's say


That's what pg_xact_commit_timestamp_data does (it does not show nodeid 
because we agreed that it should not be exposed yet on sql level). Might 
make sense to rename, but let's wait for input about the general 
renaming point at the beginning of the mail.



pg_xact_commit_information or pg_xact_commit_data. Already mentioned,
but I also find using a SRF able to return all the available
information from a given XID value quite useful. And this does not
conflict with what is proposed currently, you would need just to call
the function with XID + number of entries wanted to get a single one.
Comments from other folks about that?


No idea what you mean by this, to be honest; there is exactly one record 
stored for a single XID.




Then more input about the latest patch:
1) This block is not needed, option -e is listed twice:
 The option-o/, option-x/, option-e/,
-   option-m/, option-O/,
-   and option-l/
+   option-m/, option-O/, option-l/
+   and option-e/
2) Very small thing: a couple of files have no newlines at the end,
among them committs.conf and test_committs/Makefile.
3) pg_last_committed_xact and not pg_last_xact_commit_information or similar?


Just inspiration from DB2's rollforward (which shows, among other things, 
last committed transaction: timestamp), but I don't feel strongly 
about naming, so it can be changed.



4) storage.sgml needs to be updated with the new folder pg_committs


Right.


5) Er.. node ID is missing in pg_last_committed_xact, no?


That's intentional (for now).


6) This XXX notice can be removed:
+   /*
+* Return empty if the requested value is older than what we have or
+* newer than newest we have.
+*
+* XXX: should this be error instead?
+*/


Ok.


(Note that I am still a partisan of an error message to let the
caller know that commit info history does not have the information
requested).


IMHO throwing an error there would be the same as throwing an error when the 
WHERE clause in a SELECT does not match anything. As the xid range for which we 
store data is dynamic, we need to accept any xid as valid input because 
the caller has no way of validating whether the xid passed is out of range or 
not.



7) Note that TransactionTreeSetCommitTsData still never sets do_xlog
at true and that WriteSetTimestampXlogRec never gets called. So no
information is WAL-logged with this patch. Wouldn't that be useful for
standbys as well? Perhaps I am missing why this is disabled? This code
should be activated IMO or it would be just untested.


True is only needed here when you are setting this info for a different 
transaction than the one you are in, since the info can otherwise be reconstructed 
from the normal transaction commit WAL record (note that it's actually called from 
xact_redo_commit_internal, which is how we get the WAL safety and why it 
works on the slave). So passing true is for use by extensions only; it's not 
completely uncommon that we have APIs that are used only by extensions.



8) As a more general point, the node ID stuff makes me uncomfortable
and is just added on top of the existing patch without much
thinking... So I am really skeptical about it. The need here is to
pass on demand a int8 that is a node ID that can only be set through a
C interface, so only extensions could play with 

Re: [HACKERS] tracking commit timestamps

2014-11-13 Thread Simon Riggs
On 9 November 2014 16:57, Steve Singer st...@ssinger.info wrote:
 On 11/07/2014 07:07 PM, Petr Jelinek wrote:


 The list of what is useful might be long, but we can't have everything
 there as there are space constraints, and LSN is another 8 bytes and I still
 want to have some bytes for storing the origin or whatever you want to
 call it there, as that's the one I personally have biggest use-case for.
 So this would be ~24bytes per txid already, hmm I wonder if we can pull
 some tricks to lower that a bit.


 The reason why Jim and myself are asking for the LSN and not just the
 timestamp is that I want to be able to order the transactions. Jim pointed
 out earlier in the thread that just ordering on timestamp allows for
 multiple transactions with the same timestamp.

I think we need to be clear about the role and function of components here.

Xid timestamps allow a replication system to do post-commit conflict
resolution based upon timestamp, i.e. last update wins. That is
potentially usable by BDR, Slony, xdb and anything else that wants
that.

Ordering transactions in LSN order is very precisly the remit of the
existing logical decoding API. Any user that wishes to see a commits
in sequence can do so using that API. BDR already does this, as do
other users of the decoding API. So Slony already has access to a
useful ordering if it wishes it. We do not need to anything *on this
patch* to enable LSNs for Slony or anyone else. I don't see any reason
to provide the same facility twice, in two different ways.

So in summary... the components are
* Commit LSN order is useful for applying changes - available by
logical decoding
* Commit timestamps and nodeid are useful for conflict resolution -
available from this patch

Both components have been designed in ways that allow multiple
replication systems to use these facilities.

So, -1 to including commit LSN in the SLRU alongside commit timestamp
and nodeid.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-13 Thread Petr Jelinek

On 13/11/14 14:18, Simon Riggs wrote:


So in summary... the components are
* Commit LSN order is useful for applying changes - available by
logical decoding
* Commit timestamps and nodeid are useful for conflict resolution -
available from this patch

Both components have been designed in ways that allow multiple
replication systems to use these facilities.

So, -1 to including commit LSN in the SLRU alongside commit timestamp
and nodeid.



I am of the same opinion; I added the LSN by popular demand, but I 
still personally don't see the value in having it there, as it does *not* 
enable us to do something that would be impossible otherwise.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-13 Thread Robert Haas
On Thu, Nov 13, 2014 at 8:18 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Ordering transactions in LSN order is very precisly the remit of the
 existing logical decoding API. Any user that wishes to see a commits
 in sequence can do so using that API. BDR already does this, as do
 other users of the decoding API. So Slony already has access to a
 useful ordering if it wishes it. We do not need to anything *on this
 patch* to enable LSNs for Slony or anyone else. I don't see any reason
 to provide the same facility twice, in two different ways.

Perhaps you could respond more specifically to concerns expressed
upthread, like:

http://www.postgresql.org/message-id/blu436-smtp28b68b9312cbe5d9125441dc...@phx.gbl

I don't see that as a dumb argument on the face of it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-13 Thread Simon Riggs
On 13 November 2014 21:24, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Nov 13, 2014 at 8:18 AM, Simon Riggs si...@2ndquadrant.com wrote:
 Ordering transactions in LSN order is very precisly the remit of the
 existing logical decoding API. Any user that wishes to see a commits
 in sequence can do so using that API. BDR already does this, as do
 other users of the decoding API. So Slony already has access to a
 useful ordering if it wishes it. We do not need to anything *on this
 patch* to enable LSNs for Slony or anyone else. I don't see any reason
 to provide the same facility twice, in two different ways.

 Perhaps you could respond more specifically to concerns expressed
 upthread, like:

 http://www.postgresql.org/message-id/blu436-smtp28b68b9312cbe5d9125441dc...@phx.gbl

 I don't see that as a dumb argument on the face of it.

We have a clear must have argument for timestamps to support
replication conflicts.

Adding LSNs, when we already have a way to access them, is much more
of a nice to have. I don't really see it as a particularly nice to
have either, since the SLRU is in no way ordered.

Scope creep is a dangerous thing, so we shouldn't, and elsewhere
don't, collect up ideas like a bag of mixed sweets. It's easy to
overload things to the point where they don't fly at all. The whole
point of this is that we're building something faster than trigger
based systems.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-12 Thread Petr Jelinek

On 10/11/14 14:53, Robert Haas wrote:

On Mon, Nov 10, 2014 at 8:39 AM, Petr Jelinek p...@2ndquadrant.com wrote:

I did the calculation above wrong btw, it's actually 20 bytes not 24 bytes
per record, I am inclined to just say we can live with that.


If you do it as 20 bytes, you'll have to do some work to squeeze out
the alignment padding.  I'm inclined to think it's fine to have a few
extra padding bytes here; someone might want to use those for
something in the future, and they probably don't cost much.



I did get around the alignment via memcpy, so it is still 20 bytes.


Since we agreed that the (B) case is not really feasible and we are doing
the (C), I also wonder if extradata should be renamed to nodeid (even if
it's not used at this point as nodeid). And then there is question about the
size of it, since the nodeid itself can live with 2 bytes probably (64k of
nodes ought to be enough for everybody ;) ).
Or leave the extradata as is but use as reserved space for future use and
not expose it at this time on SQL level at all?


I vote for calling it node-ID, and for allowing at least 4 bytes for
it.  Penny wise, pound foolish.



Ok, I went this way.

Anyway, here is the v8 version of the patch; I think I addressed all the 
concerns mentioned. It's also rebased against current master (the BRIN 
commit added some conflicts).


Brief list of changes:
 - the commit timestamp record now stores timestamp, lsn and nodeid
 - added code to disallow turning track_commit_timestamp on with too 
small pagesize

 - the get interfaces error out when track_commit_timestamp is off
 - if the xid passed to get interface is out of range -infinity 
timestamp is returned (I think it's bad idea to throw errors here as the 
valid range is not static and same ID can start throwing errors between 
calls theoretically)
 - renamed the sql interfaces to pg_xact_commit_timestamp, 
pg_xact_commit_timestamp_data and pg_last_committed_xact, they don't 
expose the nodeid atm, I personally am not big fan of the xact but it 
seems more consistent with existing naming
 - documented pg_resetxlog changes and make all the pg_resetxlog 
options alphabetically ordered
 - committs is not used anymore, it's commit_ts (and CommitTs in 
camelcase), I think it's not really good idea to spell the timestamp 
everywhere as some interface then get 30+ chars long names...

 - added WAL logging of the track_commit_timestamp GUC
 - added alternative expected output of the regression test so that it 
works with make installcheck when track_commit_timestamp is on

 - added C interface to set default nodeid for current backend
 - several minor comment and naming adjustments mostly suggested by Michael


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/contrib/Makefile b/contrib/Makefile
index b37d0dd..e331297 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -50,6 +50,7 @@ SUBDIRS = \
 		spi		\
 		tablefunc	\
 		tcn		\
+		test_committs	\
 		test_decoding	\
 		test_parser	\
 		test_shm_mq	\
diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c
index 3b8241b..f0a023f 100644
--- a/contrib/pg_upgrade/pg_upgrade.c
+++ b/contrib/pg_upgrade/pg_upgrade.c
@@ -423,8 +423,10 @@ copy_clog_xlog_xid(void)
 	/* set the next transaction id and epoch of the new cluster */
 	prep_status(Setting next transaction ID and epoch for new cluster);
 	exec_prog(UTILITY_LOG_FILE, NULL, true,
-			  \%s/pg_resetxlog\ -f -x %u \%s\,
-			  new_cluster.bindir, old_cluster.controldata.chkpnt_nxtxid,
+			  \%s/pg_resetxlog\ -f -x %u -c %u \%s\,
+			  new_cluster.bindir,
+			  old_cluster.controldata.chkpnt_nxtxid,
+			  old_cluster.controldata.chkpnt_nxtxid,
 			  new_cluster.pgdata);
 	exec_prog(UTILITY_LOG_FILE, NULL, true,
 			  \%s/pg_resetxlog\ -f -e %u \%s\,
diff --git a/contrib/pg_xlogdump/rmgrdesc.c b/contrib/pg_xlogdump/rmgrdesc.c
index 9397198..e0af3cf 100644
--- a/contrib/pg_xlogdump/rmgrdesc.c
+++ b/contrib/pg_xlogdump/rmgrdesc.c
@@ -10,6 +10,7 @@
 
 #include access/brin_xlog.h
 #include access/clog.h
+#include access/committs.h
 #include access/gin.h
 #include access/gist_private.h
 #include access/hash.h
diff --git a/contrib/test_committs/.gitignore b/contrib/test_committs/.gitignore
new file mode 100644
index 000..1f95503
--- /dev/null
+++ b/contrib/test_committs/.gitignore
@@ -0,0 +1,5 @@
+# Generated subdirectories
+/log/
+/isolation_output/
+/regression_output/
+/tmp_check/
diff --git a/contrib/test_committs/Makefile b/contrib/test_committs/Makefile
new file mode 100644
index 000..2240749
--- /dev/null
+++ b/contrib/test_committs/Makefile
@@ -0,0 +1,45 @@
+# Note: because we don't tell the Makefile there are any regression tests,
+# we have to clean those result files explicitly
+EXTRA_CLEAN = $(pg_regress_clean_files) ./regression_output ./isolation_output
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)

Re: [HACKERS] tracking commit timestamps

2014-11-12 Thread Jim Nasby

On 11/12/14, 7:06 AM, Petr Jelinek wrote:

  - if the xid passed to get interface is out of range -infinity timestamp is 
returned (I think it's bad idea to throw errors here as the valid range is not 
static and same ID can start throwing errors between calls theoretically)


Wouldn't NULL be more appropriate?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-12 Thread Michael Paquier
On Thu, Nov 13, 2014 at 7:56 AM, Jim Nasby jim.na...@bluetreble.com wrote:
 On 11/12/14, 7:06 AM, Petr Jelinek wrote:

   - if the xid passed to get interface is out of range -infinity timestamp
 is returned (I think it's bad idea to throw errors here as the valid range
 is not static and same ID can start throwing errors between calls
 theoretically)


 Wouldn't NULL be more appropriate?
Definitely. Defining a given value for information not valid is awkward.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-12 Thread Michael Paquier
On Wed, Nov 12, 2014 at 10:06 PM, Petr Jelinek p...@2ndquadrant.com wrote:
 Brief list of changes:
  - the commit timestamp record now stores timestamp, lsn and nodeid
Now that not only the commit timestamp is stored, calling that commit
timestamp, committs or commit_timestamp is strange, no? If this
patch is moving toward being a more complex information provider,
calling it commit information or commit data is more adapted, no?
Documentation would need a fresh brush as well in this case.

  - added code to disallow turning track_commit_timestamp on with too small
 pagesize
  - the get interfaces error out when track_commit_timestamp is off
OK, that's sane.

  - if the xid passed to get interface is out of range -infinity timestamp is
 returned (I think it's bad idea to throw errors here as the valid range is
 not static and same ID can start throwing errors between calls
 theoretically)
Already mentioned by Jim in a previous mail: this would be better as NULL.

  - renamed the sql interfaces to pg_xact_commit_timestamp,
 pg_xact_commit_timestamp_data and pg_last_committed_xact, they don't expose
 the nodeid atm, I personally am not big fan of the xact but it seems more
 consistent with existing naming
pg_xact_commit_timestamp and pg_xact_commit_timestamp_data are
overlapping. What's wrong with a single function able to return the
whole set (node ID, commit timetamp, commit LSN)? Let's say
pg_xact_commit_information or pg_xact_commit_data. Already mentioned,
but I also find using a SRF able to return all the available
information from a given XID value quite useful. And this does not
conflict with what is proposed currently, you would need just to call
the function with XID + number of entries wanted to get a single one.
Comments from other folks about that?

  - documented pg_resetxlog changes and make all the pg_resetxlog options
 alphabetically ordered
  - added WAL logging of the track_commit_timestamp GUC
  - added alternative expected output of the regression test so that it works
 with make installcheck when track_commit_timestamp is on
  - added C interface to set default nodeid for current backend
  - several minor comment and naming adjustments mostly suggested by Michael
Thanks for those adjustments.

Then more input about the latest patch:
1) This block is not needed, option -e is listed twice:
The option-o/, option-x/, option-e/,
-   option-m/, option-O/,
-   and option-l/
+   option-m/, option-O/, option-l/
+   and option-e/
2) Very small thing: a couple of files have no newlines at the end,
among them committs.conf and test_committs/Makefile.
3) pg_last_committed_xact and not pg_last_xact_commit_information or similar?
4) storage.sgml needs to be updated with the new folder pg_committs
5) Er.. node ID is missing in pg_last_committed_xact, no?
6) This XXX notice can be removed:
+   /*
+* Return empty if the requested value is older than what we have or
+* newer than newest we have.
+*
+* XXX: should this be error instead?
+*/
We are moving toward returning invalid information in the SQL
interface when the information is not in history instead of an error,
no? (Note that I am still a partisan of an error message to let the
caller know that commit info history does not have the information
requested).
7) Note that TransactionTreeSetCommitTsData still never sets do_xlog
at true and that WriteSetTimestampXlogRec never gets called. So no
information is WAL-logged with this patch. Wouldn't that be useful for
standbys as well? Perhaps I am missing why this is disabled? This code
should be activated IMO or it would be just untested.
8) As a more general point, the node ID stuff makes me uncomfortable
and is just added on top of the existing patch without much
thinking... So I am really skeptical about it. The need here is to
pass on demand an int8 that is a node ID that can only be set through a
C interface, so only extensions could play with it. The data passed to
a WAL record is always built and determined by the system and entirely
transparent to the user; inserting user-defined data like that is
inconsistent with what we've been doing until now, no?

Also, a question particularly for BDR and Slony folks: do you
sometimes add a new node using the base backup of an existing node :)
See what I come up with: a duplication of this new node ID system with
the already present system ID, no?
Similarly, the LSN is added to the WAL record containing the commit
timestamp, but cannot the LSN of the WAL record containing the commit
timestamp itself be used as a point of reference for a better
ordering? That's not exactly the same as the LSN of the transaction
commit, still it provides a WAL-based reference.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-11 Thread Simon Riggs
On 4 November 2014 08:23, Andres Freund and...@2ndquadrant.com wrote:

 6) Shouldn't any value update of track_commit_timestamp be tracked in
 XLogReportParameters? That's thinking about making the commit timestamp
 available on standbys as well..

 Yes, it should.

Agree committs should be able to run on standby, but it seems possible
to do that without it running on the master. The two should be
unconnected.

Not sure why we'd want to have parameter changes on master reported?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-11 Thread Andres Freund
On 2014-11-11 16:10:47 +, Simon Riggs wrote:
 On 4 November 2014 08:23, Andres Freund and...@2ndquadrant.com wrote:
 
  6) Shouldn't any value update of track_commit_timestamp be tracked in
  XLogReportParameters? That's thinking about making the commit timestamp
  available on standbys as well..
 
  Yes, it should.
 
 Agree committs should be able to run on standby, but it seems possible
 to do that without it running on the master.

I don't think that's realistic. It requires WAL to be written in some
cases, so that's not going to work. I also don't think it's a
particularly interesting ability?

 The two should be unconnected.

Why?

 Not sure why we'd want to have parameter changes on master reported?

So it works correctly. We're currently truncating the slru on startup
when the guc is disabled, which would cause problems with WAL records coming
in from the primary. I think the code also needs some TLC to correctly
work after a failover.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-11 Thread Simon Riggs
On 9 November 2014 16:57, Steve Singer st...@ssinger.info wrote:
 On 11/07/2014 07:07 PM, Petr Jelinek wrote:


 The list of what is useful might be long, but we can't have everything
 there as there are space constraints, and LSN is another 8 bytes and I still
 want to have some bytes for storing the origin or whatever you want to
 call it there, as that's the one I personally have biggest use-case for.
 So this would be ~24bytes per txid already, hmm I wonder if we can pull
 some tricks to lower that a bit.


 The reason why Jim and myself are asking for the LSN and not just the
 timestamp is that I want to be able to order the transactions. Jim pointed
 out earlier in the thread that just ordering on timestamp allows for
 multiple transactions with the same timestamp.

 Maybe we don't need the entire LSN to solve that.  If you already have the
 commit timestamp maybe you only need another byte or two of granularity to
 order transactions that are within the same microsecond.

It looks like there are quite a few potential uses for this.

If we include everything it will be too fat to use for any of the
potential uses, since each will be pulled down by the others.

Sounds like it needs to be configurable in some way.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-11 Thread Simon Riggs
On 11 November 2014 16:19, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-11-11 16:10:47 +, Simon Riggs wrote:
 On 4 November 2014 08:23, Andres Freund and...@2ndquadrant.com wrote:

  6) Shouldn't any value update of track_commit_timestamp be tracked in
  XLogReportParameters? That's thinking about making the commit timestamp
  available on standbys as well..
 
  Yes, it should.

 Agree committs should be able to run on standby, but it seems possible
 to do that without it running on the master.

 I don't think that's realistic. It requires WAL to be written in some
 cases, so that's not going to work. I also don't think it's a
 particularly interesting ability?

OK, so we are saying commit timestamp will NOT be available on Standbys.

I'm fine with that, since data changes aren't generated there.


 So it works correctly. We're currently truncating the slru on startup
 when the guc is disabled which would cause problems WAL records coming
 in from the primary. I think the code also needs some TLC to correctly
 work after a failover.

OK

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-11 Thread Andres Freund
On 2014-11-11 17:09:54 +, Simon Riggs wrote:
 On 11 November 2014 16:19, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-11-11 16:10:47 +, Simon Riggs wrote:
  On 4 November 2014 08:23, Andres Freund and...@2ndquadrant.com wrote:
 
   6) Shouldn't any value update of track_commit_timestamp be tracked in
   XLogReportParameters? That's thinking about making the commit timestamp
   available on standbys as well..
  
   Yes, it should.
 
  Agree committs should be able to run on standby, but it seems possible
  to do that without it running on the master.
 
  I don't think that's realistic. It requires WAL to be written in some
  cases, so that's not going to work. I also don't think it's a
  particularly interesting ability?
 
 OK, so we are saying commit timestamp will NOT be available on Standbys.

Hm? They should be available - xact.c WAL replay will redo the setting
of the timestamps and explicitly overwritten timestamps will generate
their own WAL records. What I mean is just that you can't use commit
timestamps without also using it on the primary.
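
In practice that amounts to something like the following, using the GUC
name from the committed patch (a restart-only setting):

  -- enable on the primary; how the standby picks this up is exactly the
  -- point under discussion here:
  ALTER SYSTEM SET track_commit_timestamp = on;
  -- after a restart and some write transactions:
  SELECT pg_xact_commit_timestamp(xmin) FROM some_table LIMIT 1;  -- placeholder table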

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-11 Thread Jim Nasby

On 11/10/14, 7:40 AM, Alvaro Herrera wrote:

Robert Haas wrote:

On Sun, Nov 9, 2014 at 8:41 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

Robert Haas wrote:

I think the key question here is the time for which the data needs to
be retained.  2^32 of anything is a lot, but why keep around that
number of records rather than more (after all, we have epochs to
distinguish one use of a given txid from another) or fewer?


The problem is not how much data we retain; is about how much data we
can address.


I thought I was responding to a concern about disk space utilization.


Ah, right.  So AFAIK we don't need to keep anything older than
RecentXmin or something like that -- which is not too old.  If I recall
correctly Josh Berkus was saying in a thread about pg_multixact that it
used about 128kB or so in = 9.2 for his customers; that one was also
limited to RecentXmin AFAIR.  I think a similar volume of commit_ts data
would be pretty acceptable.  Moreso considering that it's turned off by
default.


FWIW, AFAICS MultiXacts are only truncated after a (auto)vacuum process is able 
to advance datminmxid, which will (now) only happen when an entire relation has 
been scanned (which should be infrequent).

I believe the low normal space usage is just an indication that most databases 
don't use many MultiXacts.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] tracking commit timestamps

2014-11-11 Thread Alvaro Herrera
Jim Nasby wrote:
 On 11/10/14, 7:40 AM, Alvaro Herrera wrote:

 Ah, right.  So AFAIK we don't need to keep anything older than
 RecentXmin or something like that -- which is not too old.  If I recall
 correctly Josh Berkus was saying in a thread about pg_multixact that it
 used about 128kB or so in = 9.2 for his customers; that one was also
 limited to RecentXmin AFAIR.  I think a similar volume of commit_ts data
 would be pretty acceptable.  Moreso considering that it's turned off by
 default.
 
 FWIW, AFAICS MultiXacts are only truncated after a (auto)vacuum process is 
 able to advance datminmxid, which will (now) only happen when an entire 
 relation has been scanned (which should be infrequent).
 
 I believe the low normal space usage is just an indication that most 
 databases don't use many MultiXacts.

That's in 9.3.  Prior to that, they were truncated much more often.
Maybe you've not heard enough about this commit:

commit 0ac5ad5134f2769ccbaefec73844f8504c4d6182
Author: Alvaro Herrera alvhe...@alvh.no-ip.org
Date:   Wed Jan 23 12:04:59 2013 -0300

Improve concurrency of foreign key locking

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-11 Thread Jim Nasby

On 11/11/14, 2:03 PM, Alvaro Herrera wrote:

Jim Nasby wrote:

On 11/10/14, 7:40 AM, Alvaro Herrera wrote:



Ah, right.  So AFAIK we don't need to keep anything older than
RecentXmin or something like that -- which is not too old.  If I recall
correctly Josh Berkus was saying in a thread about pg_multixact that it
used about 128kB or so in = 9.2 for his customers; that one was also
limited to RecentXmin AFAIR.  I think a similar volume of commit_ts data
would be pretty acceptable.  Moreso considering that it's turned off by
default.


FWIW, AFAICS MultiXacts are only truncated after a (auto)vacuum process is able 
to advance datminmxid, which will (now) only happen when an entire relation has 
been scanned (which should be infrequent).

I believe the low normal space usage is just an indication that most databases 
don't use many MultiXacts.


That's in 9.3.  Prior to that, they were truncated much more often.


Well, we're talking about a new feature, so I wasn't looking in back branches. 
;P


Maybe you've not heard enough about this commit:

commit 0ac5ad5134f2769ccbaefec73844f8504c4d6182


Interestingly, git.postgresql.org hasn't either: 
http://git.postgresql.org/gitweb/?p=postgresql.git&a=search&h=HEAD&st=commit&s=0ac5ad5134f2769ccbaefec73844f8504c4d6182

The commit is certainly there though...
decibel@decina:[15:12]~/pgsql/HEAD/src/backend (master=)$git log 
0ac5ad5134f2769ccbaefec73844f8504c4d6182|head -n1
commit 0ac5ad5134f2769ccbaefec73844f8504c4d6182
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] tracking commit timestamps

2014-11-11 Thread Alvaro Herrera
Jim Nasby wrote:
 On 11/11/14, 2:03 PM, Alvaro Herrera wrote:
 Jim Nasby wrote:
 On 11/10/14, 7:40 AM, Alvaro Herrera wrote:
 
 Ah, right.  So AFAIK we don't need to keep anything older than
 RecentXmin or something like that -- which is not too old.  If I recall
 correctly Josh Berkus was saying in a thread about pg_multixact that it
 used about 128kB or so in = 9.2 for his customers; that one was also
 limited to RecentXmin AFAIR.  I think a similar volume of commit_ts data
 would be pretty acceptable.  Moreso considering that it's turned off by
 default.
 
 FWIW, AFAICS MultiXacts are only truncated after a (auto)vacuum process is 
 able to advance datminmxid, which will (now) only happen when an entire 
 relation has been scanned (which should be infrequent).
 
 I believe the low normal space usage is just an indication that most 
 databases don't use many MultiXacts.
 
 That's in 9.3.  Prior to that, they were truncated much more often.
 
 Well, we're talking about a new feature, so I wasn't looking in back 
 branches. ;P

Well, I did mention = 9.2 in the text above ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-10 Thread Robert Haas
On Sun, Nov 9, 2014 at 8:41 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Robert Haas wrote:
 I think the key question here is the time for which the data needs to
 be retained.  2^32 of anything is a lot, but why keep around that
 number of records rather than more (after all, we have epochs to
 distinguish one use of a given txid from another) or fewer?

 The problem is not how much data we retain; is about how much data we
 can address.

I thought I was responding to a concern about disk space utilization.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] tracking commit timestamps

2014-11-10 Thread Petr Jelinek

On 10/11/14 08:01, Anssi Kääriäinen wrote:

On Sun, 2014-11-09 at 11:57 -0500, Steve Singer wrote:


The reason why Jim and myself are asking for the LSN and not just the
timestamp is that I want to be able to order the transactions. Jim
pointed out earlier in the thread that just ordering on timestamp allows
for multiple transactions with the same timestamp.

Maybe we don't need the entire LSN to solve that.  If you already have
the commit timestamp maybe you only need another byte or two of
granularity to order transactions that are within the same microsecond.


There is no guarantee that a commit with later LSN has a later
timestamp. There are cases where the clock could move significantly
backwards.

A robust solution to storing transaction commit information (including
commit order) in a way that can be referenced from other tables, can be
loaded to another cluster, and survives crashes would be a great
feature. But this feature doesn't have those properties.



It has the property of surviving crashes.

Not sure what you mean by referencing from other tables?

And about loading to another cluster, the txid does not really have any 
meaning on another cluster, so the info about it does not have any meaning 
either?


But anyway this patch is targeting extensions, not DBAs - you could write 
an extension that provides that feature on top of this patch (although 
given what I wrote above I don't see how it's useful).


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-10 Thread Robert Haas
On Mon, Nov 10, 2014 at 2:01 AM, Anssi Kääriäinen
anssi.kaariai...@thl.fi wrote:
 There is no guarantee that a commit with later LSN has a later
 timestamp. There are cases where the clock could move significantly
 backwards.

Good point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] tracking commit timestamps

2014-11-10 Thread Petr Jelinek

On 09/11/14 17:57, Steve Singer wrote:

On 11/07/2014 07:07 PM, Petr Jelinek wrote:

The list of what is useful might be long, but we can't have everything
there as there are space constraints, and LSN is another 8 bytes and I
still want to have some bytes for storing the origin or whatever you
want to call it there, as that's the one I personally have biggest
use-case for.
So this would be ~24bytes per txid already, hmm I wonder if we can
pull some tricks to lower that a bit.



The reason why Jim and myself are asking for the LSN and not just the
timestamp is that I want to be able to order the transactions. Jim
pointed out earlier in the thread that just ordering on timestamp allows
for multiple transactions with the same timestamp.

Maybe we don't need the entire LSN to solve that.  If you already have
the commit timestamp maybe you only need another byte or two of
granularity to order transactions that are within the same microsecond.



Hmm, maybe just one part of the LSN, but I don't really like that either; 
if we want to store the LSN we should probably store it as is, as somebody 
might want to map it to a txid for other reasons.


I did the calculation above wrong btw, it's actually 20 bytes not 24 
bytes per record, I am inclined to just say we can live with that.


Since we agreed that the (B) case is not really feasible and we are 
doing (C), I also wonder if extradata should be renamed to nodeid 
(even if it's not used as nodeid at this point). And then there is a 
question about its size, since the nodeid itself can probably live with 
2 bytes (64k of nodes ought to be enough for everybody ;) ).
Or leave the extradata as is, but use it as reserved space for future 
use and not expose it at the SQL level at all at this time?


I guess Andres could answer what suits him better here.

--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-10 Thread Alvaro Herrera
Robert Haas wrote:
 On Sun, Nov 9, 2014 at 8:41 PM, Alvaro Herrera alvhe...@2ndquadrant.com 
 wrote:
  Robert Haas wrote:
  I think the key question here is the time for which the data needs to
  be retained.  2^32 of anything is a lot, but why keep around that
  number of records rather than more (after all, we have epochs to
  distinguish one use of a given txid from another) or fewer?
 
  The problem is not how much data we retain; is about how much data we
  can address.
 
 I thought I was responding to a concern about disk space utilization.

Ah, right.  So AFAIK we don't need to keep anything older than
RecentXmin or something like that -- which is not too old.  If I recall
correctly Josh Berkus was saying in a thread about pg_multixact that it
used about 128kB or so in <= 9.2 for his customers; that one was also
limited to RecentXmin AFAIR.  I think a similar volume of commit_ts data
would be pretty acceptable.  Moreso considering that it's turned off by
default.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-10 Thread Robert Haas
On Mon, Nov 10, 2014 at 8:39 AM, Petr Jelinek p...@2ndquadrant.com wrote:
 I did the calculation above wrong btw, it's actually 20 bytes not 24 bytes
 per record, I am inclined to just say we can live with that.

If you do it as 20 bytes, you'll have to do some work to squeeze out
the alignment padding.  I'm inclined to think it's fine to have a few
extra padding bytes here; someone might want to use those for
something in the future, and they probably don't cost much.

 Since we agreed that the (B) case is not really feasible and we are doing
 the (C), I also wonder if extradata should be renamed to nodeid (even if
 it's not used at this point as nodeid). And then there is question about the
 size of it, since the nodeid itself can live with 2 bytes probably (64k of
 nodes ought to be enough for everybody ;) ).
 Or leave the extradata as is but use as reserved space for future use and
 not expose it at this time on SQL level at all?

I vote for calling it node-ID, and for allowing at least 4 bytes for
it.  Penny wise, pound foolish.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] tracking commit timestamps

2014-11-10 Thread Robert Haas
On Mon, Nov 10, 2014 at 8:40 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Ah, right.  So AFAIK we don't need to keep anything older than
 RecentXmin or something like that -- which is not too old.  If I recall
 correctly Josh Berkus was saying in a thread about pg_multixact that it
 used about 128kB or so in = 9.2 for his customers; that one was also
 limited to RecentXmin AFAIR.  I think a similar volume of commit_ts data
 would be pretty acceptable.  Moreso considering that it's turned off by
 default.

I'm not sure whether keeping it just back to RecentXmin will be enough
for everybody's needs.  But we certainly don't need to keep the last
2^32 records as someone-or-other was suggesting.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] tracking commit timestamps

2014-11-10 Thread Steve Singer

On 11/10/2014 08:39 AM, Petr Jelinek wrote:

On 09/11/14 17:57, Steve Singer wrote:

On 11/07/2014 07:07 PM, Petr Jelinek wrote:

The list of what is useful might be long, but we can't have everything
there as there are space constraints, and LSN is another 8 bytes and I
still want to have some bytes for storing the origin or whatever you
want to call it there, as that's the one I personally have biggest
use-case for.
So this would be ~24bytes per txid already, hmm I wonder if we can
pull some tricks to lower that a bit.



The reason why Jim and myself are asking for the LSN and not just the
timestamp is that I want to be able to order the transactions. Jim
pointed out earlier in the thread that just ordering on timestamp allows
for multiple transactions with the same timestamp.

Maybe we don't need the entire LSN to solve that.  If you already have
the commit timestamp maybe you only need another byte or two of
granularity to order transactions that are within the same microsecond.



Hmm maybe just one part of LSN, but I don't really like that either, 
if we want to store LSN we should probably store it as is as somebody 
might want to map it to txid for other reasons.


I did the calculation above wrong btw, it's actually 20 bytes not 24 
bytes per record, I am inclined to just say we can live with that.


Since we agreed that the (B) case is not really feasible and we are 
doing the (C), I also wonder if extradata should be renamed to nodeid 
(even if it's not used at this point as nodeid). And then there is 
question about the size of it, since the nodeid itself can live with 2 
bytes probably (64k of nodes ought to be enough for everybody ;) ).
Or leave the extradata as is but use as reserved space for future use 
and not expose it at this time on SQL level at all?


I guess Andres could answer what suits him better here.



I am happy with renaming extradata to nodeid and not exposing it at this 
time.


If we feel that commit order (i.e. LSN or something equivalent) is really 
a different patch/feature than commit-timestamp, then I am okay with that 
also, but we should make sure to warn users of the commit timestamp in 
the documentation that two transactions might have the same timestamp 
and that the commit order might not be the same as ordering by the 
commit timestamp.








Re: [HACKERS] tracking commit timestamps

2014-11-09 Thread Steve Singer

On 11/07/2014 07:07 PM, Petr Jelinek wrote:


The list of what is useful might be long, but we can't have everything 
there as there are space constraints, and LSN is another 8 bytes and I 
still want to have some bytes for storing the origin or whatever you 
want to call it there, as that's the one I personally have biggest 
use-case for.
So this would be ~24bytes per txid already, hmm I wonder if we can 
pull some tricks to lower that a bit.




The reason why Jim and myself are asking for the LSN and not just the 
timestamp is that I want to be able to order the transactions. Jim 
pointed out earlier in the thread that just ordering on timestamp allows 
for multiple transactions with the same timestamp.


Maybe we don't need the entire LSN to solve that.  If you already have 
the commit timestamp maybe you only need another byte or two of 
granularity to order transactions that are within the same microsecond.






Re: [HACKERS] tracking commit timestamps

2014-11-09 Thread Alvaro Herrera
Robert Haas wrote:

 I think the key question here is the time for which the data needs to
 be retained.  2^32 of anything is a lot, but why keep around that
 number of records rather than more (after all, we have epochs to
 distinguish one use of a given txid from another) or fewer?

The problem is not how much data we retain; it's about how much data we
can address.  We must be able to address the data for a transaction with
xid=2^32-1, even if you only retain the 1000 most recent transactions.
In fact, we already only retain data back to RecentXmin, if I recall
correctly.  All slru.c users work that way.

Back when pg_multixact/members had the 5-char issue, I came up with a
patch that had each slru.c user declare the maximum number of chars in
its filenames.  I didn't push further with that because there was an
issue with it; I don't remember what it was offhand (but I don't think I
posted it).  But this is only needed so that the filenames are all of
equal width, which is mostly cosmetic; the rest of the module works fine
with 4- or 5-char filenames, and can be trivially expanded to support 6
or more.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-09 Thread Anssi Kääriäinen
On Sun, 2014-11-09 at 11:57 -0500, Steve Singer wrote:

 The reason why Jim and myself are asking for the LSN and not just the 
 timestamp is that I want to be able to order the transactions. Jim 
 pointed out earlier in the thread that just ordering on timestamp allows 
 for multiple transactions with the same timestamp.
 
 Maybe we don't need the entire LSN to solve that.  If you already have 
 the commit timestamp maybe you only need another byte or two of 
 granularity to order transactions that are within the same microsecond.

There is no guarantee that a commit with later LSN has a later
timestamp. There are cases where the clock could move significantly
backwards.

A robust solution to storing transaction commit information (including
commit order) in a way that can be referenced from other tables, can be
loaded to another cluster, and survives crashes would be a great
feature. But this feature doesn't have those properties.

 - Anssi






Re: [HACKERS] tracking commit timestamps

2014-11-08 Thread Petr Jelinek

On 08/11/14 03:05, Robert Haas wrote:

On Fri, Nov 7, 2014 at 7:07 PM, Petr Jelinek p...@2ndquadrant.com wrote:

but we can't have everything there
as there are space constraints, and LSN is another 8 bytes and I still want
to have some bytes for storing the origin or whatever you want to call it
there, as that's the one I personally have biggest use-case for.
So this would be ~24bytes per txid already, hmm I wonder if we can pull some
tricks to lower that a bit.


It won't do to say let's do the things that I want, and foreclose
forever the things that other people want.  I find it quite hard to
believe that 16 bytes per transaction is a perfectly tolerable
overhead but 24 bytes per transaction will break the bank.  But if
that is really true then we ought to reject this patch altogether,
because it's unacceptable, in any arena, for a patch that only
benefits extensions to consume all of the available bit-space in,
leaving none for future core needs.



That's not what I said. I am actually ok with adding the LSN if people 
find it useful.
I was just wondering if we can make the record smaller somehow - 24 bytes 
per txid is around 96GB of data for the whole txid range and won't work 
with pages smaller than ~4kB unless we add 6-char support to SLRU (which 
is not hard, and we could also not allow track_commit_timestamp to be 
turned on with a smaller page size...).
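
For reference, the arithmetic behind those figures, using SQL as a 
calculator (the 20-byte figure matches the corrected per-record estimate 
mentioned elsewhere in the thread):

  SELECT pg_size_pretty((2^32 * 24)::bigint) AS at_24_bytes_per_xid,
         pg_size_pretty((2^32 * 20)::bigint) AS at_20_bytes_per_xid;
  -- roughly 96 GB and 80 GB for the full 2^32 xid range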


I remember somebody was worried about this already during the original 
patch submission and it can't be completely ignored in the discussion 
about adding more stuff into the record.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-08 Thread Robert Haas
On Sat, Nov 8, 2014 at 5:35 AM, Petr Jelinek p...@2ndquadrant.com wrote:
 That's not what I said. I am actually ok with adding the LSN if people see
 it useful.
 I was just wondering if we can make the record smaller somehow - 24bytes per
 txid is around 96GB of data for whole txid range and won't work with pages
 smaller than ~4kBs unless we add 6 char support to SLRU (which is not hard
 and we could also not allow track_commit_timestamps to be turned on with
 smaller pagesize...).

 I remember somebody was worried about this already during the original patch
 submission and it can't be completely ignored in the discussion about adding
 more stuff into the record.

Fair point.  Sorry I misunderstood.

I think the key question here is the time for which the data needs to
be retained.  2^32 of anything is a lot, but why keep around that
number of records rather than more (after all, we have epochs to
distinguish one use of a given txid from another) or fewer?  Obvious
alternatives include:

- Keep the data for some period of time; discard the data when it's
older than some threshold.
- Keep a certain amount of total data; every time we create a new
file, discard the oldest one.
- Let consumers of the data say how much they need, and throw away
data when it's no longer needed by the oldest consumer.
- Some combination of the above.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] tracking commit timestamps

2014-11-07 Thread Petr Jelinek

On 06/11/14 08:50, Andres Freund wrote:

On 2014-11-05 19:31:52 -0500, Steve Singer wrote:

It also doesn't allow you to load two extensions at once on a system.
You wouldn't be able to have both the 'steve_commit_order' extension
and BDR installed at the same time.  I don't think this patch does a
very good job at (B) but It wasn't intended to.


Well, I don't agree that steve_commit_order makes much sense in this
context. But there actually is a real problem here, namely that there's
no namespacing in those bytes. I'd be ok with saying that we split the
extradata in for bytes for the namespace and four for the actual
data. That's roughly how it works for advisory locks already.



I am not sure how this would solve the problem of 2 extensions using the 
extradata, given that there can be only 1 record per txid anyway.
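
For context, the two-int form of advisory locks that the quoted analogy 
refers to looks like this, with the first key acting as a per-extension 
namespace (the key values here are made up):

  SELECT pg_advisory_lock(12345, 1);     -- 12345 = the "namespace" key
  SELECT pg_advisory_unlock(12345, 1);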


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-07 Thread Petr Jelinek

On 06/11/14 01:31, Steve Singer wrote:

On 11/05/2014 05:43 PM, Andres Freund wrote:

Is this patch supposed to:

A)  Add commit timestamp tracking but nothing more

B) Add infrastructure to store commit timestamps and provide a facility
for storing additional bits of data extensions might want to be
associated with the commit

C).  Add commit timestamps and node identifiers to commits

If the answer is (A) then I can see why some people are objecting to
adding extradata.If the answer is (B) then it's fair to ask how well
does this patch handle various types of things people might want to
attach to the commit record (such as the LSN).   I think the problem is
that once you start providing a facility extensions can use to store
data along with the commit record then being restricted to 4 or 8 bytes
is very limiting.  It also doesn't allow you to load two extensions at
once on a system.   You wouldn't be able to have both the
'steve_commit_order' extension and BDR installed at the same time.  I
don't think this patch does a very good job at (B) but It wasn't
intended to.



I would love to have (B) but I don't think that's realistic, at least 
not to the extent some people on this thread would like. I mean you can 
already do (B) by using a table, it just isn't that great when it comes 
to the performance of that solution.


This patch is aimed at doing a limited version of (B) where you don't 
have a dynamic record for storing whatever you might desire, but on the 
upside the performance is good. And yes, so far this looks more like we 
are actually doing (C), since the main purpose of the patch is enabling 
conflict detection and resolution of those conflicts, which is useful in 
many replication scenarios that are not limited to the classical 
multi-master setup.



If what we are really doing is C, and just calling the space 'extradata'
until we get the logical identifier stuff in and then we are going to
rename extradata  to nodeid then we should say so.  If we are really
concerned about the pg_upgrade impact of expanding the record later then
maybe we should add 4 bytes of padding to the CommitTimeStampEntry now
and but leave the manipulating the node id until later.



This might not be a bad idea. I don't see the extradata being useful for 
multiple extensions at the same time given that there is a single record 
per txid, unless we enforced some kind of limitation that an extension 
can only set the extradata for txids produced by that extension.


The namespacing idea that Andres has would again work fine for various 
replication solutions as it would make it easier for them to coexist but 
it would still not work for your 'steve_commit_order' (which I also 
think should be done differently anyway).


In general I do see this patch as similar in purpose to what we did 
with replica triggers or logical decoding: those features also didn't 
really have an in-core use, were optional, enabled us to take a step 
forward with replication, and had some side uses besides replication, 
just like commit timestamps do.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-07 Thread Andres Freund
On 2014-11-07 17:54:32 +0100, Petr Jelinek wrote:
 On 06/11/14 08:50, Andres Freund wrote:
 On 2014-11-05 19:31:52 -0500, Steve Singer wrote:
 It also doesn't allow you to load two extensions at once on a system.
 You wouldn't be able to have both the 'steve_commit_order' extension
 and BDR installed at the same time.  I don't think this patch does a
 very good job at (B) but It wasn't intended to.
 
 Well, I don't agree that steve_commit_order makes much sense in this
 context. But there actually is a real problem here, namely that there's
 no namespacing in those bytes. I'd be ok with saying that we split the
 extradata in for bytes for the namespace and four for the actual
 data. That's roughly how it works for advisory locks already.
 
 
 I am not sure how would this solve problem of 2 extensions using the
 extradata given that there can be only 1 record per txid anyway.

It'd help you to detect problems.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-07 Thread Robert Haas
 On Nov 5, 2014, at 7:31 PM, Steve Singer st...@ssinger.info wrote:
 On 11/05/2014 05:43 PM, Andres Freund wrote:
 On 2014-11-05 17:17:05 -0500, Steve Singer wrote:
 Imo that's essentially a different feature. What you essentially would
 need here is a 'commit sequence number' - but no timestamps. And
 probably to be useful that number has to be 8 bytes in itself.
 
 I think this gets to the heart of some of the differing views people have 
 expressed on this patch
 
 Is this patch supposed to:
 
 A)  Add commit timestamp tracking but nothing more
 
 B) Add infrastructure to store commit timestamps and provide a facility for 
 storing additional bits of data extensions might want to be associated with 
 the commit
 
 C).  Add commit timestamps and node identifiers to commits

Well put.

I think the authors of this patch are suffering from a certain amount of 
myopia.  Commit timestamps are useful, but so are commit LSNs, and it makes 
little sense to me to suppose that we should have two different systems for 
those closely-related needs.

Like Andres, I think B is impractical, so let's just be honest and admit that C 
is what we're really doing. But let's add LSNs so the people who want that can 
be happy too.

...Robert



Re: [HACKERS] tracking commit timestamps

2014-11-07 Thread Petr Jelinek

On 08/11/14 00:35, Robert Haas wrote:

On Nov 5, 2014, at 7:31 PM, Steve Singer st...@ssinger.info wrote:

On 11/05/2014 05:43 PM, Andres Freund wrote:
On 2014-11-05 17:17:05 -0500, Steve Singer wrote:
Imo that's essentially a different feature. What you essentially would
need here is a 'commit sequence number' - but no timestamps. And
probably to be useful that number has to be 8 bytes in itself.


I think this gets to the heart of some of the differing views people have 
expressed on this patch

Is this patch supposed to:

A)  Add commit timestamp tracking but nothing more

B) Add infrastructure to store commit timestamps and provide a facility for 
storing additional bits of data extensions might want to be associated with the 
commit

C).  Add commit timestamps and node identifiers to commits


Well put.

I think the authors of this patch are suffering from a certain amount of 
myopia.  Commit timestamps are useful, but so are commit LSNs, and it makes 
little sense to me to suppose that we should have two different systems for 
those closely-related needs.

Like Andres, I think B is impractical, so let's just be honest and admit that C 
is what we're really doing. But let's add LSNs so the people who want that can 
be happy too.



The list of what is useful might be long, but we can't have everything 
there as there are space constraints, and LSN is another 8 bytes and I 
still want to have some bytes for storing the origin or whatever you 
want to call it there, as that's the one I personally have biggest 
use-case for.
So this would be ~24bytes per txid already, hmm I wonder if we can pull 
some tricks to lower that a bit.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-07 Thread Robert Haas
On Fri, Nov 7, 2014 at 7:07 PM, Petr Jelinek p...@2ndquadrant.com wrote:
 The list of what is useful might be long,

That's FUD.  It might also be short.

 but we can't have everything there
 as there are space constraints, and LSN is another 8 bytes and I still want
 to have some bytes for storing the origin or whatever you want to call it
 there, as that's the one I personally have biggest use-case for.
 So this would be ~24bytes per txid already, hmm I wonder if we can pull some
 tricks to lower that a bit.

It won't do to say let's do the things that I want, and foreclose
forever the things that other people want.  I find it quite hard to
believe that 16 bytes per transaction is a perfectly tolerable
overhead but 24 bytes per transaction will break the bank.  But if
that is really true then we ought to reject this patch altogether,
because it's unacceptable, in any arena, for a patch that only
benefits extensions to consume all of the available bit-space in,
leaving none for future core needs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] tracking commit timestamps

2014-11-06 Thread Craig Ringer
On 11/01/2014 09:00 PM, Michael Paquier wrote:
 1) It is untested and actually there is no direct use for it in core.
 2) Pushing code that we know as dead is no good, that's a feature more
 or less defined as maybe-useful-but-we-are-not-sure-yet-what-to-do-with-it.
 3) If you're going to re-use this API in BDR, which is a fork of
 Postgres.

I would like to emphasise that BDR is not really a fork at all, no more
than any big topic branch would be a fork.

BDR is two major parts:

- A collection of patches to core (commit timestamps/extradata, sequence
AM, replication identifiers, logical decoding, DDL deparse, event
triggers, etc). These are being progressively submitted to core,
maintained as multiple feature branches plus a merged version; and

- An extension that uses core features and, where necessary, the
additions to core to implement bi-directional logical replication.

Because of the time scales involved in getting things into core it's
been necessary to *temporarily* get the 9.4-based feature branch into
wider use so that it can be used to run the BDR extension, but if we can
get the required features into core that need will go away.

Event triggers and logical decoding were already merged in 9.4.

If we can get things like commit timestamps, commit extradata / logical
replication identifiers, the sequence access method, etc merged in 9.5
then it should be possible to do away with the need for the patches to
core entirely and run BDR on top of stock 9.5. I'd be delighted if that
were possible, as doing away with the patched 9.4 would get rid of a
great deal of work and frustration on my part.

Note that the BDR extension itself is PostgreSQL-licensed. Externally
maintained extensions have been brought in-core before. It's a lot of
code though, and I can't imagine that being a quick process.



 You'd better complete this API in BDR by yourself and not
 bother core with that.

This argument would've prevented the inclusion of logical decoding,
which is rapidly becoming the headline feature for 9.4, or at least
shortly behind jsonb. Slony is being adapted to use it, multiple people
are working on auditing systems based on it, and AFAIK EDB's xDB is
being adapted to take advantage of it too.

As development gets more complex and people attempt bigger features, the
One Big Patch that adds a feature and an in-core user of the feature is
not going to be a viable approach all the time. In my view it's already
well past that, and some recent patches (like RLS) really should've been
split up into patch series.

If we want to avoid unreviewable monster-patches it will, IMO, be
necessary to have progressive, iterative enhancement. That may sometimes
mean that there's code in core that's only used by future
yet-to-be-merged patches and/or by extensions.

Of course its desirable to have an in-tree user of the code wherever
possible/practical - but sometimes it may *not* be possible or
practical. It seems to me like the benefits of committing work in
smaller, more reviewable chunks outweigh the benefits of merging
multiple related but separate changes just so everything can have an
immediate in-tree user.

That's not to say that extradata must remain glued to commit timestamps.
It might make more sense as a separate patch with an API to allow
extensions to manipulate it directly, plus a dummy extension showing how
it works, like we do with various hooks and with APIs like FDWs.
However, just like the various hooks that we have, it *does* make sense
to have something in-core that has no real world in-core users.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-05 Thread Anssi Kääriäinen
On Tue, 2014-11-04 at 23:43 -0600, Jim Nasby wrote:

 I'm worried about 2 commits in the same microsecond on the same
 system, not on 2 different systems. Or, put another way, if we're
 going to expose this I think it should also provide a guaranteed
 unique commit ordering for a single cluster. Presumably, this
 shouldn't be that hard since we do know the exact order in which
 things committed.

Addition of the LSN when the timestamps for two transactions are exactly
the same isn't enough. There isn't any guarantee that a later commit gets a
later timestamp than an earlier commit.

In addition, I wonder if this feature would be misused. Record
transaction ids to a table to find out commit order (use case could be
storing historical row versions for example). Do a dump and restore on
another cluster, and all the transaction ids are completely meaningless
to the system.

Having the ability to record commit order into an audit table would be
extremely welcome, but as is, this feature doesn't provide it.

 - Anssi






Re: [HACKERS] tracking commit timestamps

2014-11-05 Thread Michael Paquier
On Wed, Nov 5, 2014 at 5:24 PM, Anssi Kääriäinen anssi.kaariai...@thl.fi
wrote:

 On Tue, 2014-11-04 at 23:43 -0600, Jim Nasby wrote:

  I'm worried about 2 commits in the same microsecond on the same
  system, not on 2 different systems. Or, put another way, if we're
  going to expose this I think it should also provide a guaranteed
  unique commit ordering for a single cluster. Presumably, this
  shouldn't be that hard since we do know the exact order in which
  things committed.

 Addition of LSN when the timestamps for two transactions are exactly
 same isn't enough. There isn't any guarantee that a later commit gets a
 later timestamp than an earlier commit.

True if WAL record ID is not globally consistent. Two-level commit ordering
can be done with (timestamp or LSN, nodeID). At equal timestamp, we could
say as well that the node with the lowest systemID wins for example. That's
not something


 In addition, I wonder if this feature would be misused. Record
 transaction ids to a table to find out commit order (use case could be
 storing historical row versions for example). Do a dump and restore on
 another cluster, and all the transaction ids are completely meaningless
 to the system.

I think you are forgetting that it is possible to take a consistent dump
using an exported snapshot. In this case the commit order may not be that
meaningless...
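
(A quick illustration of the exported-snapshot mechanism referred to 
here: one session exports its snapshot, a second session attaches to it, 
and both then see exactly the same set of committed transactions.)

  BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
  SELECT pg_export_snapshot();    -- note the identifier it returns
  -- in a second session, before starting the consistent read/dump:
  BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
  SET TRANSACTION SNAPSHOT '<identifier from the first session>';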


 Having the ability to record commit order into an audit table would be
 extremely welcome, but as is, this feature doesn't provide it.

That's something that can actually be achieved with this feature if the SQL
interface is able to query all the timestamps in an xid range, with for
example a background worker that tracks this data periodically. There is
also the question of how much timestamp history we want to keep. The patch
truncating SLRU files with frozenID may cover a sufficient range...
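
A very rough sketch of that kind of user-space capture (the audit table 
and the single hard-coded xid below are made up; a background worker 
would loop over a range of xids instead):

  CREATE TABLE commit_ts_audit (xid xid, committed timestamptz);
  INSERT INTO commit_ts_audit
      VALUES ('1000', pg_xact_commit_timestamp('1000'::xid));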
-- 
Michael


Re: [HACKERS] tracking commit timestamps

2014-11-05 Thread Jim Nasby



On 11/5/14, 6:10 AM, Michael Paquier wrote:

In addition, I wonder if this feature would be misused. Record
transaction ids to a table to find out commit order (use case could be
storing historical row versions for example). Do a dump and restore on
another cluster, and all the transaction ids are completely meaningless
to the system.

I think you are forgetting the fact to be able to take a consistent dump using 
an exported snapshot. In this case the commit order may not be that 
meaningless..


Anssi's point is that you can't use xmin because it can change, but I think 
anyone working with this feature would understand that.


Having the ability to record commit order into an audit table would be
extremely welcome, but as is, this feature doesn't provide it.

That's something that can actually be achieved with this feature if the SQL 
interface is able to query all the timestamps in a xid range with for example a 
background worker that tracks this data periodically. Now the thing is as well: 
how much timestamp history do we want to keep? The patch truncating SLRU files 
with frozenID may cover a sufficient range...


Except that commit time is not guaranteed unique *even on a single system*. 
That's my whole point. If we're going to bother with all the commit time 
machinery it seems really silly not to provide a way to uniquely order every 
commit.

Clearly trying to uniquely order commits across multiple systems is a far 
larger problem, and I'm not suggesting we attempt that. But for a single system 
AIUI all we need to do is expose the LSN of each commit record and that will 
give you the exact and unique order in which transactions committed.

This isn't a hypothetical feature either; if we had this, logical replication 
systems wouldn't have to try and fake this via batches. You could actually 
recreate exactly what data was visible at what time to all transactions, not 
just repeatable read ones (as long as you kept snapshot data as well, which 
isn't hard).

As for how much data to keep, if you have a process that's doing something to 
record this information permanently all it needs to do is keep an old enough 
snapshot around. That's not that hard to do, even from user space.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] tracking commit timestamps

2014-11-05 Thread Jim Nasby

On 11/5/14, 10:30 AM, Andres Freund wrote:

Except that commit time is not guaranteed unique *even on a single
system*. That's my whole point. If we're going to bother with all the
commit time machinery it seems really silly to provide a way to
uniquely order every commit.

Well. I think that's the misunderstanding here. That's absolutely not
what committs is supposed to be used for. For the replication stream
you'd hopefully use logical decoding. That gives you the transaction
data exactly in commit order.


So presumably you'd want to use logical decoding to insert into a table with a 
sequence on it, or similar?

I agree, that sounds like a better way to handle this. I think it's worth 
mentioning in the docs for commit_ts, because people WILL mistakenly try and 
use it to determine commit ordering.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] tracking commit timestamps

2014-11-05 Thread Andres Freund
On 2014-11-05 10:23:15 -0600, Jim Nasby wrote:
 
 
 On 11/5/14, 6:10 AM, Michael Paquier wrote:
 In addition, I wonder if this feature would be misused. Record
 transaction ids to a table to find out commit order (use case could be
 storing historical row versions for example). Do a dump and restore on
 another cluster, and all the transaction ids are completely meaningless
 to the system.
 
 I think you are forgetting the fact to be able to take a consistent dump 
 using an exported snapshot. In this case the commit order may not be that 
 meaningless..
 
 Anssi's point is that you can't use xmin because it can change, but I think 
 anyone working with this feature would understand that.

 Having the ability to record commit order into an audit table would be
 extremely welcome, but as is, this feature doesn't provide it.
 
 That's something that can actually be achieved with this feature if
 the SQL interface is able to query all the timestamps in a xid range
 with for example a background worker that tracks this data
 periodically. Now the thing is as well: how much timestamp history do
 we want to keep? The patch truncating SLRU files with frozenID may
 cover a sufficient range...


 Except that commit time is not guaranteed unique *even on a single
 system*. That's my whole point. If we're going to bother with all the
 commit time machinery it seems really silly to provide a way to
 uniquely order every commit.

Well. I think that's the misunderstanding here. That's absolutely not
what committs is supposed to be used for. For the replication stream
you'd hopefully use logical decoding. That gives you the transaction
data exactly in commit order.
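
For concreteness, that commit-ordered stream can be seen with the 
test_decoding example module shipped with 9.4 (this assumes wal_level = 
logical and a free replication slot; the slot name is arbitrary):

  SELECT pg_create_logical_replication_slot('commit_order_demo', 'test_decoding');
  -- ... run some transactions ...
  SELECT * FROM pg_logical_slot_get_changes('commit_order_demo', NULL, NULL);
  SELECT pg_drop_replication_slot('commit_order_demo');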

 Clearly trying to uniquely order commits across multiple systems is a
 far larger problem, and I'm not suggesting we attempt that. But for a
 single system AIUI all we need to do is expose the LSN of each commit
 record and that will give you the exact and unique order in which
 transactions committed.

I don't think that's something you should attempt. That's what logical
decoding is for. Hence I see little point in exposing the LSN that way.

Where I think committs is useful is a method for analyzing and resolving
conflicts between multiple systems. In that case you likely can't use
the LSN for anything as it won't be very meaningful. If you get
conflicts below the accuracy of the timestamps you better use another
deterministic method of resolving them - BDR e.g. compares the system
identifier, timeline id, database oid, and a user defined name. While
enforcing that those aren't the same between systems.
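
A rough sketch of the kind of check this enables on the receiving side 
(the table, the id column and the incoming remote_commit_ts are made up 
for illustration):

  SELECT pg_xact_commit_timestamp(xmin) AS local_commit_ts
    FROM my_table WHERE id = 42;
  -- apply the incoming change only if remote_commit_ts is newer than
  -- local_commit_ts; on an exact tie, fall back to a deterministic
  -- comparison such as the identifiers listed above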

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-05 Thread Andres Freund
On 2014-11-05 10:34:40 -0600, Jim Nasby wrote:
 On 11/5/14, 10:30 AM, Andres Freund wrote:
 Except that commit time is not guaranteed unique *even on a single
 system*. That's my whole point. If we're going to bother with all the
 commit time machinery it seems really silly to provide a way to
 uniquely order every commit.

 Well. I think that's the misunderstanding here. That's absolutely not
 what committs is supposed to be used for. For the replication stream
 you'd hopefully use logical decoding. That gives you the transaction
 data exactly in commit order.
 
 So presumably you'd want to use logical decoding to insert into a
 table with a sequence on it, or similar?

I'm not following. I'd use logical decoding to replicate the data to
another system, thereby guaranteeing it's done in commit order. Then,
when applying the data on the other side, I can detect/resolve some
forms of conflicts by looking at the timestamps of rows via committs.

 I agree, that sounds like a better way to handle this. I think it's
 worth mentioning in the docs for commit_ts, because people WILL
 mistakenly try and use it to determine commit ordering.

Ok, sounds sensible.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] tracking commit timestamps

2014-11-05 Thread Kevin Grittner
Jim Nasby jim.na...@bluetreble.com wrote:

 for a single system AIUI all we need to do is expose the LSN of
 each commit record and that will give you the exact and unique
 order in which transactions committed.

 This isn't a hypothetical feature either; if we had this,
 logical replication systems wouldn't have to try and fake this
 via batches. You could actually recreate exactly what data was
 visible at what time to all transactions, not just repeatable
 read ones (as long as you kept snapshot data as well, which isn't
 hard).

Well, that's not entirely true for serializable transactions; there
are points in time where reading the committed state could cause a
transaction to roll back[1] -- either a writing transaction which
would make that visible state inconsistent with the later committed
state or the reading transaction if it views something which is not
(yet) consistent.

That's not to say that this feature is a bad idea; part of the
serializable implementation itself depends on being able to
accurately determine commit order, and this feature could allow
that to work more efficiently.  I'm saying that, like hot standby,
a replicated database could not provide truly serializable
transactions (even read only ones) without something else in
addition to this.  We've discussed various ways of doing that.
Perhaps the most promising is to include in the stream some
indication of which points in the transaction stream are safe for a
serializable transaction to read.  If there's a way to implement
commit order recording such that a two-state flag could be
associated with each commit, I think it could be made to work for
serializable transactions.
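
For the single-node case, the existing way to wait until a snapshot is 
known to be safe for such a read-only transaction is the DEFERRABLE mode 
(some_table is just a placeholder):

  BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
  SELECT count(*) FROM some_table;
  COMMIT;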

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] https://wiki.postgresql.org/wiki/SSI#Read_Only_Transactions




Re: [HACKERS] tracking commit timestamps

2014-11-05 Thread Steve Singer

On 11/05/2014 11:23 AM, Jim Nasby wrote:



Except that commit time is not guaranteed unique *even on a single 
system*. That's my whole point. If we're going to bother with all the 
commit time machinery it seems really silly to provide a way to 
uniquely order every commit.


Clearly trying to uniquely order commits across multiple systems is a 
far larger problem, and I'm not suggesting we attempt that. But for a 
single system AIUI all we need to do is expose the LSN of each commit 
record and that will give you the exact and unique order in which 
transactions committed.


This isn't a hypothetical feature either; if we had this, logical 
replication systems wouldn't have to try and fake this via batches. 
You could actually recreate exactly what data was visible at what time 
to all transactions, not just repeatable read ones (as long as you 
kept snapshot data as well, which isn't hard).


As for how much data to keep, if you have a process that's doing 
something to record this information permanently all it needs to do is 
keep an old enough snapshot around. That's not that hard to do, even 
from user space.


+1 for this.

It isn't just 'replication' systems that have a need for getting the 
commit order of transactions on a single system.  I have an application 
(not slony) where we want to query a table but order the output based on 
the transaction commit order of when the insert into the table was done 
(think of a queue). I'm not replicating the output but passing the data 
to other applications for further processing.  If I just had the commit 
timestamp I would need to put in some other condition to break ties in a 
consistent way.  I think being able to get an ordering by commit LSN is 
what I really want in this case not the timestamp.


Logical decoding is one solution to this (that I was considering) but 
being able to do something like

select * FROM event_log order by commit_id would be a lot simpler.






--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-05 Thread Andres Freund
On 2014-11-05 17:17:05 -0500, Steve Singer wrote:
 It isn't just 'replication' systems that have a need for getting the commit
 order of transactions on a single system.  I have a application (not slony)
 where we want to query a table but order the output based on the transaction
 commit order of when the insert into the table was done (think of a queue).
 I'm not replicating the output but passing the data to other applications
 for further processing.  If I just had the commit timestamp I would need to
 put in some other condition to break ties in a consistent way.  I think
 being able to get an ordering by commit LSN is what I really want in this
 case not the timestamp.
 
 Logical decoding is one solution to this (that I was considering) but being
 able to do something like
 select * FROM event_log order by commit_id would be a lot simpler.

Imo that's essentially a different feature. What you would need here is
a 'commit sequence number' - but no timestamps. And to be useful that
number probably has to be 8 bytes in its own right.
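
For illustration only (nothing here is from a patch, and the names are
made up), such a number could be as simple as:

#include "postgres.h"

/*
 * Sketch: an 8-byte commit sequence number handed out in commit order.
 * Unlike a timestamp it is unique by construction, so ordering by it
 * never needs a tie-break, and 8 bytes means it will not wrap in
 * practice.
 */
typedef uint64 CommitSeqNo;

static CommitSeqNo next_csn = 1;

static CommitSeqNo
assign_commit_seqno(void)
{
    /* a real implementation would do this under a lock, or atomically */
    return next_csn++;
}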

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-05 Thread Steve Singer

On 11/05/2014 05:43 PM, Andres Freund wrote:

On 2014-11-05 17:17:05 -0500, Steve Singer wrote:
Imo that's essentially a different feature. What you essentially would
need here is a 'commit sequence number' - but no timestamps. And
probably to be useful that number has to be 8 bytes in itself.


I think this gets to the heart of some of the differing views people 
have expressed on this patch


Is this patch supposed to:

A)  Add commit timestamp tracking but nothing more

B) Add infrastructure to store commit timestamps and provide a facility 
for storing additional bits of data extensions might want to be 
associated with the commit


C).  Add commit timestamps and node identifiers to commits

If the answer is (A) then I can see why some people are objecting to
adding extradata.  If the answer is (B) then it's fair to ask how well
this patch handles the various types of things people might want to
attach to the commit record (such as the LSN).  I think the problem is
that once you start providing a facility extensions can use to store
data along with the commit record, being restricted to 4 or 8 bytes
is very limiting.  It also doesn't allow you to load two extensions at
once on a system.  You wouldn't be able to have both the
'steve_commit_order' extension and BDR installed at the same time.  I
don't think this patch does a very good job at (B), but it wasn't
intended to.


If what we are really doing is (C), and we are just calling the space
'extradata' until the logical identifier stuff goes in, after which we
rename extradata to nodeid, then we should say so.  If we are really
concerned about the pg_upgrade impact of expanding the record later,
then maybe we should add 4 bytes of padding to the CommitTimeStampEntry
now but leave manipulating the node id until later.
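
Roughly something like this -- a sketch only, with the field names made
up here rather than taken from the patch, assuming the usual PostgreSQL
headers:

#include "postgres.h"
#include "datatype/timestamp.h"

/*
 * Sketch: reserve the space now, define its meaning later.
 */
typedef struct CommitTimeStampEntry
{
    TimestampTz time;        /* commit timestamp */
    uint32      extradata;   /* today's opaque extra data */
    uint32      reserved;    /* padding now, node id (or part of it) later */
} CommitTimeStampEntry;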


Steve





Greetings,

Andres Freund





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-05 Thread Andres Freund
On 2014-11-05 19:31:52 -0500, Steve Singer wrote:
 On 11/05/2014 05:43 PM, Andres Freund wrote:
 On 2014-11-05 17:17:05 -0500, Steve Singer wrote:
 Imo that's essentially a different feature. What you essentially would
 need here is a 'commit sequence number' - but no timestamps. And
 probably to be useful that number has to be 8 bytes in itself.
 
 I think this gets to the heart of some of the differing views people have
 expressed on this patch

I think it's actually besides the heart...

 Is this patch supposed to:
 
 A)  Add commit timestamp tracking but nothing more
 
 B) Add infrastructure to store commit timestamps and provide a facility for
 storing additional bits of data extensions might want to be associated with
 the commit
 
 C).  Add commit timestamps and node identifiers to commits

 If the answer is (A) then I can see why some people are objecting to adding
 extradata.If the answer is (B) then it's fair to ask how well does this
 patch handle various types of things people might want to attach to the
 commit record (such as the LSN).

I think there's a mistake exactly here. The LSN of the commit isn't just
some extra information about the commit. You can't say 'here, also
attach this piece of information'. Instead you need special-case code in
xact.c to add it, which prohibits that space from being used for
something else.

 I think the problem is that once you
 start providing a facility extensions can use to store data along with the
 commit record then being restricted to 4 or 8 bytes is very limiting.

Well, you can easily use those 4/8 bytes to start adding more data to
the transaction, by referring to some table with transaction metadata,
for example.

 It also doesn't allow you to load two extensions at once on a system.
 You wouldn't be able to have both the 'steve_commit_order' extension
 and BDR installed at the same time.  I don't think this patch does a
 very good job at (B) but It wasn't intended to.

Well, I don't agree that steve_commit_order makes much sense in this
context. But there actually is a real problem here, namely that there's
no namespacing in those bytes. I'd be ok with saying that we split the
extradata into four bytes for the namespace and four for the actual
data. That's roughly how it works for advisory locks already.
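
A sketch of what I mean, with made-up helper names and assuming the
extradata is 8 bytes wide:

#include "postgres.h"

/*
 * Illustration only: the high four bytes name the extension (the
 * "namespace"), the low four bytes carry that extension's payload --
 * similar in spirit to how two-key advisory locks carve up their key
 * space.
 */
static inline uint64
extradata_pack(uint32 namespace_id, uint32 payload)
{
    return ((uint64) namespace_id << 32) | payload;
}

static inline uint32
extradata_namespace(uint64 extradata)
{
    return (uint32) (extradata >> 32);
}

static inline uint32
extradata_payload(uint64 extradata)
{
    return (uint32) (extradata & 0xFFFFFFFF);
}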

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] tracking commit timestamps

2014-11-04 Thread Andres Freund
On 2014-11-02 19:27:25 +0100, Petr Jelinek wrote:
 Well, Michael has a point that the extradata is pretty much useless currently;
 perhaps it would help to add an interface to set extradata?

Only accessible via C and useless aren't the same thing. But sure, add
it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

