Re: [HACKERS] Parallel safety of CURRENT_* family

2016-12-13 Thread Robert Haas
On Thu, Dec 1, 2016 at 3:46 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Dec 1, 2016 at 2:32 PM, Tom Lane  wrote:
>>> ... well, they would be if we passed down xactStartTimestamp to parallel
>>> workers, but I can't find any code that does that.  In view of the fact that
>>> transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6.
>
>> Yeah.  Do you think we should arrange to pass that down, or change the 
>> marking?
>
> We can't fix the marking in existing 9.6 installations, so I think we
> have to pass it down.  (Which would be a better response anyway.)

I happened across this thread today and took a look at what it would
take to fix this.  I quickly ran up against the fact that
SerializeTransactionState() and RestoreTransactionState() are not
exactly brilliantly designed, relying on the notion that each
individual value that we want to serialize will be no wider than a
TransactionId, which won't be true for timestamps.  Even apart from
that, the whole design of those functions is pretty lame, and I'm
pretty sure I wrote all of that code myself, so I have nobody to blame
but me.  Anyway, here's a proposed patch to refactor that code into
something a little more reasonable.  It doesn't fix the actual problem
here, but I think it's a good first step.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index d643216..9c13717 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -65,6 +65,23 @@
 #include "utils/timestamp.h"
 #include "pg_trace.h"
 
+/*
+ * Serialization format for transaction state information.
+ */
+typedef struct SerializedTransactionState
+{
+	int			xactIsoLevel;
+	bool		xactDeferrable;
+	TransactionId xactTopTransactionId;
+	TransactionId currentTransactionId;
+	CommandId	currentCommandId;
+	int			nCurrentXids;
+	TransactionId currentXids[FLEXIBLE_ARRAY_MEMBER];
+}	SerializedTransactionState;
+
+#define SIZE_OF_SERIALIZED_TRANSACTION_STATE(n_current_xids) \
+	offsetof(SerializedTransactionState, currentXids) + \
+	(n_current_xids * sizeof(TransactionId))
 
 /*
  *	User-tweakable parameters
@@ -4812,8 +4829,7 @@ Size
 EstimateTransactionStateSpace(void)
 {
 	TransactionState s;
-	Size		nxids = 6;		/* iso level, deferrable, top & current XID,
- * command counter, XID count */
+	Size		nxids = 0;
 
 	for (s = CurrentTransactionState; s != NULL; s = s->parent)
 	{
@@ -4823,7 +4839,7 @@ EstimateTransactionStateSpace(void)
 	}
 
 	nxids = add_size(nxids, nParallelCurrentXids);
-	return mul_size(nxids, sizeof(TransactionId));
+	return SIZE_OF_SERIALIZED_TRANSACTION_STATE(nxids);
 }
 
 /*
@@ -4847,16 +4863,17 @@ SerializeTransactionState(Size maxsize, char *start_address)
 	TransactionState s;
 	Size		nxids = 0;
 	Size		i = 0;
-	Size		c = 0;
 	TransactionId *workspace;
-	TransactionId *result = (TransactionId *) start_address;
+	SerializedTransactionState *result;
+
+	result = (SerializedTransactionState *) start_address;
 
-	result[c++] = (TransactionId) XactIsoLevel;
-	result[c++] = (TransactionId) XactDeferrable;
-	result[c++] = XactTopTransactionId;
-	result[c++] = CurrentTransactionState->transactionId;
-	result[c++] = (TransactionId) currentCommandId;
-	Assert(maxsize >= c * sizeof(TransactionId));
+	Assert(maxsize >= SIZE_OF_SERIALIZED_TRANSACTION_STATE(0));
+	result->xactIsoLevel = XactIsoLevel;
+	result->xactDeferrable = XactDeferrable;
+	result->xactTopTransactionId = XactTopTransactionId;
+	result->currentTransactionId = CurrentTransactionState->transactionId;
+	result->currentCommandId = currentCommandId;
 
 	/*
 	 * If we're running in a parallel worker and launching a parallel worker
@@ -4865,9 +4882,10 @@ SerializeTransactionState(Size maxsize, char *start_address)
 	 */
 	if (nParallelCurrentXids > 0)
 	{
-		result[c++] = nParallelCurrentXids;
-		Assert(maxsize >= (nParallelCurrentXids + c) * sizeof(TransactionId));
-		memcpy(&result[c], ParallelCurrentXids,
+		result->nCurrentXids = nParallelCurrentXids;
+		Assert(maxsize >=
+			   SIZE_OF_SERIALIZED_TRANSACTION_STATE(nParallelCurrentXids));
+		memcpy(&result->currentXids, ParallelCurrentXids,
 			   nParallelCurrentXids * sizeof(TransactionId));
 		return;
 	}
@@ -4882,7 +4900,7 @@ SerializeTransactionState(Size maxsize, char *start_address)
 			nxids = add_size(nxids, 1);
 		nxids = add_size(nxids, s->nChildXids);
 	}
-	Assert((c + 1 + nxids) * sizeof(TransactionId) <= maxsize);
+	Assert(SIZE_OF_SERIALIZED_TRANSACTION_STATE(nxids) <= maxsize);
 
 	/* Copy them to our scratch space. */
 	workspace = palloc(nxids * sizeof(TransactionId));
@@ -4900,8 +4918,8 @@ SerializeTransactionState(Size maxsize, char *start_address)
 	qsort(workspace, nxids, sizeof(TransactionId), xidComparator);
 
 	/* Copy data into output area. */
-	result[c++] = (TransactionId) nxids;
-	memcpy(&result[c], workspace, nxids * sizeof(TransactionId));
+	result->nCur

Re: [HACKERS] Parallel safety of CURRENT_* family

2016-12-01 Thread Tom Lane
I wrote:
> Yeah, I didn't have any doubt that it was real.  Still don't know
> why my test case isn't doing what I expected, though.

Doh: the planner knows that transaction_timestamp() is stable, so
it concludes that the DISTINCT condition is vacuous.  There is a
"Unique" node in the plan, but it has zero columns to compare, so
it thinks the tuple are all equivalent and emits only the first.

I had noticed that there was no "Sort" node, but failed to realize
that that implied the "Unique" node was degenerate.

Maybe this is over-optimization, but I think we'd be very sad if
the planner didn't do it; getting rid of useless sort columns is
critical in a lot of situations.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel safety of CURRENT_* family

2016-12-01 Thread Tom Lane
Robert Haas  writes:
> On Thu, Dec 1, 2016 at 3:46 PM, Tom Lane  wrote:
>> but it doesn't:
>> 
>> regression=# select distinct transaction_timestamp() from tenk1;
>> transaction_timestamp
>> ---
>> 2016-12-01 15:44:12.839417-05
>> (1 row)
>> 
>> How is that happening?

> Because the table is so small, the leader probably finishes running
> the whole plan before the workers finish starting up.

Good try, but EXPLAIN ANALYZE says that the workers are processing
some of the rows.  Also, I see the same behavior with a much larger
test table.

> You can see the problem like this, though:

Yeah, I didn't have any doubt that it was real.  Still don't know
why my test case isn't doing what I expected, though.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel safety of CURRENT_* family

2016-12-01 Thread Robert Haas
On Thu, Dec 1, 2016 at 3:46 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Dec 1, 2016 at 2:32 PM, Tom Lane  wrote:
>>> ... well, they would be if we passed down xactStartTimestamp to parallel
>>> workers, but I can't find any code that does that.  In view of the fact that
>>> transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6.
>
>> Yeah.  Do you think we should arrange to pass that down, or change the 
>> marking?
>
> We can't fix the marking in existing 9.6 installations, so I think we
> have to pass it down.  (Which would be a better response anyway.)
>
> Having said that, I find myself unable to reproduce a problem.
> This should fail:
>
> regression=# set parallel_setup_cost TO 0;
> SET
> regression=# set parallel_tuple_cost TO 0;
> SET
> regression=# set min_parallel_relation_size TO 0;
> SET
> regression=# set enable_indexscan TO 0;
> SET
> regression=# explain verbose select distinct transaction_timestamp() from 
> tenk1;
>   QUERY PLAN
> --
>  Unique  (cost=0.00..424.67 rows=1 width=8)
>Output: (transaction_timestamp())
>->  Gather  (cost=0.00..424.67 rows=1 width=8)
>  Output: (transaction_timestamp())
>  Workers Planned: 2
>  ->  Parallel Seq Scan on public.tenk1  (cost=0.00..410.08 rows=4167 
> width=8)
>Output: transaction_timestamp()
> (7 rows)
>
> but it doesn't:
>
> regression=# select distinct transaction_timestamp() from tenk1;
>  transaction_timestamp
> ---
>  2016-12-01 15:44:12.839417-05
> (1 row)
>
> How is that happening?

Because the table is so small, the leader probably finishes running
the whole plan before the workers finish starting up.

You can see the problem like this, though:

rhaas=# begin;
BEGIN
rhaas=# select transaction_timestamp();
 transaction_timestamp
---
 2016-12-01 15:51:14.443116-05
(1 row)

rhaas=# select transaction_timestamp();
 transaction_timestamp
---
 2016-12-01 15:51:14.443116-05
(1 row)

rhaas=# select transaction_timestamp();
 transaction_timestamp
---
 2016-12-01 15:51:14.443116-05
(1 row)

rhaas=# set force_parallel_mode = true;
SET
rhaas=# select transaction_timestamp();
 transaction_timestamp
---
 2016-12-01 15:51:26.603302-05
(1 row)

rhaas=# select transaction_timestamp();
 transaction_timestamp
---
 2016-12-01 15:51:27.316032-05
(1 row)

force_parallel_mode causes the whole plan to be run by the worker,
without any participation by the leader.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel safety of CURRENT_* family

2016-12-01 Thread Tom Lane
Robert Haas  writes:
> On Thu, Dec 1, 2016 at 2:32 PM, Tom Lane  wrote:
>> ... well, they would be if we passed down xactStartTimestamp to parallel
>> workers, but I can't find any code that does that.  In view of the fact that
>> transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6.

> Yeah.  Do you think we should arrange to pass that down, or change the 
> marking?

We can't fix the marking in existing 9.6 installations, so I think we
have to pass it down.  (Which would be a better response anyway.)

Having said that, I find myself unable to reproduce a problem.
This should fail:

regression=# set parallel_setup_cost TO 0;
SET
regression=# set parallel_tuple_cost TO 0;
SET
regression=# set min_parallel_relation_size TO 0;
SET
regression=# set enable_indexscan TO 0;
SET
regression=# explain verbose select distinct transaction_timestamp() from tenk1;
  QUERY PLAN
  
--
 Unique  (cost=0.00..424.67 rows=1 width=8)
   Output: (transaction_timestamp())
   ->  Gather  (cost=0.00..424.67 rows=1 width=8)
 Output: (transaction_timestamp())
 Workers Planned: 2
 ->  Parallel Seq Scan on public.tenk1  (cost=0.00..410.08 rows=4167 
width=8)
   Output: transaction_timestamp()
(7 rows)

but it doesn't:

regression=# select distinct transaction_timestamp() from tenk1;
 transaction_timestamp 
---
 2016-12-01 15:44:12.839417-05
(1 row)

How is that happening?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel safety of CURRENT_* family

2016-12-01 Thread Robert Haas
On Thu, Dec 1, 2016 at 2:32 PM, Tom Lane  wrote:
> I wrote:
>> <5bih4k+4jfl6m39j...@guerrillamail.com> writes:
>>> pg_proc shows that now() is marked as restricted, but 
>>> transaction_timestamp() is marked as safe.
>
>> That's certainly silly, because they're equivalent.  I should think
>> they're both safe.  Robert?
>
> ... well, they would be if we passed down xactStartTimestamp to parallel
> workers, but I can't find any code that does that.  In view of the fact that
> transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6.

Yeah.  Do you think we should arrange to pass that down, or change the marking?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel safety of CURRENT_* family

2016-12-01 Thread Tom Lane
I wrote:
> <5bih4k+4jfl6m39j...@guerrillamail.com> writes:
>> pg_proc shows that now() is marked as restricted, but 
>> transaction_timestamp() is marked as safe.

> That's certainly silly, because they're equivalent.  I should think
> they're both safe.  Robert?

... well, they would be if we passed down xactStartTimestamp to parallel
workers, but I can't find any code that does that.  In view of the fact that
transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel safety of CURRENT_* family

2016-12-01 Thread Tom Lane
<5bih4k+4jfl6m39j...@guerrillamail.com> writes:
> How should I mark a function which calls CURRENT_DATE? Parallel safe or 
> parallel restricted?

> pg_proc shows that now() is marked as restricted, but transaction_timestamp() 
> is marked as safe.

That's certainly silly, because they're equivalent.  I should think
they're both safe.  Robert?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers