Re: [HACKERS] Parallel safety of CURRENT_* family
On Thu, Dec 1, 2016 at 3:46 PM, Tom Lane wrote: > Robert Haas writes: >> On Thu, Dec 1, 2016 at 2:32 PM, Tom Lane wrote: >>> ... well, they would be if we passed down xactStartTimestamp to parallel >>> workers, but I can't find any code that does that. In view of the fact that >>> transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6. > >> Yeah. Do you think we should arrange to pass that down, or change the >> marking? > > We can't fix the marking in existing 9.6 installations, so I think we > have to pass it down. (Which would be a better response anyway.) I happened across this thread today and took a look at what it would take to fix this. I quickly ran up against the fact that SerializeTransactionState() and RestoreTransactionState() are not exactly brilliantly designed, relying on the notion that each individual value that we want to serialize will be no wider than a TransactionId, which won't be true for timestamps. Even apart from that, the whole design of those functions is pretty lame, and I'm pretty sure I wrote all of that code myself, so I have nobody to blame but me. Anyway, here's a proposed patch to refactor that code into something a little more reasonable. It doesn't fix the actual problem here, but I think it's a good first step. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c index d643216..9c13717 100644 --- a/src/backend/access/transam/xact.c +++ b/src/backend/access/transam/xact.c @@ -65,6 +65,23 @@ #include "utils/timestamp.h" #include "pg_trace.h" +/* + * Serialization format for transaction state information. + */ +typedef struct SerializedTransactionState +{ + int xactIsoLevel; + bool xactDeferrable; + TransactionId xactTopTransactionId; + TransactionId currentTransactionId; + CommandId currentCommandId; + int nCurrentXids; + TransactionId currentXids[FLEXIBLE_ARRAY_MEMBER]; +} SerializedTransactionState; + +#define SIZE_OF_SERIALIZED_TRANSACTION_STATE(n_current_xids) \ + offsetof(SerializedTransactionState, currentXids) + \ + (n_current_xids * sizeof(TransactionId)) /* * User-tweakable parameters @@ -4812,8 +4829,7 @@ Size EstimateTransactionStateSpace(void) { TransactionState s; - Size nxids = 6; /* iso level, deferrable, top & current XID, - * command counter, XID count */ + Size nxids = 0; for (s = CurrentTransactionState; s != NULL; s = s->parent) { @@ -4823,7 +4839,7 @@ EstimateTransactionStateSpace(void) } nxids = add_size(nxids, nParallelCurrentXids); - return mul_size(nxids, sizeof(TransactionId)); + return SIZE_OF_SERIALIZED_TRANSACTION_STATE(nxids); } /* @@ -4847,16 +4863,17 @@ SerializeTransactionState(Size maxsize, char *start_address) TransactionState s; Size nxids = 0; Size i = 0; - Size c = 0; TransactionId *workspace; - TransactionId *result = (TransactionId *) start_address; + SerializedTransactionState *result; + + result = (SerializedTransactionState *) start_address; - result[c++] = (TransactionId) XactIsoLevel; - result[c++] = (TransactionId) XactDeferrable; - result[c++] = XactTopTransactionId; - result[c++] = CurrentTransactionState->transactionId; - result[c++] = (TransactionId) currentCommandId; - Assert(maxsize >= c * sizeof(TransactionId)); + Assert(maxsize >= SIZE_OF_SERIALIZED_TRANSACTION_STATE(0)); + result->xactIsoLevel = XactIsoLevel; + result->xactDeferrable = XactDeferrable; + result->xactTopTransactionId = XactTopTransactionId; + result->currentTransactionId = CurrentTransactionState->transactionId; + result->currentCommandId = currentCommandId; /* * If we're running in a parallel worker and launching a parallel worker @@ -4865,9 +4882,10 @@ SerializeTransactionState(Size maxsize, char *start_address) */ if (nParallelCurrentXids > 0) { - result[c++] = nParallelCurrentXids; - Assert(maxsize >= (nParallelCurrentXids + c) * sizeof(TransactionId)); - memcpy(&result[c], ParallelCurrentXids, + result->nCurrentXids = nParallelCurrentXids; + Assert(maxsize >= + SIZE_OF_SERIALIZED_TRANSACTION_STATE(nParallelCurrentXids)); + memcpy(&result->currentXids, ParallelCurrentXids, nParallelCurrentXids * sizeof(TransactionId)); return; } @@ -4882,7 +4900,7 @@ SerializeTransactionState(Size maxsize, char *start_address) nxids = add_size(nxids, 1); nxids = add_size(nxids, s->nChildXids); } - Assert((c + 1 + nxids) * sizeof(TransactionId) <= maxsize); + Assert(SIZE_OF_SERIALIZED_TRANSACTION_STATE(nxids) <= maxsize); /* Copy them to our scratch space. */ workspace = palloc(nxids * sizeof(TransactionId)); @@ -4900,8 +4918,8 @@ SerializeTransactionState(Size maxsize, char *start_address) qsort(workspace, nxids, sizeof(TransactionId), xidComparator); /* Copy data into output area. */ - result[c++] = (TransactionId) nxids; - memcpy(&result[c], workspace, nxids * sizeof(TransactionId)); + result->nCur
Re: [HACKERS] Parallel safety of CURRENT_* family
I wrote: > Yeah, I didn't have any doubt that it was real. Still don't know > why my test case isn't doing what I expected, though. Doh: the planner knows that transaction_timestamp() is stable, so it concludes that the DISTINCT condition is vacuous. There is a "Unique" node in the plan, but it has zero columns to compare, so it thinks the tuple are all equivalent and emits only the first. I had noticed that there was no "Sort" node, but failed to realize that that implied the "Unique" node was degenerate. Maybe this is over-optimization, but I think we'd be very sad if the planner didn't do it; getting rid of useless sort columns is critical in a lot of situations. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel safety of CURRENT_* family
Robert Haas writes: > On Thu, Dec 1, 2016 at 3:46 PM, Tom Lane wrote: >> but it doesn't: >> >> regression=# select distinct transaction_timestamp() from tenk1; >> transaction_timestamp >> --- >> 2016-12-01 15:44:12.839417-05 >> (1 row) >> >> How is that happening? > Because the table is so small, the leader probably finishes running > the whole plan before the workers finish starting up. Good try, but EXPLAIN ANALYZE says that the workers are processing some of the rows. Also, I see the same behavior with a much larger test table. > You can see the problem like this, though: Yeah, I didn't have any doubt that it was real. Still don't know why my test case isn't doing what I expected, though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel safety of CURRENT_* family
On Thu, Dec 1, 2016 at 3:46 PM, Tom Lane wrote: > Robert Haas writes: >> On Thu, Dec 1, 2016 at 2:32 PM, Tom Lane wrote: >>> ... well, they would be if we passed down xactStartTimestamp to parallel >>> workers, but I can't find any code that does that. In view of the fact that >>> transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6. > >> Yeah. Do you think we should arrange to pass that down, or change the >> marking? > > We can't fix the marking in existing 9.6 installations, so I think we > have to pass it down. (Which would be a better response anyway.) > > Having said that, I find myself unable to reproduce a problem. > This should fail: > > regression=# set parallel_setup_cost TO 0; > SET > regression=# set parallel_tuple_cost TO 0; > SET > regression=# set min_parallel_relation_size TO 0; > SET > regression=# set enable_indexscan TO 0; > SET > regression=# explain verbose select distinct transaction_timestamp() from > tenk1; > QUERY PLAN > -- > Unique (cost=0.00..424.67 rows=1 width=8) >Output: (transaction_timestamp()) >-> Gather (cost=0.00..424.67 rows=1 width=8) > Output: (transaction_timestamp()) > Workers Planned: 2 > -> Parallel Seq Scan on public.tenk1 (cost=0.00..410.08 rows=4167 > width=8) >Output: transaction_timestamp() > (7 rows) > > but it doesn't: > > regression=# select distinct transaction_timestamp() from tenk1; > transaction_timestamp > --- > 2016-12-01 15:44:12.839417-05 > (1 row) > > How is that happening? Because the table is so small, the leader probably finishes running the whole plan before the workers finish starting up. You can see the problem like this, though: rhaas=# begin; BEGIN rhaas=# select transaction_timestamp(); transaction_timestamp --- 2016-12-01 15:51:14.443116-05 (1 row) rhaas=# select transaction_timestamp(); transaction_timestamp --- 2016-12-01 15:51:14.443116-05 (1 row) rhaas=# select transaction_timestamp(); transaction_timestamp --- 2016-12-01 15:51:14.443116-05 (1 row) rhaas=# set force_parallel_mode = true; SET rhaas=# select transaction_timestamp(); transaction_timestamp --- 2016-12-01 15:51:26.603302-05 (1 row) rhaas=# select transaction_timestamp(); transaction_timestamp --- 2016-12-01 15:51:27.316032-05 (1 row) force_parallel_mode causes the whole plan to be run by the worker, without any participation by the leader. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel safety of CURRENT_* family
Robert Haas writes: > On Thu, Dec 1, 2016 at 2:32 PM, Tom Lane wrote: >> ... well, they would be if we passed down xactStartTimestamp to parallel >> workers, but I can't find any code that does that. In view of the fact that >> transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6. > Yeah. Do you think we should arrange to pass that down, or change the > marking? We can't fix the marking in existing 9.6 installations, so I think we have to pass it down. (Which would be a better response anyway.) Having said that, I find myself unable to reproduce a problem. This should fail: regression=# set parallel_setup_cost TO 0; SET regression=# set parallel_tuple_cost TO 0; SET regression=# set min_parallel_relation_size TO 0; SET regression=# set enable_indexscan TO 0; SET regression=# explain verbose select distinct transaction_timestamp() from tenk1; QUERY PLAN -- Unique (cost=0.00..424.67 rows=1 width=8) Output: (transaction_timestamp()) -> Gather (cost=0.00..424.67 rows=1 width=8) Output: (transaction_timestamp()) Workers Planned: 2 -> Parallel Seq Scan on public.tenk1 (cost=0.00..410.08 rows=4167 width=8) Output: transaction_timestamp() (7 rows) but it doesn't: regression=# select distinct transaction_timestamp() from tenk1; transaction_timestamp --- 2016-12-01 15:44:12.839417-05 (1 row) How is that happening? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel safety of CURRENT_* family
On Thu, Dec 1, 2016 at 2:32 PM, Tom Lane wrote: > I wrote: >> <5bih4k+4jfl6m39j...@guerrillamail.com> writes: >>> pg_proc shows that now() is marked as restricted, but >>> transaction_timestamp() is marked as safe. > >> That's certainly silly, because they're equivalent. I should think >> they're both safe. Robert? > > ... well, they would be if we passed down xactStartTimestamp to parallel > workers, but I can't find any code that does that. In view of the fact that > transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6. Yeah. Do you think we should arrange to pass that down, or change the marking? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel safety of CURRENT_* family
I wrote: > <5bih4k+4jfl6m39j...@guerrillamail.com> writes: >> pg_proc shows that now() is marked as restricted, but >> transaction_timestamp() is marked as safe. > That's certainly silly, because they're equivalent. I should think > they're both safe. Robert? ... well, they would be if we passed down xactStartTimestamp to parallel workers, but I can't find any code that does that. In view of the fact that transaction_timestamp() is marked as parallel-safe, this is a bug in 9.6. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel safety of CURRENT_* family
<5bih4k+4jfl6m39j...@guerrillamail.com> writes: > How should I mark a function which calls CURRENT_DATE? Parallel safe or > parallel restricted? > pg_proc shows that now() is marked as restricted, but transaction_timestamp() > is marked as safe. That's certainly silly, because they're equivalent. I should think they're both safe. Robert? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers