Re: [HACKERS] SSI rw-conflicts and 2PC

2012-02-29 Thread Heikki Linnakangas

On 23.02.2012 01:36, Jeff Davis wrote:
 On Tue, 2012-02-14 at 19:32 -0500, Dan Ports wrote:
  On Tue, Feb 14, 2012 at 09:27:58AM -0600, Kevin Grittner wrote:
   Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
    On 14.02.2012 04:57, Dan Ports wrote:
    The easiest answer would be to just treat every prepared
    transaction found during recovery as though it had a conflict in
    and out. This is roughly a one-line change, and it's certainly
    safe.

 +1.

 I don't even see this as much of a problem. Prepared transactions
 hanging around for arbitrary periods of time cause all kinds of problems
 already. Those using them need to be careful to resolve them quickly --
 and if there's a crash involved, I think it's reasonable to say they
 should be resolved before continuing normal online operations.

Committed this now. (sorry for the delay)

  Hmm, it occurs to me if we have to abort a transaction due to
  serialization failure involving a prepared transaction, we might want
  to include the prepared transaction's gid in the errdetail.

 I like this idea.

+1. Anyone want to put together a patch?

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSI rw-conflicts and 2PC

2012-02-29 Thread Kevin Grittner
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 On 23.02.2012 01:36, Jeff Davis wrote:
 On Tue, 2012-02-14 at 19:32 -0500, Dan Ports wrote:
 
 Hmm, it occurs to me if we have to abort a transaction due to
 serialization failure involving a prepared transaction, we might
 want to include the prepared transaction's gid in the errdetail.

 I like this idea.
 
 +1. Anyone want to put together a patch?
 
Unless Dan claims it before I start the work, I'll do it.
 
-Kevin



Re: [HACKERS] SSI rw-conflicts and 2PC

2012-02-22 Thread Jeff Davis
On Tue, 2012-02-14 at 19:32 -0500, Dan Ports wrote:
 On Tue, Feb 14, 2012 at 09:27:58AM -0600, Kevin Grittner wrote:
  Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
   On 14.02.2012 04:57, Dan Ports wrote:
   The easiest answer would be to just treat every prepared
   transaction found during recovery as though it had a conflict in
   and out. This is roughly a one-line change, and it's certainly
   safe.

+1.

I don't even see this as much of a problem. Prepared transactions
hanging around for arbitrary periods of time cause all kinds of problems
already. Those using them need to be careful to resolve them quickly --
and if there's a crash involved, I think it's reasonable to say they
should be resolved before continuing normal online operations.

 Hmm, it occurs to me if we have to abort a transaction due to
 serialization failure involving a prepared transaction, we might want
 to include the prepared transaction's gid in the errdetail. 

I like this idea.

Regards,
Jeff Davis




Re: [HACKERS] SSI rw-conflicts and 2PC

2012-02-14 Thread Heikki Linnakangas

On 14.02.2012 04:57, Dan Ports wrote:

Looking over the SSI 2PC code recently, I noticed that I overlooked a
case that could lead to non-serializable behavior after a crash.

When we PREPARE a serializable transaction, we store part of the
SERIALIZABLEXACT in the statefile (in addition to the list of SIREAD
locks). One of the pieces of information we record is whether the
transaction had any conflicts in or out. The problem is that that can
change if a new conflict occurs after the transaction has prepared.

Here's an example of the problem (based on the receipt-report test):

-- Setup
CREATE TABLE ctl (k text NOT NULL PRIMARY KEY, deposit_date date NOT NULL);
INSERT INTO ctl VALUES ('receipt', DATE '2008-12-22');
CREATE TABLE receipt (receipt_no int NOT NULL PRIMARY KEY,
                      deposit_date date NOT NULL, amount numeric(13,2));

-- T2
BEGIN ISOLATION LEVEL SERIALIZABLE;
INSERT INTO receipt VALUES
  (3, (SELECT deposit_date FROM ctl WHERE k = 'receipt'), 4.00);
PREPARE TRANSACTION 't2';

-- T3
BEGIN ISOLATION LEVEL SERIALIZABLE;
UPDATE ctl SET deposit_date = DATE '2008-12-23' WHERE k = 'receipt';
COMMIT;

-- T1
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT * FROM ctl WHERE k = 'receipt';
SELECT * FROM receipt WHERE deposit_date = DATE '2008-12-22';
COMMIT;

Running this sequence of transactions normally, T1 will be rolled back
because of the pattern of conflicts T1 -> T2 -> T3, as we'd expect. This
should still be true even if we restart the database before executing
the last transaction -- but it's not. The problem is that, when T2
prepared, it had no conflicts, so we recorded that in the statefile.
The T2 -> T3 conflict happened later, so we didn't know about it during
recovery.

I discussed this a bit with Kevin and we agreed that this is important
to fix, since it's a false negative that violates serializability. The
question is how to fix it. There are a couple of options...

The easiest answer would be to just treat every prepared transaction
found during recovery as though it had a conflict in and out. This
is roughly a one-line change, and it's certainly safe. But the
downside is that this is pretty restrictive: after recovery, we'd
have to abort any serializable transaction that tries to read
anything that a prepared transaction wrote, or modify anything that
it read, until that transaction is either committed or rolled back.
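
For illustration only, here is a small standalone C model of the stale-flag
problem and of the conservative recovery described above. It is not
PostgreSQL code: DemoXact, the DEMO_* flags and the demo_* functions are
hypothetical names, loosely modelled on the summary conflict flags, and the
"dangerous structure" check is reduced to a single test on the conflict-out
flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, named after the summary conflict flags. */
enum {
    DEMO_CONFLICT_IN  = 1 << 0,  /* someone has an rw-conflict in to us */
    DEMO_CONFLICT_OUT = 1 << 1   /* we have an rw-conflict out to someone */
};

typedef struct DemoXact {
    unsigned live_flags;       /* shared-memory state; lost in a crash */
    unsigned statefile_flags;  /* written once at PREPARE; survives a crash */
    bool     prepared;
} DemoXact;

/* PREPARE TRANSACTION: snapshot the current flags into the statefile. */
void demo_prepare(DemoXact *x)
{
    x->statefile_flags = x->live_flags;
    x->prepared = true;
}

/* A new rw-conflict out is detected; only shared memory is updated. */
void demo_add_conflict_out(DemoXact *x)
{
    x->live_flags |= DEMO_CONFLICT_OUT;
}

/* Recovery that trusts the statefile: the flags may be stale. */
void demo_recover_from_statefile(DemoXact *x)
{
    assert(x->prepared);
    x->live_flags = x->statefile_flags;
}

/* Conservative recovery: assume a conflict both in and out. */
void demo_recover_conservative(DemoXact *x)
{
    assert(x->prepared);
    x->live_flags = DEMO_CONFLICT_IN | DEMO_CONFLICT_OUT;
}

/*
 * Simplified check: a transaction that gains an rw-conflict out to x must
 * abort if x itself already has a conflict out.
 */
bool demo_reader_must_abort(const DemoXact *x)
{
    return (x->live_flags & DEMO_CONFLICT_OUT) != 0;
}
```

In this model, calling demo_prepare() and then demo_add_conflict_out()
mirrors T2 preparing before the T2 -> T3 conflict appears. After
demo_recover_from_statefile() the reader check no longer fires, which is
the false negative; after demo_recover_conservative() it fires for every
reader, which is the restrictive behaviour noted as the downside.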


+1 for this solution.


To do better than that, we want to know accurately whether the prepared
transaction had a conflict with a transaction that prepared or
committed before the crash. We could do this if we had a way to append
a record to the 2PC statefile of an already-prepared transaction --
then we'd just add a new record indicating the conflict. Of course, we
don't have a way to do that. It'd be tricky to add support for this,
since it has to be crash-safe, so the question is whether the improved
precision justifies the complexity it would require.


Not worth the complexity, IMO.

Perhaps it would be simpler to add the extra information to the commit 
records of the transactions that commit after the first transaction is 
prepared. In the commit record, you would include a list of prepared 
transactions that this transaction conflicted with. During recovery, you 
would collect those lists in memory, and use them at the end of recovery 
to flag the conflicts in prepared transactions that are still in 
prepared state.



A third option is to observe that the only conflicts *in* that
matter from a recovered prepared transaction are from other prepared
transactions. So we could have prepared transactions include in
their statefile the xids of any prepared transactions they conflicted
with at prepare time, and match them up during recovery to
reconstruct the graph. This is a middle ground between the other two
options. It doesn't require modifying the statefile after prepare.
However, conflicts *out* to non-prepared transactions do matter, and
this doesn't record those, so we'd have to do the conservative thing
-- which means that after recovery, no one can read anything a
prepared transaction wrote.
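
A rough sketch of how this third option might look, under stated
assumptions. Nothing here exists in PostgreSQL: DemoStatefile,
DemoRecoveredXact and demo_rebuild_flags are hypothetical, and the model
records only the conflicts in that were known at PREPARE time (a peer that
prepares later would have to record the edge from its own side, which this
sketch leaves out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { DEMO_CONFLICT_IN = 1 << 0, DEMO_CONFLICT_OUT = 1 << 1 };
enum { DEMO_MAX_PEERS = 8 };

/* Hypothetical statefile contents, written once at PREPARE. */
typedef struct DemoStatefile {
    uint32_t xid;
    size_t   n_in_from;
    uint32_t in_from[DEMO_MAX_PEERS];  /* prepared xids with a conflict in to xid */
} DemoStatefile;

typedef struct DemoRecoveredXact {
    uint32_t xid;
    unsigned flags;
} DemoRecoveredXact;

/* Is the given xid one of the prepared transactions found during recovery? */
static bool demo_is_recovered(const DemoStatefile *files, size_t n, uint32_t xid)
{
    for (size_t i = 0; i < n; i++)
        if (files[i].xid == xid)
            return true;
    return false;
}

/* Rebuild the summary flags for each recovered prepared transaction. */
void demo_rebuild_flags(const DemoStatefile *files, size_t n,
                        DemoRecoveredXact *out)
{
    for (size_t i = 0; i < n; i++) {
        assert(files[i].n_in_from <= DEMO_MAX_PEERS);
        out[i].xid = files[i].xid;

        /* Conflicts out to non-prepared xacts are unrecorded: stay conservative. */
        out[i].flags = DEMO_CONFLICT_OUT;

        /* A conflict in only matters if its source is still prepared. */
        for (size_t k = 0; k < files[i].n_in_from; k++) {
            if (demo_is_recovered(files, n, files[i].in_from[k])) {
                out[i].flags |= DEMO_CONFLICT_IN;
                break;
            }
        }
    }
}
```

Every recovered transaction still ends up with the conflict-out flag set,
so readers of its writes are rejected either way; only the conflict-in
side becomes more precise.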


This would be fairly simple to do, but I'm not sure it's worth it, 
either. The nasty thing about this whole thing is precisely that
no-one can read anything the prepared transaction wrote, so making the 
conflict-in bookkeeping more accurate doesn't seem very helpful.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] SSI rw-conflicts and 2PC

2012-02-14 Thread Kevin Grittner
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 On 14.02.2012 04:57, Dan Ports wrote:
 Looking over the SSI 2PC code recently, I noticed that I
 overlooked a case that could lead to non-serializable behavior
 after a crash.

 When we PREPARE a serializable transaction, we store part of the
 SERIALIZABLEXACT in the statefile (in addition to the list of
 SIREAD locks). One of the pieces of information we record is
 whether the transaction had any conflicts in or out. The problem
 is that that can change if a new conflict occurs after the
 transaction has prepared.
 
 I discussed this a bit with Kevin and we agreed that this is
 important to fix, since it's a false negative that violates
 serializability. The question is how to fix it. There are a
 couple of options...

 The easiest answer would be to just treat every prepared
 transaction found during recovery as though it had a conflict in
 and out. This is roughly a one-line change, and it's certainly
 safe.
 
Dan, could you post such a patch, please?
 
 But the downside is that this is pretty restrictive: after
 recovery, we'd have to abort any serializable transaction that
 tries to read anything that a prepared transaction wrote, or
 modify anything that it read, until that transaction is either
 committed or rolled back.
 
 +1 for this solution.
 
+1 for 9.2 and backpatching this; with the notion that we might be
able to do better in some later release.  (A TODO entry?)
 
Should we add anything to the docs to warn people that if they crash
with serializable prepared transactions pending, they will see this
behavior until the prepared transactions are either committed or
rolled back, either by the transaction manager or through manual
intervention?
 
 Perhaps it would be simpler to add the extra information to the
 commit records of the transactions that commit after the first
 transaction is prepared. In the commit record, you would include a
 list of prepared transactions that this transaction conflicted
 with. During recovery, you would collect those lists in memory,
 and use them at the end of recovery to flag the conflicts in
 prepared transactions that are still in prepared state.
 
That indeed seems simpler.  I'm not even sure that you would need to
build a list and process it at the end; couldn't this be done as the
commit records are replayed?  Keep in mind that if the prepared
transaction is not still pending, the information can be safely
ignored, and if it *is* still pending you don't need to know *which*
transaction it had the conflict with, because it will certainly have
committed before the start of any post-recovery transaction.
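
A minimal sketch of that replay-time variant, to make the idea concrete.
It assumes a hypothetical extension of the commit record carrying
(prepared xid, flag) pairs; DemoConflictEntry, DemoPreparedXact and
demo_replay_commit_conflicts are invented for this example and are not
part of the xlog code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { DEMO_CONFLICT_IN = 1 << 0, DEMO_CONFLICT_OUT = 1 << 1 };

typedef struct DemoPreparedXact {
    uint32_t xid;
    bool     pending;  /* still prepared: neither committed nor rolled back */
    unsigned flags;
} DemoPreparedXact;

/* Hypothetical entry appended to a commit record. */
typedef struct DemoConflictEntry {
    uint32_t prepared_xid;       /* prepared xact the committer conflicted with */
    unsigned flag_for_prepared;  /* DEMO_CONFLICT_IN or DEMO_CONFLICT_OUT,
                                  * from the prepared xact's point of view */
} DemoConflictEntry;

/*
 * Called once per replayed commit record. Sets summary flags on prepared
 * transactions that are still pending and returns how many were flagged.
 * The committing xid itself is not needed, only the fact of the conflict.
 */
size_t demo_replay_commit_conflicts(const DemoConflictEntry *entries,
                                    size_t n_entries,
                                    DemoPreparedXact *prepared,
                                    size_t n_prepared)
{
    size_t flagged = 0;

    assert(entries != NULL || n_entries == 0);
    for (size_t i = 0; i < n_entries; i++) {
        for (size_t j = 0; j < n_prepared; j++) {
            if (prepared[j].xid != entries[i].prepared_xid)
                continue;
            if (!prepared[j].pending)
                continue;  /* already resolved: safe to ignore */
            prepared[j].flags |= entries[i].flag_for_prepared;
            flagged++;
        }
    }
    return flagged;
}
```

No list is accumulated across records here: each commit record is applied
as it is seen, and entries naming transactions that are unknown or no
longer pending are dropped.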
 
 A third option is to observe that the only conflicts *in* that
 matter from a recovered prepared transaction are from other
 prepared transactions. So we could have prepared transactions
 include in their statefile the xids of any prepared transactions
 they conflicted with at prepare time, and match them up during
 recovery to reconstruct the graph. This is a middle ground
 between the other two options. It doesn't require modifying the
 statefile after prepare. However, conflicts *out* to non-prepared
 transactions do matter, and this doesn't record those, so we'd
 have to do the conservative thing -- which means that after
 recovery, no one can read anything a prepared transaction wrote.
 
 This would be fairly simple to do, but I'm not sure it's worth
 it, either. The nasty thing about this whole thing is precisely
 that no-one can read anything the prepared transaction wrote, so
 making the conflict-in bookkeeping more accurate doesn't seem very
 helpful.
 
Yeah, the benefit of this would be marginal without solving the
other side of the problem; but if we're adding TODO entries for this
area, perhaps they should be two separate entries, because either
side of this could be done without touching the other.
 
To summarize the above discussion, there is a bug that can be hit
when using both SSI and 2PC if a crash or shutdown occurs while any
serializable prepared transactions are pending and certain other
conditions are met.  The proposed quick fix would be to cause a
serialization failure after recovery on any attempt by a
serializable transaction to read data written by a serializable
prepared transaction that was pending when a crash or shutdown
occurred, and on any attempt by a serializable transaction to do a
write which conflicts with a predicate lock acquired by such a
prepared transaction.  This would tend to be more than a little
inconvenient until the prepared statements pending at crash or
shutdown were all committed or rolled back.  A more sophisticated
solution is available that could be implemented in 9.3 or later.
 
-Kevin



Re: [HACKERS] SSI rw-conflicts and 2PC

2012-02-14 Thread Kevin Grittner
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 This would tend to be more than a little inconvenient until the
 prepared statements pending at crash or shutdown were all
 committed or rolled back.
 
[sigh]
 
Probably obvious, but to avoid confusion:
 
s/prepared statements/prepared transactions/
 
-Kevin



Re: [HACKERS] SSI rw-conflicts and 2PC

2012-02-14 Thread Dan Ports
On Tue, Feb 14, 2012 at 10:04:15AM +0200, Heikki Linnakangas wrote:
 Perhaps it would be simpler to add the extra information to the commit 
 records of the transactions that commit after the first transaction is 
 prepared. In the commit record, you would include a list of prepared 
 transactions that this transaction conflicted with. During recovery, you 
 would collect those lists in memory, and use them at the end of recovery 
 to flag the conflicts in prepared transactions that are still in 
 prepared state.

Yeah, doing it that way might be a better strategy if we wanted to go
that route. I hadn't really considered it because I'm not that familiar
with the xlog code (plus, the commit record already contains a variable
length field, making it that much more difficult to add another).

Dan

-- 
Dan R. K. Ports    MIT CSAIL    http://drkp.net/



Re: [HACKERS] SSI rw-conflicts and 2PC

2012-02-14 Thread Dan Ports
On Tue, Feb 14, 2012 at 09:27:58AM -0600, Kevin Grittner wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
  On 14.02.2012 04:57, Dan Ports wrote:
  The easiest answer would be to just treat every prepared
  transaction found during recovery as though it had a conflict in
  and out. This is roughly a one-line change, and it's certainly
  safe.
  
 Dan, could you post such a patch, please?

Sure. See attached.

 Should we add anything to the docs to warn people that if they crash
 with serializable prepared transactions pending, they will see this
 behavior until the prepared transactions are either committed or
 rolled back, either by the transaction manager or through manual
 intervention?

Hmm, it occurs to me if we have to abort a transaction due to
serialization failure involving a prepared transaction, we might want
to include the prepared transaction's gid in the errdetail. 

This seems like it'd be especially useful for prepared transactions. We
can generally pick the transaction to abort to ensure the safe retry
property -- if that transaction is immediately retried, it won't
fail with the same conflict -- but we can't always guarantee that when
prepared transactions are involved. And prepared transactions already
have a convenient, user-visible ID we can report.
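
As a sketch of what such a detail line could look like: the function name
and the message wording below are invented for illustration, and in the
backend the text would go through errdetail() inside ereport() rather than
snprintf().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a hypothetical detail message for a serialization failure.
 * gid is the prepared transaction's global identifier, or NULL when no
 * prepared transaction is involved. Returns what snprintf returns.
 */
int demo_serialization_detail(char *buf, size_t buflen, const char *gid)
{
    assert(buf != NULL && buflen > 0);

    if (gid == NULL)
        return snprintf(buf, buflen,
                        "Conflict with a concurrent transaction.");

    return snprintf(buf, buflen,
                    "Conflict involves prepared transaction \"%s\"; "
                    "retrying may fail until it is committed or rolled back.",
                    gid);
}
```

With the gid in the message, whoever sees the failure knows which
transaction to resolve with COMMIT PREPARED or ROLLBACK PREPARED.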

Dan

-- 
Dan R. K. Ports    MIT CSAIL    http://drkp.net/
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index b75b73a..b102e19 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -4733,14 +4733,11 @@ AtPrepare_PredicateLocks(void)
 	xactRecord->flags = MySerializableXact->flags;
 
 	/*
-	 * Tweak the flags. Since we're not going to output the inConflicts and
-	 * outConflicts lists, if they're non-empty we'll represent that by
-	 * setting the appropriate summary conflict flags.
+	 * Note that we don't include the list of conflicts in our out in
+	 * the statefile, because new conflicts can be added even after the
+	 * transaction prepares. We'll just make a conservative assumption
+	 * during recovery instead.
 	 */
-	if (!SHMQueueEmpty(&MySerializableXact->inConflicts))
-		xactRecord->flags |= SXACT_FLAG_SUMMARY_CONFLICT_IN;
-	if (!SHMQueueEmpty(&MySerializableXact->outConflicts))
-		xactRecord->flags |= SXACT_FLAG_SUMMARY_CONFLICT_OUT;
 
 	RegisterTwoPhaseRecord(TWOPHASE_RM_PREDICATELOCK_ID, 0,
 		   &record, sizeof(record));
@@ -4875,15 +4872,6 @@ predicatelock_twophase_recover(TransactionId xid, uint16 info,
 
 		sxact-SeqNo.lastCommitBeforeSnapshot = RecoverySerCommitSeqNo;
 
-
-		/*
-		 * We don't need the details of a prepared transaction's conflicts,
-		 * just whether it had conflicts in or out (which we get from the
-		 * flags)
-		 */
-		SHMQueueInit(&(sxact->outConflicts));
-		SHMQueueInit(&(sxact->inConflicts));
-
 		/*
 		 * Don't need to track this; no transactions running at the time the
 		 * recovered xact started are still active, except possibly other
@@ -4905,6 +4893,17 @@ predicatelock_twophase_recover(TransactionId xid, uint16 info,
    (MaxBackends + max_prepared_xacts));
 		}
 
+		/*
+		 * We don't know whether the transaction had any conflicts or
+		 * not, so we'll conservatively assume that it had both a
+		 * conflict in and a conflict out, and represent that with the
+		 * summary conflict flags.
+		 */
+		SHMQueueInit(&(sxact->outConflicts));
+		SHMQueueInit(&(sxact->inConflicts));
+		sxact->flags |= SXACT_FLAG_SUMMARY_CONFLICT_IN;
+		sxact->flags |= SXACT_FLAG_SUMMARY_CONFLICT_OUT;
+		
 		/* Register the transaction's xid */
 		sxidtag.xid = xid;
 		sxid = (SERIALIZABLEXID *) hash_search(SerializableXidHash,
