Re: [HACKERS] [BUGS] BUG #6425: Bus error in slot_deform_tuple

2012-02-05 Thread Simon Riggs
On Sat, Feb 4, 2012 at 6:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 The cause here is data changing underneath the user. Your patch solves
 the most obvious error, but it still allows other problems if applying
 the backup block changes data. If the backup block doesn't do anything
 at all then we don't need to apply it either.

 This is nonsense.  What applying the backup block does is to apply the
 change that the WAL record would otherwise have applied, except we
 decided to make it store a full-page image instead.

Yep, you're right, my bad.

Got a head cold, so will lay off a few days from too much thinking.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #6425: Bus error in slot_deform_tuple

2012-02-04 Thread Simon Riggs
On Fri, Feb 3, 2012 at 6:45 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 The reason it is relevant
 to our current problem is that even though RestoreBkpBlocks faithfully
 takes exclusive lock on the buffer, *that is not enough to guarantee
 that no one else is touching that buffer*.  Another backend that has
 already located a visible tuple on a page is entitled to keep accessing
 that tuple with only a buffer pin.  So the existing code transiently
 wipes the data from underneath the other backend's pin.

While deciding whether to apply the patch, I'm thinking about whether
we should be doing this at all. We already agreed that backup blocks
were removable from the WAL stream.

The cause here is data changing underneath the user. Your patch solves
the most obvious error, but it still allows other problems if applying
the backup block changes data. If the backup block doesn't do anything
at all then we don't need to apply it either.

So ISTM that we should just skip applying backup blocks over the top
of existing buffers once we have reached consistency.

Patch to do that attached, but the basic part of it is just this...

@@ -3700,8 +3701,21 @@ RestoreBkpBlocks(XLogRecPtr lsn, XLogRecord
*record, bool cleanup)
memcpy(bkpb, blk, sizeof(BkpBlock));
blk += sizeof(BkpBlock);

+   hit = false;
buffer = XLogReadBufferExtended(bkpb.node, bkpb.fork,
bkpb.block,
-
 RBM_ZERO);
+
 RBM_ZERO, hit);
+
+   /*
+* If we found the block in shared buffers and we are already
+* consistent then skip applying the backup block. The block
+* was already removable anyway, so we can skip
without problems.
+* This avoids us needing to take a cleanup lock in
all cases when
+* we apply backup blocks because of potential effects
on user queries,
+* which expect data on blocks to remain constant
while being read.
+*/
+   if (reachedConsistency  hit)
+   continue;
+
Assert(BufferIsValid(buffer));
if (cleanup)
LockBufferForCleanup(buffer);


-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #6425: Bus error in slot_deform_tuple

2012-02-04 Thread Simon Riggs
On Sat, Feb 4, 2012 at 6:37 PM, Simon Riggs si...@2ndquadrant.com wrote:

 Patch to do that attached


-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 99a431a..4758931 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -4590,6 +4590,7 @@ heap_xlog_clean(XLogRecPtr lsn, XLogRecord *record)
 	int			ndead;
 	int			nunused;
 	Size		freespace;
+	bool		hit;
 
 	/*
 	 * We're about to remove tuples. In Hot Standby mode, ensure that there's
@@ -4608,7 +4609,7 @@ heap_xlog_clean(XLogRecPtr lsn, XLogRecord *record)
 	if (record-xl_info  XLR_BKP_BLOCK_1)
 		return;
 
-	buffer = XLogReadBufferExtended(xlrec-node, MAIN_FORKNUM, xlrec-block, RBM_NORMAL);
+	buffer = XLogReadBufferExtended(xlrec-node, MAIN_FORKNUM, xlrec-block, RBM_NORMAL, hit);
 	if (!BufferIsValid(buffer))
 		return;
 	LockBufferForCleanup(buffer);
@@ -4664,6 +4665,7 @@ heap_xlog_freeze(XLogRecPtr lsn, XLogRecord *record)
 	TransactionId cutoff_xid = xlrec-cutoff_xid;
 	Buffer		buffer;
 	Page		page;
+	bool		hit;
 
 	/*
 	 * In Hot Standby mode, ensure that there's no queries running which still
@@ -4677,7 +4679,7 @@ heap_xlog_freeze(XLogRecPtr lsn, XLogRecord *record)
 	if (record-xl_info  XLR_BKP_BLOCK_1)
 		return;
 
-	buffer = XLogReadBufferExtended(xlrec-node, MAIN_FORKNUM, xlrec-block, RBM_NORMAL);
+	buffer = XLogReadBufferExtended(xlrec-node, MAIN_FORKNUM, xlrec-block, RBM_NORMAL, hit);
 	if (!BufferIsValid(buffer))
 		return;
 	LockBufferForCleanup(buffer);
@@ -4728,6 +4730,7 @@ heap_xlog_visible(XLogRecPtr lsn, XLogRecord *record)
 	xl_heap_visible *xlrec = (xl_heap_visible *) XLogRecGetData(record);
 	Buffer		buffer;
 	Page		page;
+	bool		hit;
 
 	/*
 	 * Read the heap page, if it still exists.  If the heap file has been
@@ -4736,7 +4739,7 @@ heap_xlog_visible(XLogRecPtr lsn, XLogRecord *record)
 	 * will have to be cleared out at the same time.
 	 */
 	buffer = XLogReadBufferExtended(xlrec-node, MAIN_FORKNUM, xlrec-block,
-	RBM_NORMAL);
+	RBM_NORMAL, hit);
 	if (!BufferIsValid(buffer))
 		return;
 	page = (Page) BufferGetPage(buffer);
@@ -4806,13 +4809,14 @@ heap_xlog_newpage(XLogRecPtr lsn, XLogRecord *record)
 	xl_heap_newpage *xlrec = (xl_heap_newpage *) XLogRecGetData(record);
 	Buffer		buffer;
 	Page		page;
+	bool		hit;
 
 	/*
 	 * Note: the NEWPAGE log record is used for both heaps and indexes, so do
 	 * not do anything that assumes we are touching a heap.
 	 */
 	buffer = XLogReadBufferExtended(xlrec-node, xlrec-forknum, xlrec-blkno,
-	RBM_ZERO);
+	RBM_ZERO, hit);
 	Assert(BufferIsValid(buffer));
 	LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 	page = (Page) BufferGetPage(buffer);
diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index 0f5c113..d10b0b8 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -466,6 +466,7 @@ btree_xlog_vacuum(XLogRecPtr lsn, XLogRecord *record)
 	Buffer		buffer;
 	Page		page;
 	BTPageOpaque opaque;
+	bool		hit;
 
 	xlrec = (xl_btree_vacuum *) XLogRecGetData(record);
 
@@ -491,7 +492,7 @@ btree_xlog_vacuum(XLogRecPtr lsn, XLogRecord *record)
 			 * Another simple optimization would be to check if there's any
 			 * backends running; if not, we could just skip this.
 			 */
-			buffer = XLogReadBufferExtended(xlrec-node, MAIN_FORKNUM, blkno, RBM_NORMAL);
+			buffer = XLogReadBufferExtended(xlrec-node, MAIN_FORKNUM, blkno, RBM_NORMAL, hit);
 			if (BufferIsValid(buffer))
 			{
 LockBufferForCleanup(buffer);
@@ -513,7 +514,7 @@ btree_xlog_vacuum(XLogRecPtr lsn, XLogRecord *record)
 	 * Like in btvacuumpage(), we need to take a cleanup lock on every leaf
 	 * page. See nbtree/README for details.
 	 */
-	buffer = XLogReadBufferExtended(xlrec-node, MAIN_FORKNUM, xlrec-block, RBM_NORMAL);
+	buffer = XLogReadBufferExtended(xlrec-node, MAIN_FORKNUM, xlrec-block, RBM_NORMAL, hit);
 	if (!BufferIsValid(buffer))
 		return;
 	LockBufferForCleanup(buffer);
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index cce87a3..3f4842d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -3687,6 +3687,7 @@ RestoreBkpBlocks(XLogRecPtr lsn, XLogRecord *record, bool cleanup)
 	BkpBlock	bkpb;
 	char	   *blk;
 	int			i;
+	bool		hit;
 
 	if (!(record-xl_info  XLR_BKP_BLOCK_MASK))
 		return;
@@ -3700,8 +3701,21 @@ RestoreBkpBlocks(XLogRecPtr lsn, XLogRecord *record, bool cleanup)
 		memcpy(bkpb, blk, sizeof(BkpBlock));
 		blk += sizeof(BkpBlock);
 
+		hit = false;
 		buffer = XLogReadBufferExtended(bkpb.node, bkpb.fork, bkpb.block,
-		RBM_ZERO);
+		RBM_ZERO, hit);
+
+		/*
+		 * If we found the block in shared buffers and we are already
+		 * consistent then skip applying the backup block. The block
+		 * was already removable anyway, so we can 

Re: [HACKERS] [BUGS] BUG #6425: Bus error in slot_deform_tuple

2012-02-04 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 The cause here is data changing underneath the user. Your patch solves
 the most obvious error, but it still allows other problems if applying
 the backup block changes data. If the backup block doesn't do anything
 at all then we don't need to apply it either.

This is nonsense.  What applying the backup block does is to apply the
change that the WAL record would otherwise have applied, except we
decided to make it store a full-page image instead.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers